# How to conduct a repeated measures ANOVA (in R, using rstatix and tidyr and car)

## Task

In a repeated measures test, the same subject receives multiple treatments. When you have a dataset that includes the responses of a repeated measures test where the measurements are dependent (within subjects design), you may wish to check if there is a difference in the treatment effects. How would you conduct a repeated measures ANOVA to answer that question?

Related tasks:

- How to do a one-way analysis of variance (ANOVA)
- How to do a two-way ANOVA test with interaction
- How to do a two-way ANOVA test without interaction
- How to compare two nested linear models using ANOVA
- How to conduct a mixed designs ANOVA
- How to perform an analysis of covariance (ANCOVA)

## Solution

We create a hypothetical repeated measures dataset in which each of 5 subjects undergoes all 4 skin treatments and rates each treatment.

subject <- as.factor(c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5))
skin.treatment <- c('W','X','Y','Z','W','X','Y','Z','W','X',
'Y','Z','W','X','Y','Z','W','X','Y','Z')
rating <- c(7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6)
df <- data.frame(subject,skin.treatment,rating)
head(df)

subject skin.treatment rating
1 1 W 7
2 1 X 5
3 1 Y 8
4 1 Z 4
5 2 W 8
6 2 X 10
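
Before modeling, it can help to confirm that the design is complete (one rating per subject and treatment) and to look at the mean rating for each treatment. The following is a minimal base R sketch under that assumption; the names `cell.counts` and `treatment.means` are introduced here for illustration only.

```r
# Rebuild the example data so this block runs on its own.
subject <- as.factor(rep(1:5, each = 4))
skin.treatment <- rep(c("W", "X", "Y", "Z"), times = 5)
rating <- c(7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6)
df <- data.frame(subject, skin.treatment, rating)

# Every subject-by-treatment cell should contain exactly one rating.
cell.counts <- table(df$subject, df$skin.treatment)
stopifnot(all(cell.counts == 1))

# Mean rating per treatment, as a first look at possible differences.
treatment.means <- tapply(df$rating, df$skin.treatment, mean)
print(treatment.means)
```

If the `stopifnot()` check fails, some subject is missing a treatment (or has a duplicate), and the repeated measures analysis on this page would not apply as written.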

Before we conduct a repeated measures ANOVA, we need to decide which approach to use: univariate or multivariate. We decide this using Mauchly’s test of sphericity. If we fail to reject the null hypothesis, we use the univariate approach; if we reject it, we use the multivariate approach.

- $H_0$: the sphericity assumption holds
- $H_A$: the sphericity assumption is violated

We use the `rstatix` package to conduct the test.

- The dependent variable is `rating`.
- The within-group factor is `skin.treatment`.
- The `Error()` term is critical in differentiating between a between subjects and within subjects model. It tells R that there is one observation per `subject` for each level of `skin.treatment`.
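
For reference, the three roles listed above can also be passed to `anova_test()` as named arguments (`dv`, `wid`, `within`) instead of through a formula with `Error()`. This is a sketch that assumes `rstatix` is installed; the name `res` is introduced here for illustration.

```r
library(rstatix)  # assumes the package is already installed

# Rebuild the example data so this block runs on its own.
subject <- as.factor(rep(1:5, each = 4))
skin.treatment <- rep(c("W", "X", "Y", "Z"), times = 5)
rating <- c(7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6)
df <- data.frame(subject, skin.treatment, rating)

# dv = dependent variable, wid = subject identifier,
# within = within-subjects factor.
res <- anova_test(data = df, dv = rating, wid = subject,
                  within = skin.treatment)

# Printing the result shows the ANOVA table, Mauchly's test,
# and the sphericity corrections.
print(res)

# get_anova_table() returns a single table; with its default
# correction = "auto", a Greenhouse-Geisser correction is applied
# only to effects whose Mauchly test is significant.
get_anova_table(res)
```

Either form should describe the same within-subjects model; the named-argument form just makes the role of each column explicit.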

# install.packages("rstatix") # If you have not already installed it
library(rstatix)
anova_test(rating ~ skin.treatment + Error(subject/skin.treatment), data=df)

Attaching package: ‘rstatix’
The following object is masked from ‘package:stats’:
filter
ANOVA Table (type III tests)
$ANOVA
Effect DFn DFd F p p<.05 ges
1 skin.treatment 3 12 5.118 0.017 * 0.43
$`Mauchly's Test for Sphericity`
Effect W p p<.05
1 skin.treatment 0.062 0.207
$`Sphericity Corrections`
Effect GGe DF[GG] p[GG] p[GG]<.05 HFe DF[HF] p[HF]
1 skin.treatment 0.541 1.62, 6.49 0.051 0.858 2.57, 10.3 0.023
p[HF]<.05
1 *

The $p$-value we care about in the output is under “Mauchly’s Test for Sphericity,” for the variable `skin.treatment`. Because the $p$-value is 0.207, we fail to reject the sphericity assumption at a 5% significance level and use the univariate approach to conduct the repeated measures ANOVA.
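
As a cross-check that does not depend on `rstatix`, base R also provides `mauchly.test()`, which operates on a multivariate linear model fitted to the data in wide form. The sketch below assumes that one row per subject and one column per treatment is the appropriate layout; the names `wide`, `fit`, and `mt` are introduced here for illustration.

```r
# Ratings in subject order, four treatments (W, X, Y, Z) per subject.
rating <- c(7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6)

# One row per subject, one column per treatment.
wide <- matrix(rating, nrow = 5, byrow = TRUE,
               dimnames = list(NULL, c("W", "X", "Y", "Z")))

# Intercept-only multivariate model: no between-subjects factors.
fit <- lm(wide ~ 1)

# X = ~ 1 removes the overall mean, so sphericity is tested on the
# within-subject contrasts among the four treatments.
mt <- mauchly.test(fit, X = ~ 1)
print(mt)

# The p-value is stored in the usual "htest" slot.
mt$p.value
```

If the two implementations agree, the statistic and $p$-value printed here should be close to the values reported by `anova_test()` above.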

### Repeated measures ANOVA - univariate

aov1 <- aov(rating ~ skin.treatment + Error(subject/skin.treatment), data=df)
summary(aov1)

Error: subject
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 4 11.8 2.95
Error: subject:skin.treatment
Df Sum Sq Mean Sq F value Pr(>F)
skin.treatment 3 21.75 7.250 5.118 0.0165 *
Residuals 12 17.00 1.417
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

You can find the $p$-value at the end of the row of output marked for `skin.treatment`; it is 0.0165. This is less than 0.05, so we conclude that there is significant evidence of a treatment effect.
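
To see where the numbers in the `aov()` table come from, the sums of squares can be computed by hand. This is a base R sketch of the standard one-way repeated measures decomposition (total = treatment + subject + error); the variable names are introduced here for illustration.

```r
rating <- c(7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6)
subject <- factor(rep(1:5, each = 4))
treatment <- factor(rep(c("W", "X", "Y", "Z"), times = 5))

n.subj <- nlevels(subject)    # 5 subjects
n.trt  <- nlevels(treatment)  # 4 treatments
grand  <- mean(rating)

# Partition the total variation.
ss.total <- sum((rating - grand)^2)
ss.trt   <- n.subj * sum((tapply(rating, treatment, mean) - grand)^2)
ss.subj  <- n.trt  * sum((tapply(rating, subject, mean) - grand)^2)
ss.err   <- ss.total - ss.trt - ss.subj

# Degrees of freedom for the treatment effect and its error term.
df.trt <- n.trt - 1
df.err <- (n.trt - 1) * (n.subj - 1)

# F statistic and its upper-tail p-value.
f.value <- (ss.trt / df.trt) / (ss.err / df.err)
p.value <- pf(f.value, df.trt, df.err, lower.tail = FALSE)

c(SS.treatment = ss.trt, SS.subject = ss.subj, SS.error = ss.err,
  F = f.value, p = p.value)
```

The treatment, subject, and error sums of squares are 21.75, 11.8, and 17.0, which gives the F value of about 5.118 shown in the table above. Removing the subject variation from the error term is what distinguishes this test from an ordinary one-way ANOVA.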

### Repeated measures ANOVA - multivariate

If instead the first test had rejected the sphericity assumption, we would have used a multivariate approach for the repeated measures ANOVA. We show here how to do such a test, even though it does not apply to this situation. We must first reorganize the data into a matrix where each row represents a single subject, and columns represent levels of the treatment factor. This is possible using the `tidyr` package.

# install.packages("tidyr") # If you have not already installed it
library(tidyr)
multi.data <- spread(df, skin.treatment, rating)
multi.data <- as.matrix(multi.data[,-c(1)])
multi.data

W X Y Z
[1,] 7 5 8 4
[2,] 8 10 7 5
[3,] 7 6 5 4
[4,] 7 7 4 5
[5,] 8 8 6 6
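
The `tidyr` documentation describes `spread()` as superseded by `pivot_wider()`. A sketch of the same reshaping step with the newer function follows, assuming a version of `tidyr` that provides `pivot_wider()`; the name `wide` is introduced here for illustration.

```r
library(tidyr)  # assumes a tidyr version that includes pivot_wider()

# Rebuild the example data so this block runs on its own.
subject <- as.factor(rep(1:5, each = 4))
skin.treatment <- rep(c("W", "X", "Y", "Z"), times = 5)
rating <- c(7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6)
df <- data.frame(subject, skin.treatment, rating)

# One row per subject; one column per level of skin.treatment.
wide <- pivot_wider(df, names_from = skin.treatment,
                    values_from = rating)

# Drop the subject identifier and keep only the ratings as a matrix.
multi.data <- as.matrix(wide[, -1])
multi.data
```

The resulting matrix should have the same 5 rows and 4 columns (W, X, Y, Z) as the one produced with `spread()`.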

We then create a multivariate model and also set up a variable that defines the design of the study.

# In this model there are no between-subjects factors, so we write ~ 1:
multi.ml <- lm(multi.data ~ 1)
# The design of the study is a single factor with four levels:
rfactor <- factor(c("f1", "f2", "f3", "f4"))

Conduct the repeated measures ANOVA using a multivariate approach. This requires creating a new model using the `Anova()` function, which calculates ANOVA tables and is provided by the `car` package. The parameters have the following meanings.

- `idata` is a data frame describing the within-subjects factor and its levels, in this case four.
- `idesign` states that `rfactor` describes a repeated-measures variable.
- `type` tells `Anova()` to calculate the “Type-III” sums of squares when forming the ANOVA table.
- `multivariate`, which is passed to `summary()`, suppresses the multivariate test statistics when set to `FALSE`, so that only the univariate table, Mauchly’s test, and the sphericity corrections are printed.

# install.packages("car") # If you have not already installed it
library(car)
multi.ml <- Anova(multi.ml, idata=data.frame(rfactor), idesign = ~rfactor, type="III")
summary(multi.ml, multivariate=FALSE)

Loading required package: carData
Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
Sum Sq num Df Error SS den Df F value Pr(>F)
(Intercept) 806.45 1 11.8 4 273.3729 7.837e-05 ***
rfactor 21.75 3 17.0 12 5.1176 0.0165 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Mauchly Tests for Sphericity
Test statistic p-value
rfactor 0.062101 0.20708
Greenhouse-Geisser and Huynh-Feldt Corrections
for Departure from Sphericity
GG eps Pr(>F[GG])
rfactor 0.5412 0.05068 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
HF eps Pr(>F[HF])
rfactor 0.858156 0.02319302

Although this test was run just as an example, and does not actually apply in this dataset, the output shows a $p$-value of 0.0165, at the end of the first `rfactor` row. That $p$-value could be compared to a chosen $\alpha$.

(We also see that Mauchly’s test was performed and is not significant, which is why these data actually call for a univariate approach.)
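
The summary above was printed with `multivariate=FALSE`, so the multivariate test statistics themselves are not shown. If you want to inspect them, one option is to print the `Anova()` result directly, which reports the multivariate test for the within-subjects factor. The sketch below assumes `car` is installed and picks Pillai's trace through the `test.statistic` argument as an illustrative choice; the names `fit` and `manova.res` are introduced here.

```r
library(car)  # assumes the package is already installed

# Wide-form ratings: one row per subject, one column per treatment.
rating <- c(7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6)
multi.data <- matrix(rating, nrow = 5, byrow = TRUE,
                     dimnames = list(NULL, c("W", "X", "Y", "Z")))

# Intercept-only multivariate model and within-subjects design.
fit <- lm(multi.data ~ 1)
rfactor <- factor(c("f1", "f2", "f3", "f4"))

# test.statistic selects which multivariate statistic is reported.
manova.res <- Anova(fit, idata = data.frame(rfactor),
                    idesign = ~ rfactor, type = "III",
                    test.statistic = "Pillai")

# Printing the object (rather than calling summary() with
# multivariate = FALSE) shows the multivariate tests.
print(manova.res)
```

With only 5 subjects and 4 treatment levels, the multivariate test has very few error degrees of freedom, so its result should be read with caution.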

Content last modified on 24 July 2023.

Contributed by Krtin Juneja (KJUNEJA@falcon.bentley.edu)