# How to conduct a repeated measures ANOVA

## Description

In a repeated measures design, the same subject receives multiple treatments, so the measurements are dependent (a within-subjects design). When you have a dataset of responses from such a design, you may wish to check whether the treatment effects differ. How would you conduct a repeated measures ANOVA to answer that question?

## Using pandas and pingouin, in Python

We create a hypothetical repeated measures dataset in which each of 5 subjects undergoes all 4 skin treatments and rates each one.

    import pandas as pd
    df = pd.DataFrame( {
        'Subject':        [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5],
        'Skin Treatment': ['W','X','Y','Z','W','X','Y','Z','W','X',
                           'Y','Z','W','X','Y','Z','W','X','Y','Z'],
        'Rating':         [7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6]
    } )
    df.head()   # Show the first five rows

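|   | Subject | Skin Treatment | Rating |
|---|---------|----------------|--------|
| 0 | 1 | W | 7 |
| 1 | 1 | X | 5 |
| 2 | 1 | Y | 8 |
| 3 | 1 | Z | 4 |
| 4 | 2 | W | 8 |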
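
Because a repeated measures ANOVA needs every subject to have a rating for every treatment, it can help to confirm that the design is complete before testing. The sketch below rebuilds the same data so that it runs on its own, then pivots it to one row per subject. The names wide and is_balanced are illustrative and not part of the original solution.

```python
import pandas as pd

# Same hypothetical data as above, rebuilt so this block is self-contained.
df = pd.DataFrame({
    'Subject':        [s for s in range(1, 6) for _ in range(4)],
    'Skin Treatment': ['W', 'X', 'Y', 'Z'] * 5,
    'Rating':         [7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6],
})

# One row per subject, one column per treatment.
# pivot() raises an error if any (subject, treatment) pair appears twice.
wide = df.pivot(index='Subject', columns='Skin Treatment', values='Rating')

# The design is complete when no cell is missing.
is_balanced = not wide.isna().any().any()
print(wide)
print('Balanced design:', is_balanced)
```

If a subject were missing a treatment, the corresponding cell of wide would be NaN and is_balanced would be False.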

Before we conduct a repeated measures ANOVA, we need to decide which approach to use: univariate or multivariate. We decide this using Mauchly’s test of sphericity. If we fail to reject the null hypothesis, we use the univariate approach; if we reject it, we use the multivariate approach.

• $H_0$: the sphericity assumption holds
• $H_A$: the sphericity assumption is violated
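
Sphericity means that the variances of the differences between every pair of treatment levels are equal in the population. As a rough illustration only (the decision rests on Mauchly's test, not on eyeballing), the sketch below computes those sample variances for the data above using only the standard library. The names ratings and diff_variances are illustrative.

```python
from itertools import combinations
from statistics import variance

# Ratings per treatment, ordered by subject 1..5 (same data as above).
ratings = {
    'W': [7, 8, 7, 7, 8],
    'X': [5, 10, 6, 7, 8],
    'Y': [8, 7, 5, 4, 6],
    'Z': [4, 5, 4, 5, 6],
}

# Sample variance of the within-subject differences for each pair of treatments.
diff_variances = {}
for a, b in combinations(ratings, 2):
    differences = [x - y for x, y in zip(ratings[a], ratings[b])]
    diff_variances[(a, b)] = variance(differences)

for pair, var in diff_variances.items():
    print(pair, round(var, 2))
```

With only five subjects, sample variances like these can differ a good deal by chance, which is why a formal test is used rather than visual inspection.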

We use the pingouin statistics package to conduct the test. Most of the parameters below are self-explanatory, except that dv stands for dependent variable.

    import pingouin as pg
    pg.sphericity( dv='Rating', within='Skin Treatment', subject='Subject', method='mauchly', data=df )

    SpherResults(spher=True, W=0.06210054956238558, chi2=7.565056754547507, dof=5, pval=0.20708214225927316)


Since the $p$-value for Skin Treatment is about $0.2071$, we fail to reject the sphericity assumption at a 5% significance level and use the univariate approach to conduct the repeated measures ANOVA.
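
The decision rule above can be written as a small helper. This sketch does not call pingouin; it uses a stand-in named tuple with the field names and values shown in the output above. The function name choose_approach and the default alpha of 0.05 are assumptions made for illustration.

```python
from collections import namedtuple

# Stand-in mirroring the fields printed by pg.sphericity above (not the real class).
SpherResults = namedtuple('SpherResults', ['spher', 'W', 'chi2', 'dof', 'pval'])
result = SpherResults(spher=True, W=0.06210054956238558,
                      chi2=7.565056754547507, dof=5, pval=0.20708214225927316)

def choose_approach(pval, alpha=0.05):
    """Return the approach this article uses, given the Mauchly p-value."""
    # Failing to reject sphericity (p >= alpha) means the univariate approach.
    return 'univariate' if pval >= alpha else 'multivariate'

approach = choose_approach(result.pval)
print(approach)
```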

    # Compute a repeated measures ANOVA using a function pingouin adds to our DataFrame:
    df.rm_anova( dv='Rating', within='Skin Treatment', subject='Subject', detailed=False )

|   | Source | ddof1 | ddof2 | F | p-unc | ng2 | eps |
|---|--------|-------|-------|---|-------|-----|-----|
| 0 | Skin Treatment | 3 | 12 | 5.117647 | 0.016501 | 0.430267 | 0.541199 |

Since the $p$ value of about $0.017$ is less than 0.05, we conclude that there is significant evidence of a treatment effect.
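
To see where the F and ng2 columns come from, the sketch below reproduces them from the sums of squares for a one-way within-subjects design, using only the standard library. It assumes a complete design (every subject rated every treatment), and the variable names are illustrative. The $p$-value is omitted because it requires the F distribution, which the standard library does not provide.

```python
# Rows are subjects 1..5; columns are treatments W, X, Y, Z (same data as above).
data = [
    [7, 5, 8, 4],
    [8, 10, 7, 5],
    [7, 6, 5, 4],
    [7, 7, 4, 5],
    [8, 8, 6, 6],
]
n = len(data)        # number of subjects
k = len(data[0])     # number of treatments

grand_mean = sum(sum(row) for row in data) / (n * k)
treatment_means = [sum(row[j] for row in data) / n for j in range(k)]
subject_means = [sum(row) / k for row in data]

# Partition the total variation into treatment, subject, and error parts.
ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
ss_treatment = n * sum((m - grand_mean) ** 2 for m in treatment_means)
ss_subject = k * sum((m - grand_mean) ** 2 for m in subject_means)
ss_error = ss_total - ss_treatment - ss_subject

df_treatment = k - 1             # ddof1 in the table above
df_error = (k - 1) * (n - 1)     # ddof2 in the table above

f_stat = (ss_treatment / df_treatment) / (ss_error / df_error)
# Generalized eta squared for this design: treatment SS over total SS.
ng2 = ss_treatment / (ss_treatment + ss_subject + ss_error)

print(round(f_stat, 4), round(ng2, 4))
```

The printed values should agree with the F and ng2 entries in the table above, up to rounding.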

Note: If there is more than one repeated measures factor, you can pass a list of them as the within parameter and conduct the test in the same way. (pingouin's rm_anova accepts at most two within-subject factors.)
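
The eps column in the table above is the Greenhouse-Geisser epsilon, a measure of how far the data depart from sphericity (1 means no departure, and the lower bound is 1/(k-1) for k treatments). The sketch below computes it from the double-centered covariance matrix of the treatments using NumPy, which is an added dependency here. The variable names are illustrative, and this is one standard formula rather than necessarily pingouin's internal implementation.

```python
import numpy as np

# Rows are subjects 1..5; columns are treatments W, X, Y, Z (same data as above).
wide = np.array([
    [7, 5, 8, 4],
    [8, 10, 7, 5],
    [7, 6, 5, 4],
    [7, 7, 4, 5],
    [8, 8, 6, 6],
], dtype=float)
k = wide.shape[1]

# Sample covariance matrix of the treatments (columns are the variables).
S = np.cov(wide, rowvar=False)

# Double-center: subtract row and column means, add back the grand mean.
row_means = S.mean(axis=1, keepdims=True)
col_means = S.mean(axis=0, keepdims=True)
S_centered = S - row_means - col_means + S.mean()

# Greenhouse-Geisser epsilon.
eps = np.trace(S_centered) ** 2 / ((k - 1) * np.sum(S_centered ** 2))
print(round(float(eps), 4))
```

The result should agree with the eps entry in the table above, up to rounding.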

## Using rstatix and tidyr and car, in R

We create a hypothetical repeated measures dataset in which each of 5 subjects undergoes all 4 skin treatments and rates each one.

    subject <- as.factor(c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5))
    skin.treatment <- c('W','X','Y','Z','W','X','Y','Z','W','X',
                        'Y','Z','W','X','Y','Z','W','X','Y','Z')
    rating <- c(7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6)
    df <- data.frame(subject,skin.treatment,rating)
    head(df)   # Show the first six rows

      subject skin.treatment rating
    1       1              W      7
    2       1              X      5
    3       1              Y      8
    4       1              Z      4
    5       2              W      8
    6       2              X     10


Before we conduct a repeated measures ANOVA, we need to decide which approach to use: univariate or multivariate. We decide this using Mauchly’s test of sphericity. If we fail to reject the null hypothesis, we use the univariate approach; if we reject it, we use the multivariate approach.

• $H_0$: the sphericity assumption holds
• $H_A$: the sphericity assumption is violated

We use the rstatix package to conduct the test.

• The dependent variable is rating.
• The within-group factor is skin.treatment.
• The Error() term is critical in differentiating between a between-subjects and a within-subjects model. It tells R that there is one observation per subject for each level of skin.treatment.

    # install.packages("rstatix") # If you have not already installed it
    library(rstatix)
    anova_test(rating ~ skin.treatment + Error(subject/skin.treatment), data=df)

    Attaching package: ‘rstatix’

    The following object is masked from ‘package:stats’:

        filter

    ANOVA Table (type III tests)

    $ANOVA
              Effect DFn DFd     F     p p<.05  ges
    1 skin.treatment   3  12 5.118 0.017     * 0.43

    $Mauchly's Test for Sphericity
              Effect     W     p p<.05
    1 skin.treatment 0.062 0.207

    $Sphericity Corrections
              Effect   GGe     DF[GG] p[GG] p[GG]<.05   HFe     DF[HF] p[HF]
    1 skin.treatment 0.541 1.62, 6.49 0.051           0.858 2.57, 10.3 0.023
      p[HF]<.05
    1         *

The $p$-value we care about in the output is under “Mauchly’s Test for Sphericity,” for the variable skin.treatment. Because the $p$-value is 0.207, we fail to reject the sphericity assumption at a 5% significance level and use the univariate approach to conduct the repeated measures ANOVA.

### Repeated measures ANOVA - univariate

    aov1 <- aov(rating ~ skin.treatment + Error(subject/skin.treatment), data=df)
    summary(aov1)

    Error: subject
              Df Sum Sq Mean Sq F value Pr(>F)
    Residuals  4   11.8    2.95

    Error: subject:skin.treatment
                   Df Sum Sq Mean Sq F value Pr(>F)
    skin.treatment  3  21.75   7.250   5.118 0.0165 *
    Residuals      12  17.00   1.417
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

You can find the $p$-value at the end of the row of output marked for skin.treatment; it is 0.0165. This is less than 0.05, so we conclude that there is significant evidence of a treatment effect.

### Repeated measures ANOVA - multivariate

If instead the first test had rejected the sphericity assumption, we would have used a multivariate approach for the repeated measures ANOVA. We show here how to do such a test, even though it does not apply to this situation.

We must first reorganize the data into a matrix where each row represents a single subject, and columns represent levels of the treatment factor. This is possible using the tidyr package.

    # install.packages("tidyr") # If you have not already installed it
    library(tidyr)
    multi.data <- spread(df, skin.treatment, rating)
    multi.data <- as.matrix(multi.data[,-c(1)])
    multi.data

         W  X Y Z
    [1,] 7  5 8 4
    [2,] 8 10 7 5
    [3,] 7  6 5 4
    [4,] 7  7 4 5
    [5,] 8  8 6 6

We then create a multivariate model and also set up a variable that defines the design of the study.

    # In this model there are no between-subjects factors, so we write ~ 1:
    multi.ml <- lm(multi.data ~ 1)
    # The design of the study is a single factor with four levels:
    rfactor <- factor(c("f1", "f2", "f3", "f4"))

Conduct the repeated measures ANOVA using a multivariate approach. This requires creating a new model using the Anova() function that calculates ANOVA tables. The car package provides the Anova() function. The parameters have the following meanings.

• idata includes information about the number of levels, in this case four.
• idesign states that rfactor describes a repeated-measures variable.
• type tells Anova() to calculate the “Type-III” sums of squares when forming the ANOVA table.
• multivariate=FALSE, passed to summary(), suppresses output about multivariate statistical tests, which are relevant only when the experimental design includes multiple dependent variables.

    # install.packages("car") # If you have not already installed it
    library(car)
    multi.ml <- Anova(multi.ml, idata=data.frame(rfactor), idesign = ~rfactor, type="III")
    summary(multi.ml, multivariate=FALSE)

    Loading required package: carData

    Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

                Sum Sq num Df Error SS den Df  F value    Pr(>F)
    (Intercept) 806.45      1     11.8      4 273.3729 7.837e-05 ***
    rfactor      21.75      3     17.0     12   5.1176    0.0165 *
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Mauchly Tests for Sphericity

            Test statistic p-value
    rfactor       0.062101 0.20708

    Greenhouse-Geisser and Huynh-Feldt Corrections
     for Departure from Sphericity

            GG eps Pr(>F[GG])
    rfactor 0.5412    0.05068 .
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

              HF eps Pr(>F[HF])
    rfactor 0.858156 0.02319302

Although this test was run just as an example and does not actually apply to this dataset, the output shows a $p$-value of 0.0165 at the end of the first rfactor row. That $p$-value could be compared to a chosen $\alpha$.

(We also see that Mauchly’s test was performed, which is not significant, and is the reason this data actually demands a univariate approach.)