How to conduct a repeated measures ANOVA (in Python, using pandas and pingouin)
Task
In a repeated measures design, each subject receives every treatment, so the measurements are dependent (a within-subjects design). When you have a dataset of responses from such a design, you may wish to check whether the treatment effects differ. How would you conduct a repeated measures ANOVA to answer that question?
Related tasks:
- How to do a one-way analysis of variance (ANOVA)
- How to do a two-way ANOVA test with interaction
- How to do a two-way ANOVA test without interaction
- How to compare two nested linear models using ANOVA
- How to conduct a mixed designs ANOVA
- How to perform an analysis of covariance (ANCOVA)
Solution
We create a hypothetical repeated measures dataset in which each of 5 subjects undergoes all 4 skin treatments and rates each one.
import pandas as pd
df = pd.DataFrame( {
'Subject': [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5],
'Skin Treatment': ['W','X','Y','Z','W','X','Y','Z','W','X',
'Y','Z','W','X','Y','Z','W','X','Y','Z'],
'Rating': [7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6]
} )
df.head()
| | Subject | Skin Treatment | Rating |
|---|---|---|---|
| 0 | 1 | W | 7 |
| 1 | 1 | X | 5 |
| 2 | 1 | Y | 8 |
| 3 | 1 | Z | 4 |
| 4 | 2 | W | 8 |
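A repeated measures ANOVA needs every subject to be measured under every treatment, so it can help to confirm that the design is complete before testing. The following is a minimal sketch of one way to check this: it rebuilds the same data and reshapes it to wide format. The variable name `wide` is introduced here for illustration and is not part of the original solution.

```python
import pandas as pd

# Same hypothetical dataset as above.
df = pd.DataFrame({
    'Subject': [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5],
    'Skin Treatment': ['W','X','Y','Z'] * 5,
    'Rating': [7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6],
})

# Reshape to one row per subject and one column per treatment.
wide = df.pivot(index='Subject', columns='Skin Treatment', values='Rating')
print(wide)

# In a complete within-subjects design, no cell is missing.
n_missing = int(wide.isna().sum().sum())
print('Missing cells:', n_missing)
```

If any cell were missing, that subject would not have received every treatment, and the data would need attention before running the test.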
Before we conduct a repeated measures ANOVA, we need to decide which approach to use: univariate or multivariate. We decide this using Mauchly's test of sphericity, which checks whether the variances of the differences between all pairs of treatments are equal. If we fail to reject the null hypothesis, we use the univariate approach.
- $H_0$: the sphericity assumption holds
- $H_A$: the sphericity assumption is violated
We use the `pingouin` statistics package to conduct the test. Most of the parameters below are self-explanatory, except that `dv` stands for dependent variable.
import pingouin as pg
pg.sphericity( dv='Rating', within='Skin Treatment', subject='Subject', method='mauchly', data=df )
SpherResults(spher=True, W=0.06210054956238558, chi2=7.565056754547507, dof=5, pval=0.20708214225927316)
Since the $p$ value for Skin Treatment is about $0.2071$, which is greater than $0.05$, we fail to reject the null hypothesis of sphericity at the 5% significance level, so we use the univariate approach to conduct the repeated measures ANOVA.
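To give a feel for what sphericity measures, the sketch below computes the Greenhouse-Geisser epsilon by hand with NumPy. This is an illustration added here, not part of the original solution; it assumes the standard formula based on the double-centered covariance matrix of the treatments. Epsilon equals 1 under perfect sphericity and falls toward $1/(k-1)$ as the violation grows, where $k$ is the number of treatments. The names `wide`, `S_c`, and `eps_gg` are chosen for this example.

```python
import numpy as np
import pandas as pd

# Same hypothetical dataset as above.
df = pd.DataFrame({
    'Subject': [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5],
    'Skin Treatment': ['W','X','Y','Z'] * 5,
    'Rating': [7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6],
})

# One row per subject, one column per treatment.
wide = df.pivot(index='Subject', columns='Skin Treatment', values='Rating')
k = wide.shape[1]  # number of treatments

# Sample covariance matrix of the treatments.
S = wide.cov().to_numpy()

# Double-center: subtract row and column means, add back the grand mean.
S_c = S - S.mean(axis=0, keepdims=True) - S.mean(axis=1, keepdims=True) + S.mean()

# Greenhouse-Geisser epsilon.
eps_gg = np.trace(S_c) ** 2 / ((k - 1) * np.sum(S_c ** 2))
print(round(eps_gg, 4))
```

For this dataset the value comes out at about 0.5412, which agrees with the `eps` column that pingouin reports in the ANOVA table.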
# Compute a repeated measures ANOVA using a function pingouin adds to our DataFrame:
df.rm_anova( dv='Rating', within='Skin Treatment', subject='Subject', detailed=False )
| | Source | ddof1 | ddof2 | F | p-unc | ng2 | eps |
|---|---|---|---|---|---|---|---|
| 0 | Skin Treatment | 3 | 12 | 5.117647 | 0.016501 | 0.430267 | 0.541199 |
Since the $p$ value of about $0.017$ is less than $0.05$, we reject the null hypothesis of equal mean ratings across treatments at the 5% significance level; there is significant evidence of a treatment effect.
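To see where the $F$ statistic comes from, the following sketch reproduces it by hand using the usual sums-of-squares decomposition for a one-way repeated measures design (total = treatment + subject + error). It is an illustration only, assuming SciPy is available for the $F$ distribution; the variable names are chosen for this example.

```python
import pandas as pd
from scipy import stats

# Same hypothetical dataset as above.
df = pd.DataFrame({
    'Subject': [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5],
    'Skin Treatment': ['W','X','Y','Z'] * 5,
    'Rating': [7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6],
})

n = df['Subject'].nunique()          # number of subjects
k = df['Skin Treatment'].nunique()   # number of treatments
grand_mean = df['Rating'].mean()

# Partition the total variation.
ss_total = ((df['Rating'] - grand_mean) ** 2).sum()
treatment_means = df.groupby('Skin Treatment')['Rating'].mean()
subject_means = df.groupby('Subject')['Rating'].mean()
ss_treatment = n * ((treatment_means - grand_mean) ** 2).sum()
ss_subject = k * ((subject_means - grand_mean) ** 2).sum()
ss_error = ss_total - ss_treatment - ss_subject

# Degrees of freedom, F statistic, and upper-tail p value.
df_treatment = k - 1
df_error = (k - 1) * (n - 1)
F = (ss_treatment / df_treatment) / (ss_error / df_error)
p = stats.f.sf(F, df_treatment, df_error)
print(round(F, 6), round(p, 6))
```

Removing the between-subject variation (`ss_subject`) from the error term is what distinguishes this test from an ordinary one-way ANOVA on the same numbers.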
Note: If there is more than one repeated measures factor, you can pass a list of them to the `within` parameter and conduct the test the same way (pingouin supports up to two within-subject factors).
Content last modified on 24 July 2023.
Contributed by Krtin Juneja (KJUNEJA@falcon.bentley.edu)