How to conduct a repeated measures ANOVA (in Python, using pandas and pingouin)
Task
In a repeated measures design, every subject receives multiple treatments, so the measurements are dependent (a within-subjects design). Given a dataset of such responses, you may wish to check whether the treatment effects differ. How would you conduct a repeated measures ANOVA to answer that question?
Related tasks:
- How to do a one-way analysis of variance (ANOVA)
- How to do a two-way ANOVA test with interaction
- How to do a two-way ANOVA test without interaction
- How to compare two nested linear models using ANOVA
- How to conduct a mixed designs ANOVA
- How to perform an analysis of covariance (ANCOVA)
Solution
We create a hypothetical repeated measures dataset in which each of 5 subjects undergoes all 4 skin treatments and rates each one.
import pandas as pd

# Long format: one row per (subject, treatment) pair
df = pd.DataFrame({
    'Subject': [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5],
    'Skin Treatment': ['W','X','Y','Z','W','X','Y','Z','W','X',
                       'Y','Z','W','X','Y','Z','W','X','Y','Z'],
    'Rating': [7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6]
})
df.head()
|   | Subject | Skin Treatment | Rating |
|---|---|---|---|
| 0 | 1 | W | 7 |
| 1 | 1 | X | 5 |
| 2 | 1 | Y | 8 |
| 3 | 1 | Z | 4 |
| 4 | 2 | W | 8 |
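Because the analysis assumes every subject receives every treatment exactly once, it can help to confirm that before going further. The following is a minimal sketch using only pandas; the names counts, is_balanced, and wide are our own and do not appear elsewhere in this solution.

```python
import pandas as pd

# Same hypothetical dataset as above, rebuilt so this block runs on its own
df = pd.DataFrame({
    'Subject': [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5],
    'Skin Treatment': ['W','X','Y','Z'] * 5,
    'Rating': [7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6]
})

# Count how many ratings each subject gave for each treatment
counts = pd.crosstab(df['Subject'], df['Skin Treatment'])

# Balanced within-subjects design: exactly one rating per cell
is_balanced = bool((counts == 1).all().all())
print('Balanced design:', is_balanced)

# Wide view (one row per subject, one column per treatment) for inspection
wide = df.pivot(index='Subject', columns='Skin Treatment', values='Rating')
print(wide)
```

The wide view is only for inspection here; the functions used below take the long-format DataFrame directly.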
Before we conduct a repeated measures ANOVA, we need to decide which approach to use: univariate or multivariate. We decide this using Mauchly's test of sphericity, which has the following hypotheses.
Null hypothesis (H0): the sphericity assumption holds.
Alternative hypothesis (HA): the sphericity assumption is violated.
If we fail to reject the null hypothesis, then we use the univariate approach.
We use the pingouin statistics package to conduct the test. Most of the parameters below are self-explanatory, except that dv stands for dependent variable.
import pingouin as pg
pg.sphericity( dv='Rating', within='Skin Treatment', subject='Subject', method='mauchly', data=df )
SpherResults(spher=True, W=0.06210054956238558, chi2=7.565056754547507, dof=5, pval=0.20708214225927316)
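Since the p-value for Skin Treatment is about 0.2071, which is above the conventional 0.05 significance level, we fail to reject the null hypothesis that sphericity holds, so we use the univariate approach.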
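For intuition, the W and chi2 values reported by the sphericity test can be reproduced from the sample covariance matrix of the four treatments. The following is a minimal sketch of one common textbook formulation of Mauchly's test, written with numpy and pandas; it is not pingouin's source code, and the names wide, centered, eig, and chi2_stat are our own. It stops short of the p-value, which depends on the approximation a given package applies.

```python
import numpy as np
import pandas as pd

# Same hypothetical dataset as above, rebuilt so this block runs on its own
df = pd.DataFrame({
    'Subject': [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5],
    'Skin Treatment': ['W','X','Y','Z'] * 5,
    'Rating': [7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6]
})

# One row per subject, one column per treatment
wide = df.pivot(index='Subject', columns='Skin Treatment', values='Rating')
n, k = wide.shape          # 5 subjects, 4 treatments
d = k - 1                  # number of independent contrasts

# Sample covariance of the treatments, then double-centered
S = wide.cov().to_numpy()
row_means = S.mean(axis=1, keepdims=True)
centered = S - row_means - row_means.T + S.mean()

# The centered matrix has one (numerically) zero eigenvalue; drop it
eig = np.linalg.eigvalsh(centered)[1:]

# Mauchly's W: product of eigenvalues over (their mean) ** d
W = np.prod(eig) / (eig.sum() / d) ** d

# Chi-square approximation for the test statistic
f = 1 - (2 * d**2 + d + 2) / (6 * d * (n - 1))
chi2_stat = -(n - 1) * f * np.log(W)
dof = d * (d + 1) // 2 - 1

print(W, chi2_stat, dof)
```

A W close to 1 indicates that the variances of the pairwise treatment differences are similar; values near 0 indicate departure from sphericity, although with only 5 subjects the test has little power to detect it.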
# Compute a repeated measures ANOVA using a function pingouin adds to our DataFrame:
df.rm_anova( dv='Rating', within='Skin Treatment', subject='Subject', detailed=False )
|   | Source | ddof1 | ddof2 | F | p-unc | ng2 | eps |
|---|---|---|---|---|---|---|---|
| 0 | Skin Treatment | 3 | 12 | 5.117647 | 0.016501 | 0.430267 | 0.541199 |
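Since the uncorrected p-value (p-unc) of about 0.0165 is below the conventional 0.05 significance level, we reject the null hypothesis that all four skin treatments have the same mean rating and conclude that at least one treatment differs from the others.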
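The F, p-unc, and ng2 values in the table above can be checked by hand with the standard sums-of-squares decomposition for a one-way repeated measures design. The following is a minimal sketch under that assumption, using pandas and scipy; the variable names (ss_treat, ss_subj, ss_error, and so on) are our own, and this is not a reproduction of pingouin's internals.

```python
import pandas as pd
from scipy import stats

# Same hypothetical dataset as above, rebuilt so this block runs on its own
df = pd.DataFrame({
    'Subject': [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5],
    'Skin Treatment': ['W','X','Y','Z'] * 5,
    'Rating': [7,5,8,4,8,10,7,5,7,6,5,4,7,7,4,5,8,8,6,6]
})

n = df['Subject'].nunique()          # number of subjects
k = df['Skin Treatment'].nunique()   # number of treatments
grand_mean = df['Rating'].mean()

# Total variation, then the parts explained by treatments and by subjects
ss_total = ((df['Rating'] - grand_mean) ** 2).sum()
treat_means = df.groupby('Skin Treatment')['Rating'].mean()
subj_means = df.groupby('Subject')['Rating'].mean()
ss_treat = n * ((treat_means - grand_mean) ** 2).sum()
ss_subj = k * ((subj_means - grand_mean) ** 2).sum()

# What remains after removing treatment and subject effects is the error
ss_error = ss_total - ss_treat - ss_subj

# Degrees of freedom match the ddof1 and ddof2 columns
dof1, dof2 = k - 1, (k - 1) * (n - 1)
F = (ss_treat / dof1) / (ss_error / dof2)
p_value = stats.f.sf(F, dof1, dof2)

# Generalized eta squared for a single within-subjects factor
ng2 = ss_treat / (ss_treat + ss_subj + ss_error)

print(F, p_value, ng2)
```

Removing the between-subject variation (ss_subj) from the error term is what distinguishes this test from an ordinary one-way ANOVA on the same ratings.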
Note: If there is more than one repeated measures factor, you can pass a list of them to the within parameter and conduct the test.
Content last modified on 24 July 2023.
Contributed by Krtin Juneja (KJUNEJA@falcon.bentley.edu)