How to perform an analysis of covariance (ANCOVA) (in Python, using pingouin)
Task
Recall that covariates are variables that may be related to the outcome but are unaffected by treatment assignment. In a randomized experiment with one or more observed covariates, an analysis of covariance (ANCOVA) addresses this question: How would the mean outcome in each treatment group change if all groups were equal with respect to the covariate? The goal is to remove any variability in the outcome associated with the covariate from the unexplained variability used to determine statistical significance.
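As a rough sketch of what this means, the one-factor, one-covariate case is commonly written as the linear model below, together with the covariate-adjusted group mean it implies. The symbols are assumptions introduced here for illustration and are not part of pingouin: $y_{ij}$ is the outcome for unit $j$ in group $i$, $x_{ij}$ is its covariate value, $\tau_i$ is the effect of group $i$, $\beta$ is a slope assumed to be common to all groups, $\bar{x}$ is the overall covariate mean, and $\bar{y}_i$, $\bar{x}_i$ are the group means.

```latex
y_{ij} = \mu + \tau_i + \beta\,(x_{ij} - \bar{x}) + \varepsilon_{ij},
\qquad
\bar{y}_i^{\text{adj}} = \bar{y}_i - \hat{\beta}\,(\bar{x}_i - \bar{x})
```

The adjusted mean $\bar{y}_i^{\text{adj}}$ estimates what the mean outcome in group $i$ would be if that group's average covariate value equaled the overall average.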
Related tasks:
- How to do a one-way analysis of variance (ANOVA)
- How to compare two nested linear models
- How to conduct a mixed designs ANOVA
- How to conduct a repeated measures ANOVA
Solution
The solution below uses an example dataset about car design and fuel consumption from a 1974 issue of Motor Trend magazine. (See how to quickly load some sample data.)
from rdatasets import data
df = data('mtcars')
Let’s use ANCOVA to check the effect of the engine type (the variable `vs`, where 0 = V-shaped and 1 = straight) on miles per gallon (`mpg`), treating the weight of the car (`wt`) as a covariate. We will use the `ancova` function from the `pingouin` package to conduct the test.
from pingouin import ancova
ancova(data=df, dv='mpg', covar='wt', between='vs')
| | Source | SS | DF | F | p-unc | np2 |
|---|---|---|---|---|---|---|
| 0 | vs | 54.228061 | 1 | 7.017656 | 1.292580e-02 | 0.194839 |
| 1 | wt | 405.425409 | 1 | 52.466123 | 5.632548e-08 | 0.644024 |
| 2 | Residual | 224.093877 | 29 | NaN | NaN | NaN |
The $p$-value for each variable is in the `p-unc` column.
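To see where the SS, F, and np2 columns come from, here is a minimal standard-library sketch that compares residual sums of squares of nested models: the full model (group means plus a common slope), a covariate-only model, and a group-only model. This is an illustration under stated assumptions, not pingouin's implementation. The helper names `centered_sums` and `ancova_one_covariate` are hypothetical, and the three lists are transcribed from the standard mtcars data on the assumption that they match what `data('mtcars')` returns. The $p$-values are omitted because they need an F distribution, which the standard library does not provide.

```python
# mpg, wt and vs columns transcribed from the standard mtcars data, in row
# order (assumed to match the DataFrame loaded above).
mpg = [21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,
       16.4, 17.3, 15.2, 10.4, 10.4, 14.7, 32.4, 30.4, 33.9, 21.5, 15.5,
       15.2, 13.3, 19.2, 27.3, 26.0, 30.4, 15.8, 19.7, 15.0, 21.4]
wt = [2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.440,
      3.440, 4.070, 3.730, 3.780, 5.250, 5.424, 5.345, 2.200, 1.615, 1.835,
      2.465, 3.520, 3.435, 3.840, 3.845, 1.935, 2.140, 1.513, 3.170, 2.770,
      3.570, 2.780]
vs = [0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
      0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1]


def centered_sums(y, x):
    """Return (Syy, Sxx, Sxy): sums of squares and cross-products about the means."""
    n = len(y)
    ybar, xbar = sum(y) / n, sum(x) / n
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return syy, sxx, sxy


def ancova_one_covariate(y, x, g):
    """Hypothetical helper: ANCOVA table for one factor g and one covariate x."""
    n, levels = len(y), sorted(set(g))
    # Pool the centered sums within each group (group means removed).
    syy_w = sxx_w = sxy_w = 0.0
    for level in levels:
        ys = [yi for yi, gi in zip(y, g) if gi == level]
        xs = [xi for xi, gi in zip(x, g) if gi == level]
        syy, sxx, sxy = centered_sums(ys, xs)
        syy_w, sxx_w, sxy_w = syy_w + syy, sxx_w + sxx, sxy_w + sxy
    syy_t, sxx_t, sxy_t = centered_sums(y, x)

    rss_full = syy_w - sxy_w ** 2 / sxx_w        # group means + common slope
    rss_covar_only = syy_t - sxy_t ** 2 / sxx_t  # covariate, no group term
    rss_group_only = syy_w                       # group means, no covariate

    df_resid = n - len(levels) - 1
    ms_resid = rss_full / df_resid
    effects = [("between", rss_covar_only - rss_full, len(levels) - 1),
               ("covar", rss_group_only - rss_full, 1)]
    table = {}
    for name, ss, df in effects:
        # Each effect is tested after adjusting for the other one.
        table[name] = {"SS": ss, "DF": df, "F": (ss / df) / ms_resid,
                       "np2": ss / (ss + rss_full)}
    table["Residual"] = {"SS": rss_full, "DF": df_resid}
    return table


table = ancova_one_covariate(mpg, wt, vs)
for source, row in table.items():
    print(source, {k: round(v, 6) for k, v in row.items()})
```

Here the `between` row corresponds to `vs` and the `covar` row to `wt` in the table above. Each sum of squares is the increase in residual variation when that one term is dropped from the full model, which is why each test is read as holding the other variable constant.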
The $p$-value for the `wt` variable tests the null hypothesis, “The quantities `wt` and `mpg` are not related if we hold `vs` constant.” Since it is below 0.05, we reject the null hypothesis and conclude that `wt` is a significant predictor of `mpg`, even among cars with the same engine type.
The $p$-value for the `vs` variable tests the null hypothesis, “The quantities `vs` and `mpg` are not related if we hold `wt` constant.” Since it is below 0.05, we reject the null hypothesis and conclude that `vs` is a significant predictor of `mpg` even among cars with equal weight (`wt`).
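To make "among cars with equal weight" concrete, the sketch below computes covariate-adjusted group means: each group's mean `mpg` is shifted along the pooled within-group slope to the overall average weight. It uses only the standard library and is an illustration, not something pingouin's `ancova` returns. The function name `adjusted_means` is hypothetical, and the `(mpg, wt, vs)` triples are transcribed from the standard mtcars data on the assumption that they match the loaded DataFrame.

```python
# (mpg, wt, vs) triples transcribed from the standard mtcars data, in row order.
cars = [
    (21.0, 2.620, 0), (21.0, 2.875, 0), (22.8, 2.320, 1), (21.4, 3.215, 1),
    (18.7, 3.440, 0), (18.1, 3.460, 1), (14.3, 3.570, 0), (24.4, 3.190, 1),
    (22.8, 3.150, 1), (19.2, 3.440, 1), (17.8, 3.440, 1), (16.4, 4.070, 0),
    (17.3, 3.730, 0), (15.2, 3.780, 0), (10.4, 5.250, 0), (10.4, 5.424, 0),
    (14.7, 5.345, 0), (32.4, 2.200, 1), (30.4, 1.615, 1), (33.9, 1.835, 1),
    (21.5, 2.465, 1), (15.5, 3.520, 0), (15.2, 3.435, 0), (13.3, 3.840, 0),
    (19.2, 3.845, 0), (27.3, 1.935, 1), (26.0, 2.140, 0), (30.4, 1.513, 1),
    (15.8, 3.170, 0), (19.7, 2.770, 0), (15.0, 3.570, 0), (21.4, 2.780, 1),
]


def adjusted_means(rows):
    """Hypothetical helper: return (raw means, adjusted means, common slope).

    rows holds (outcome, covariate, group) triples. The adjustment assumes
    one slope shared by all groups, as in the ANCOVA model.
    """
    grand_x = sum(x for _, x, _ in rows) / len(rows)

    # Split the observations by group.
    groups = {}
    for y, x, g in rows:
        groups.setdefault(g, []).append((y, x))

    # Group means, plus pooled within-group sums for the common slope.
    means = {}
    sxx_w = sxy_w = 0.0
    for g, pairs in groups.items():
        ybar = sum(y for y, _ in pairs) / len(pairs)
        xbar = sum(x for _, x in pairs) / len(pairs)
        means[g] = (ybar, xbar)
        sxx_w += sum((x - xbar) ** 2 for _, x in pairs)
        sxy_w += sum((x - xbar) * (y - ybar) for y, x in pairs)
    slope = sxy_w / sxx_w

    raw = {g: ybar for g, (ybar, _) in means.items()}
    # Move each group mean along the common slope to the overall mean weight.
    adjusted = {g: ybar - slope * (xbar - grand_x)
                for g, (ybar, xbar) in means.items()}
    return raw, adjusted, slope


raw, adjusted, slope = adjusted_means(cars)
for g in sorted(raw):
    print(f"vs={g}: raw mean mpg {raw[g]:.2f}, "
          f"adjusted to average weight {adjusted[g]:.2f}")
```

The gap between the two adjusted means is the `vs` effect that the ANCOVA tests. It should come out smaller than the gap between the raw means, because the straight-engine cars in this dataset are lighter on average and part of their raw advantage in `mpg` is attributable to weight.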
Note: Unfortunately, a two-factor ANCOVA is not possible in pingouin. However, a model with more than one covariate is possible, as you can provide a list as the `covar` parameter when calling `ancova`.
Content last modified on 24 July 2023.
Contributed by Krtin Juneja (KJUNEJA@falcon.bentley.edu)