How to do a one-way analysis of variance (ANOVA) (in R)
Task
If we have multiple independent samples of the same quantity (such as students’ SAT scores from several different schools), we may want to test whether the groups they came from all have the same mean. Analysis of Variance (ANOVA) can determine whether at least one of the group means differs significantly from the others, although it does not say which ones differ. How can we do an ANOVA?
Related tasks:
- How to do a two-sided hypothesis test for two sample means (which is just an ANOVA with only two samples)
- How to do a two-way ANOVA test with interaction
- How to do a two-way ANOVA test without interaction
- How to compare two nested linear models
- How to conduct a mixed designs ANOVA
- How to conduct a repeated measures ANOVA
- How to perform an analysis of covariance (ANCOVA)
- How to do a Kruskal-Wallis test
Solution
R expects you to have all the samples in one vector, and the groups they came from in a separate, categorical vector. So, for example, if we had SAT scores from four different schools (named A, B, C, and D), then our data might be arranged like this.
SAT.scores <- c(
1100, 1250, 1390, 970, 1510, 1010, 1050, 1090, 1110,
900, 1550, 1300, 1270, 1210, 900, 850, 1110, 1070, 910, 920
)
school.names <- c(
'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B',
'C', 'C', 'C', 'C', 'C', 'D', 'D', 'D', 'D', 'D', 'D'
)
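Before running the test, it can help to look at the size and mean of each group. The following is a minimal sketch, assuming the same data is stored in a single data frame instead of two vectors; the names sat, score, school, group.sizes, and group.means are chosen here for illustration and are not part of the original solution.

```r
# Same data as above, stored as one data frame (an assumed layout; the
# solution itself keeps two separate vectors).
sat <- data.frame(
  score = c(
    1100, 1250, 1390, 970, 1510, 1010, 1050, 1090, 1110,
    900, 1550, 1300, 1270, 1210, 900, 850, 1110, 1070, 910, 920
  ),
  # 5 scores from school A, 4 from B, 5 from C, 6 from D
  school = factor( rep( c( 'A', 'B', 'C', 'D' ), times = c( 5, 4, 5, 6 ) ) )
)

# Number of observations and mean score for each school
group.sizes <- table( sat$school )
group.means <- tapply( sat$score, sat$school, mean )
print( group.sizes )
print( group.means )
```

The group means here are 1244, 1065, 1246, and 960 for schools A through D. With this layout, the ANOVA call would be aov( score ~ school, data = sat ).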
ANOVA tests the null hypothesis that all group means are equal.
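You choose a significance level, α, which is the probability of a Type I error (rejecting the null hypothesis when it is actually true). This example uses α = 0.05.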
# Run a one-way ANOVA and print a summary of all the output
result <- aov( SAT.scores ~ school.names )
summary( result )
             Df Sum Sq Mean Sq F value Pr(>F)
school.names  3 321715  107238   3.689 0.0342 *
Residuals    16 465140   29071
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
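The p-value is in the first row of the Pr(>F) column: 0.0342. At the 5% significance level, the p-value is below the threshold, so we reject the null hypothesis that the mean SAT scores are equal across the four schools.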
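To see where the numbers in the summary table come from, the F statistic can be rebuilt from the group means. This is a minimal sketch and not part of the original solution; the names grand.mean, ss.between, ss.within, F.stat, and p.by.hand are chosen here for illustration.

```r
SAT.scores <- c(
  1100, 1250, 1390, 970, 1510, 1010, 1050, 1090, 1110,
  900, 1550, 1300, 1270, 1210, 900, 850, 1110, 1070, 910, 920
)
school.names <- rep( c( 'A', 'B', 'C', 'D' ), times = c( 5, 4, 5, 6 ) )

grand.mean  <- mean( SAT.scores )
group.means <- tapply( SAT.scores, school.names, mean )
group.sizes <- tapply( SAT.scores, school.names, length )

# Between-group sum of squares: how far each group mean sits from the
# grand mean, weighted by the group size
ss.between <- sum( group.sizes * ( group.means - grand.mean )^2 )

# Within-group sum of squares: ave() gives each observation its own
# group's mean, so this sums squared deviations inside each group
ss.within <- sum( ( SAT.scores - ave( SAT.scores, school.names ) )^2 )

df.between <- length( group.means ) - 1
df.within  <- length( SAT.scores ) - length( group.means )

# F is the ratio of the two mean squares; the p-value is the upper tail
# of the F distribution with these degrees of freedom
F.stat    <- ( ss.between / df.between ) / ( ss.within / df.within )
p.by.hand <- pf( F.stat, df.between, df.within, lower.tail = FALSE )
print( c( F = F.stat, p = p.by.hand ) )
```

With the data above, this should reproduce the sums of squares 321715 and 465140 and the F value of about 3.689 shown in the summary table.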
You could also ask R to do the comparison for you, though getting the p-value out of the ANOVA summary is fiddly:
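alpha <- 0.05
# unlist() flattens the summary table column by column, so the ninth
# entry is the first row of the Pr(>F) column
p.value <- unname( unlist( summary( result ) ) )[9]
p.value < alpha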
[1] TRUE
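Indexing by position 9 depends on the exact shape of the summary table. A possibly more readable alternative, sketched here under the assumption that the model has a single term (as in this one-way example), is to pull the Pr(>F) column out by name; the name anova.table is chosen here for illustration.

```r
SAT.scores <- c(
  1100, 1250, 1390, 970, 1510, 1010, 1050, 1090, 1110,
  900, 1550, 1300, 1270, 1210, 900, 850, 1110, 1070, 910, 920
)
school.names <- rep( c( 'A', 'B', 'C', 'D' ), times = c( 5, 4, 5, 6 ) )
result <- aov( SAT.scores ~ school.names )

# summary() of an aov fit is a list whose first element is the ANOVA
# table, a data frame with a column named "Pr(>F)"
anova.table <- summary( result )[[1]]

# The first row belongs to the school.names term; the Residuals row
# has no p-value (it is NA)
alpha   <- 0.05
p.value <- anova.table[[ 'Pr(>F)' ]][1]
print( p.value < alpha )
```

Selecting the column by name means the code does not need to change if the table's column layout is counted differently, though the row is still picked by position.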
Content last modified on 24 July 2023.
Contributed by Nathan Carter (ncarter@bentley.edu)