# How to do a one-way analysis of variance (ANOVA) (in Python, using SciPy)


If we have multiple independent samples of the same quantity (such as students’ SAT scores from several different schools), we may want to test whether the samples all share the same mean. A one-way analysis of variance (ANOVA) tests whether at least two of the sample means differ significantly. How can we do an ANOVA?

## Solution

Let’s assume we have our samples in several different Python lists, although any list-like object, including a pandas Series, is also supported. Here I’ll construct some made-up data about SAT scores at four different schools.

school1_SATs = [ 1100, 1250, 1390, 970, 1510 ]
school2_SATs = [ 1010, 1050, 1090, 1110 ]
school3_SATs = [ 900, 1550, 1300, 1270, 1210 ]
school4_SATs = [ 900, 850, 1110, 1070, 910, 920 ]
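
Before testing anything, it can help to look at the sample means that the test will compare. The following is a minimal sketch, assuming NumPy is installed (SciPy depends on it); the dictionary `samples` and the name `group_means` are introduced here only for illustration and simply repeat the made-up data above.

```python
import numpy as np

# Same made-up SAT scores as above, repeated so this snippet runs on its own.
samples = {
    "school1": [1100, 1250, 1390, 970, 1510],
    "school2": [1010, 1050, 1090, 1110],
    "school3": [900, 1550, 1300, 1270, 1210],
    "school4": [900, 850, 1110, 1070, 910, 920],
}

# Sample mean for each school, stored as plain Python floats.
group_means = {name: float(np.mean(scores)) for name, scores in samples.items()}

# Print each school's name, sample size, and sample mean.
for name, scores in samples.items():
    print(name, len(scores), group_means[name])
```

The sample means are not identical, but that alone says nothing about significance; the ANOVA weighs the spread between the means against the spread within each school.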


ANOVA tests the null hypothesis that all group means are equal. You choose $\alpha$, the probability of a Type I error (a false positive: rejecting $H_0$ when it’s actually true). I will use $\alpha=0.05$ in this example.
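
Written out for the four schools in this example, the hypotheses are as follows, where $\mu_i$ (notation introduced here for illustration) denotes the true mean SAT score at school $i$:

$$
H_0:\ \mu_1 = \mu_2 = \mu_3 = \mu_4
\qquad\text{versus}\qquad
H_a:\ \mu_i \neq \mu_j \text{ for at least one pair } i \neq j
$$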

from scipy import stats

alpha = 0.05

# Run a one-way ANOVA and print out alpha, the p value,
# and whether the comparison says to reject the null hypothesis.
F_statistic, p_value = stats.f_oneway(
    school1_SATs, school2_SATs, school3_SATs, school4_SATs)
reject_H0 = p_value < alpha
alpha, p_value, reject_H0

(0.05, 0.0342311478489849, True)


Because the p value (about 0.034) is less than $\alpha$, the result is to reject $H_0$, and therefore conclude that at least one pair of means is statistically significantly different.
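
To see where the F statistic comes from, the sketch below recomputes it by hand from the between-group and within-group sums of squares and checks the result against `stats.f_oneway`. It is an illustration under the standard one-way ANOVA formulas, not a replacement for the SciPy call; the variable names (`ss_between`, `ss_within`, `F_manual`, `p_manual`) are hypothetical and chosen only for this example.

```python
import numpy as np
from scipy import stats

# Same made-up SAT scores as above, as float arrays so this runs on its own.
groups = [
    np.array([1100, 1250, 1390, 970, 1510], dtype=float),
    np.array([1010, 1050, 1090, 1110], dtype=float),
    np.array([900, 1550, 1300, 1270, 1210], dtype=float),
    np.array([900, 850, 1110, 1070, 910, 920], dtype=float),
]

k = len(groups)                          # number of groups
n_total = sum(len(g) for g in groups)    # total number of observations
grand_mean = np.concatenate(groups).mean()

# Between-group sum of squares: how far each group mean is from the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: how far observations are from their own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = k - 1
df_within = n_total - k

# F is the ratio of the two mean squares.
F_manual = (ss_between / df_between) / (ss_within / df_within)
# The p value is the upper tail of the F distribution at F_manual.
p_manual = stats.f.sf(F_manual, df_between, df_within)

# Compare with SciPy's built-in one-way ANOVA.
F_scipy, p_scipy = stats.f_oneway(*groups)
print(F_manual, p_manual)
print(np.isclose(F_manual, F_scipy), np.isclose(p_manual, p_scipy))
```

A large F means the group means are spread out more than the variation inside each group would explain, which is what produces a small p value.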