How to use Bonferroni’s Correction method
Description
If we run a one-way ANOVA test and find that there is a significant difference between population means, we might want to know which means are actually different from each other. One way to do so is with the Bonferroni correction. This method runs a $t$-test for each pair of categories, using a stricter significance level for each individual test so that the overall chance of a false positive across all of the tests stays at or below $\alpha$.
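The reasoning behind the correction can be sketched briefly. This is the standard textbook argument, with $k$ (the number of groups) and $m$ (the number of pairwise comparisons) introduced here only for illustration. With $k$ groups there are $m = k(k-1)/2$ pairs, and each pair is tested at level $\alpha/m$. The chance of at least one false rejection is then bounded by the sum of the individual chances:

$$P(\text{at least one false rejection}) \le \sum_{i=1}^{m} \frac{\alpha}{m} = \alpha$$

For three groups, $m = 3$, so with $\alpha = 0.05$ each pairwise test is effectively run at $0.05/3 \approx 0.0167$.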
Related tasks:
- How to do a one-way analysis of variance (ANOVA)
- How to do a two-sided hypothesis test for two sample means (which is just an ANOVA with only two samples)
- How to do a Kruskal-Wallis test
Solution, in R
Let’s assume that you have already done an analysis of variance (ANOVA). (See how to do a one-way analysis of variance (ANOVA) for details.)
As an example, we will use the fake data below, which looks at the number of transactions at an ice cream shop on the weekends. Let’s assume that we chose $\alpha$ to be 0.05 in that ANOVA.
# Store our fake data in vectors. (You can replace this with your real data.)
num.transactions <- c(91, 134, 98, 105, 93, 89, 145, 132, 109,
                      94, 105, 99, 84, 128, 120, 115, 118)
days <- c("Fri", "Sun", "Sun", "Sat", "Fri", "Fri", "Sat", "Sun", "Sun",
          "Fri", "Sat", "Sat", "Fri", "Sun", "Fri", "Sat", "Sun")
# Perform an ANOVA and print a summary.
model <- aov(num.transactions ~ days)
summary(model)
            Df Sum Sq Mean Sq F value Pr(>F)
days         2   1965   982.7   4.348  0.034 *
Residuals   14   3164   226.0
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The top-right value in the output is the $p$-value for the test, $0.034$. Because it is below our chosen significance level of $\alpha=0.05$, we conclude that the mean number of transactions at the ice cream shop differs significantly between at least two of these weekend days. But specifically which two, or is it more than two?
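Before running any post hoc test, it can help to see exactly which pairs of days will be compared. The following is a minimal sketch in base R, not part of the original solution; the names day.pairs, num.comparisons, and alpha.per.test are chosen here for illustration only.

```r
# Day labels from the example above, repeated so this sketch runs on its own.
days <- c("Fri", "Sun", "Sun", "Sat", "Fri", "Fri", "Sat", "Sun", "Sun",
          "Fri", "Sat", "Sat", "Fri", "Sun", "Fri", "Sat", "Sun")

# Each column of this matrix is one pair of days to compare.
day.pairs <- combn(sort(unique(days)), 2)
print(day.pairs)

# Number of pairwise comparisons among the distinct days.
num.comparisons <- ncol(day.pairs)
print(num.comparisons)

# Per-comparison significance level under a Bonferroni correction,
# assuming a family-wise alpha of 0.05 as in the ANOVA above.
alpha <- 0.05
alpha.per.test <- alpha / num.comparisons
print(alpha.per.test)
```

With three distinct days there are three pairs, so each individual comparison is held to a stricter standard than 0.05.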
We’ll use the PostHocTest() function in the DescTools package, and specify that we want to use the Bonferroni method to make the confidence intervals for each pair of days. We’ll keep $\alpha$ at 0.05, but with the Bonferroni correction this is now a family-wise level: the probability of making at least one Type I error across all of the tests below is at most 0.05, rather than each test having its own 0.05 chance.
# install.packages("DescTools") # If you have not already installed it
library(DescTools)
# Run the test and print the confidence intervals for each pair of days
PostHocTest(model, method = "bonferroni", conf.level = 0.95)
  Posthoc multiple comparisons of means : Bonferroni
    95% family-wise confidence level

$days
             diff     lwr.ci   upr.ci   pval
Sat-Fri 18.633333  -6.108523 43.37519 0.1798
Sun-Fri 24.666667   1.076232 48.25710 0.0392 *
Sun-Sat  6.033333 -18.708523 30.77519 1.0000

---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In the output, R has highlighted the second row for us by placing a * after it. That is the one row where the $p$-value (in the final column) is below our chosen $\alpha=0.05$. These $p$-values are already Bonferroni-adjusted, so we compare them directly to $\alpha$. Therefore, the only significant difference in mean number of transactions is between Sundays and Fridays. Notice also that the confidence interval in that row (from lwr.ci to upr.ci) does not include zero. (In that particular row, the confidence interval is $(1.076232, 48.25710)$.)
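As a cross-check that does not need DescTools, the sketch below uses base R's pairwise.t.test() with a pooled standard deviation and Bonferroni-adjusted $p$-values, then rebuilds the Sun-Fri confidence interval by hand. It assumes that the post hoc test above pools the residual variance from the ANOVA, which is consistent with the output shown but is an assumption rather than something stated by the package here. The variable names are illustrative.

```r
# Example data from above, repeated so this sketch runs on its own.
num.transactions <- c(91, 134, 98, 105, 93, 89, 145, 132, 109,
                      94, 105, 99, 84, 128, 120, 115, 118)
days <- c("Fri", "Sun", "Sun", "Sat", "Fri", "Fri", "Sat", "Sun", "Sun",
          "Fri", "Sat", "Sat", "Fri", "Sun", "Fri", "Sat", "Sun")

# Cross-check 1: pairwise t-tests with a pooled standard deviation and
# Bonferroni-adjusted p-values, using only base R.
check <- pairwise.t.test(num.transactions, days,
                         p.adjust.method = "bonferroni", pool.sd = TRUE)
print(check$p.value)  # compare with the pval column above

# Cross-check 2: rebuild the Sun-Fri confidence interval by hand.
model <- aov(num.transactions ~ days)
resid.df <- df.residual(model)
mse <- sum(residuals(model)^2) / resid.df  # residual mean square

group.means <- tapply(num.transactions, days, mean)
group.sizes <- tapply(num.transactions, days, length)
num.comparisons <- choose(length(group.means), 2)

diff.sun.fri <- unname(group.means["Sun"] - group.means["Fri"])
std.error <- unname(sqrt(mse * (1 / group.sizes["Sun"] +
                                1 / group.sizes["Fri"])))

# Two-sided critical value at the per-comparison level 0.05 / m.
t.crit <- qt(1 - 0.05 / (2 * num.comparisons), resid.df)
ci.sun.fri <- c(lower = diff.sun.fri - t.crit * std.error,
                upper = diff.sun.fri + t.crit * std.error)
print(ci.sun.fri)  # compare with lwr.ci and upr.ci in the Sun-Fri row
```

If the two approaches agree, the adjusted $p$-values and the hand-built interval should closely match the Sun-Fri row of the PostHocTest() output.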
Content last modified on 24 July 2023.
Opportunities
This website does not yet contain a solution for this task in any of the following software packages.
- Python
- Excel
- Julia
If you can contribute a solution using any of these pieces of software, see our Contributing page for how to help extend this website.