How to compute Fisher’s confidence intervals (in R)
Task
If we run a one-way ANOVA test and find that there is a significant difference between population means, we might want to know which means are actually different from each other. One way to do so is with Fisher’s Least Significant Difference (LSD) confidence intervals, which give a confidence interval for the difference between each pair of group means. How do we go about making these confidence intervals?
Solution
We will use some fake data for the purposes of an example, but you can replace it with your real data in the code below. Consider an ice cream shop’s sales data over several weekends.
```r
num.transactions <- c(91, 134, 98, 105, 93, 89, 145, 132, 109,
                      94, 105, 99, 84, 128, 120, 115, 118)
days <- c("Fri", "Sun", "Sun", "Sat", "Fri", "Fri", "Sat", "Sun", "Sun",
          "Fri", "Sat", "Sat", "Fri", "Sun", "Fri", "Sat", "Sun")
```
Let’s assume that you have already performed an ANOVA on this data, as shown below. (If you’re not familiar with ANOVA, see how to do a one-way ANOVA test.) We chose $\alpha$ to be 0.05.
```r
model <- aov(num.transactions ~ days)
summary(model)
```
```
            Df Sum Sq Mean Sq F value Pr(>F)
days         2   1965   982.7   4.348  0.034 *
Residuals   14   3164   226.0
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```
From the $p$-value in the `Pr(>F)` column, we can see that, at the 5% significance level, there are significant differences between the mean number of transactions at the ice cream shop across these weekend days.
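For reference, the interval that Fisher’s LSD method forms for a pair of groups $i$ and $j$ is usually written as shown here. The notation is an assumption introduced for this note, not part of the original example: $\bar{y}_i$ and $n_i$ are the sample mean and size of group $i$, $N$ is the total number of observations, $k$ is the number of groups, and $\mathrm{MSE}$ is the residual mean square from the ANOVA table.

$$
(\bar{y}_i - \bar{y}_j) \;\pm\; t_{\alpha/2,\,N-k}\,\sqrt{\mathrm{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}
$$

In this example, $N - k = 14$ and $\mathrm{MSE} \approx 226.0$, both read from the Residuals row of the ANOVA table. If the interval for a pair excludes zero, the two means differ significantly at level $\alpha$.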
We’ll use the `LSD.test` function (Least Significant Difference) from R’s `agricolae` package to get the confidence intervals for each pair of days. Let’s use $\alpha=0.05$ again so that we get 95% confidence intervals.
```r
# install.packages("agricolae") # if you have not already done so
library(agricolae)

# Second argument names the treatment factor in the fitted model
test <- LSD.test(model, "days", alpha = 0.05)
test
```
```
$statistics
   MSerror Df     Mean       CV
  226.0333 14 109.3529 13.74851

$parameters
        test p.ajusted name.t ntr alpha
  Fisher-LSD      none   days   3  0.05

$means
    num.transactions      std r      se       LCL      UCL Min Max    Q25 Q50
Fri         95.16667 12.67149 6 6.13777  82.00246 108.3309  84 120  89.50  92
Sat        113.80000 18.36301 5 6.72359  99.37933 128.2207  99 145 105.00 105
Sun        119.83333 14.23259 6 6.13777 106.66913 132.9975  98 134 111.25 123
       Q75
Fri  93.75
Sat 115.00
Sun 131.00

$comparison
NULL

$groups
    num.transactions groups
Sun        119.83333      a
Sat        113.80000     ab
Fri         95.16667      b

attr(,"class")
[1] "group"
```
The portion of this lengthy output on which to focus is the `$groups` section. If the categories share a letter in the “groups” column, then their means are not significantly different from each other. Therefore:
- Sunday and Saturday share the letter “a,” so the mean numbers of transactions on these two days are not significantly different from each other.
- The same goes for Saturday and Friday, which share the letter “b.”
- But Sunday and Friday do not share a letter, so the mean numbers of transactions on these two days are significantly different.
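The letter groupings summarize the pairwise comparisons but do not show the intervals themselves. As a minimal sketch, the pairwise intervals can be computed by hand in base R from the fitted model, assuming the usual LSD form: the difference in group means plus or minus a $t$ critical value times the pooled standard error. The variable names here (`group.means`, `day.pairs`, `lsd.intervals`, and so on) are chosen for illustration and are not part of `agricolae`.

```r
# Same example data and model as above, repeated so this block runs on its own
num.transactions <- c(91, 134, 98, 105, 93, 89, 145, 132, 109,
                      94, 105, 99, 84, 128, 120, 115, 118)
days <- c("Fri", "Sun", "Sun", "Sat", "Fri", "Fri", "Sat", "Sun", "Sun",
          "Fri", "Sat", "Sat", "Fri", "Sun", "Fri", "Sat", "Sun")
model <- aov(num.transactions ~ days)

alpha <- 0.05
group.means <- tapply(num.transactions, days, mean)    # mean per day
group.sizes <- tapply(num.transactions, days, length)  # sample size per day
df.error <- df.residual(model)                         # residual degrees of freedom
ms.error <- deviance(model) / df.error                 # residual mean square (MSE)
t.crit <- qt(1 - alpha / 2, df.error)                  # two-sided critical value

# Every unordered pair of days, one pair per column
day.pairs <- combn(names(group.means), 2)
first <- day.pairs[1, ]
second <- day.pairs[2, ]

mean.diff <- group.means[first] - group.means[second]
margin <- t.crit * sqrt(ms.error * (1 / group.sizes[first] + 1 / group.sizes[second]))

lsd.intervals <- data.frame(
  pair = paste(first, second, sep = " - "),
  difference = as.numeric(mean.diff),
  lower = as.numeric(mean.diff - margin),
  upper = as.numeric(mean.diff + margin)
)
# A pair differs significantly when its interval excludes zero
lsd.intervals$significant <- lsd.intervals$lower > 0 | lsd.intervals$upper < 0
print(lsd.intervals)
```

Only the Fri - Sun interval excludes zero, which agrees with the letter groupings above. `LSD.test` also has a `group` argument; calling it with `group = FALSE` should fill the `$comparison` element with the pairwise differences and their limits instead of the letter groupings, but check the `agricolae` documentation for your installed version.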
Content last modified on 24 July 2023.
Contributed by Krtin Juneja (KJUNEJA@falcon.bentley.edu)