How to compute Fisher’s confidence intervals (in R)

Task

If we run a one-way ANOVA test and find that there is a significant difference between population means, we might want to know which means are actually different from each other. One way to do so is with Fisher’s Least Significant Difference (LSD) confidence intervals, which form a confidence interval for the difference in means of each pair of groups. How do we go about making these confidence intervals?

Solution

We will use some fake data for the purposes of an example, but you can replace it with your real data in the code below. Consider an ice cream shop’s sales data over several weekends.

num.transactions <- c(91, 134, 98, 105, 93, 89, 145, 132, 109,
                      94, 105, 99, 84, 128, 120, 115, 118)
days <- c("Fri", "Sun", "Sun", "Sat", "Fri", "Fri", "Sat", "Sun", "Sun",
          "Fri", "Sat", "Sat", "Fri", "Sun", "Fri", "Sat", "Sun")

Let’s assume that you have already performed an ANOVA on this data, as shown below. (If you’re not familiar with ANOVA, see how to do a one-way ANOVA test.) Let’s assume that we chose $α$ to be 0.05.

model <- aov(num.transactions ~ days)
summary(model)

            Df Sum Sq Mean Sq F value Pr(>F)  
days         2   1965   982.7   4.348  0.034 *
Residuals   14   3164   226.0                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

From the $p$-value in the Pr(>F) column, we can see that, at the 5% significance level, there are significant differences between the mean number of transactions at the ice cream shop across these weekend days.
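If you would rather make that comparison in code than read it off the printed table, the following is a minimal sketch. It assumes the same fake data as above; the names alpha, anova.table, and p.value are introduced here only for illustration.

```r
# Same fake data as in the example above
num.transactions <- c(91, 134, 98, 105, 93, 89, 145, 132, 109,
                      94, 105, 99, 84, 128, 120, 115, 118)
days <- c("Fri", "Sun", "Sun", "Sat", "Fri", "Fri", "Sat", "Sun", "Sun",
          "Fri", "Sat", "Sat", "Fri", "Sun", "Fri", "Sat", "Sun")
model <- aov(num.transactions ~ days)

alpha <- 0.05                           # significance level chosen earlier
anova.table <- summary(model)[[1]]      # the ANOVA table as a data frame
p.value <- anova.table[["Pr(>F)"]][1]   # first row is the `days` term
p.value < alpha                         # TRUE means we reject equal means
```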

We’ll use the LSD.test function (Least Significant Difference) from the agricolae package to compare each pair of days. Let’s use $α = 0.05$ again so that the comparisons correspond to 95% confidence intervals.

# install.packages("agricolae") # if you have not already done so
library(agricolae)

test <- LSD.test(model, "days", alpha = 0.05)
test

$statistics
   MSerror Df     Mean       CV
  226.0333 14 109.3529 13.74851

$parameters
        test p.ajusted name.t ntr alpha
  Fisher-LSD      none   days   3  0.05

$means
    num.transactions      std r      se       LCL      UCL Min Max    Q25 Q50
Fri         95.16667 12.67149 6 6.13777  82.00246 108.3309  84 120  89.50  92
Sat        113.80000 18.36301 5 6.72359  99.37933 128.2207  99 145 105.00 105
Sun        119.83333 14.23259 6 6.13777 106.66913 132.9975  98 134 111.25 123
       Q75
Fri  93.75
Sat 115.00
Sun 131.00

$comparison
NULL

$groups
    num.transactions groups
Sun        119.83333      a
Sat        113.80000     ab
Fri         95.16667      b

attr(,"class")
[1] "group"

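The part of this lengthy output to focus on is the $groups section. (The LCL and UCL columns in the $means section are confidence limits for each day’s own mean, not for the difference between two days.) If two categories share a letter in the “groups” column, then their means are not significantly different from each other. Therefore: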

Sunday and Saturday share the letter “a,” so the mean numbers of transactions on these two days are not significantly different from each other.
The same goes for Saturday and Friday, which share the letter “b.”
But Sunday and Friday do not share a letter, so the mean numbers of transactions on these two days are significantly different.
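
The letter groups tell us which pairs differ, but they do not show the intervals themselves. As a rough sketch of the calculation, the code below builds them by hand in base R, assuming the usual Fisher LSD formula: for groups $i$ and $j$, the interval is $(\bar{x}_i - \bar{x}_j) \pm t_{α/2,\,df} \sqrt{MSE \, (1/n_i + 1/n_j)}$, where $MSE$ and $df$ come from the Residuals row of the ANOVA table. The variable names (mse, t.crit, lsd.intervals, and so on) are made up for this illustration and are not part of agricolae.

```r
# Same fake data as in the example above
num.transactions <- c(91, 134, 98, 105, 93, 89, 145, 132, 109,
                      94, 105, 99, 84, 128, 120, 115, 118)
days <- c("Fri", "Sun", "Sun", "Sat", "Fri", "Fri", "Sat", "Sun", "Sun",
          "Fri", "Sat", "Sat", "Fri", "Sun", "Fri", "Sat", "Sun")
alpha <- 0.05

model <- aov(num.transactions ~ days)
df.error <- df.residual(model)           # residual degrees of freedom
mse <- deviance(model) / df.error        # residual mean square (MSerror)
t.crit <- qt(1 - alpha / 2, df.error)    # two-sided critical t value

by.day <- split(num.transactions, days)
group.means <- sapply(by.day, mean)      # mean transactions per day
group.sizes <- lengths(by.day)           # number of observations per day

pairs <- combn(names(group.means), 2)    # each column is one pair of days
g1 <- pairs[1, ]
g2 <- pairs[2, ]

difference <- unname(group.means[g1] - group.means[g2])
std.error <- unname(sqrt(mse * (1 / group.sizes[g1] + 1 / group.sizes[g2])))

lsd.intervals <- data.frame(
  pair = paste(g1, "-", g2),
  difference = difference,
  lower = difference - t.crit * std.error,
  upper = difference + t.crit * std.error
)
# A pair differs significantly when its interval excludes zero
lsd.intervals$significant <- lsd.intervals$lower > 0 | lsd.intervals$upper < 0
print(lsd.intervals)
```

Only the Fri - Sun interval excludes zero, which agrees with the letter groups above. The agricolae documentation also lists a group argument for LSD.test; as far as we know, calling it with group = FALSE fills in the $comparison section with the pairwise differences and their confidence limits, which would be a useful cross-check on this hand calculation.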

Content last modified on 24 July 2023.

Contributed by Krtin Juneja (KJUNEJA@falcon.bentley.edu)