How to compute a confidence interval for a mean difference (matched pairs)

Description

Suppose we have two samples that are not independent of each other because they come from a matched-pairs experiment, and we want to construct a confidence interval for the mean difference between them. How do we construct this interval? Let’s assume a significance level of $α$ = 0.05, which corresponds to a confidence level of 1 − $α$ = 95%.
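
For reference, the solutions below all evaluate the usual paired $t$ interval. The formula is sketched here under the assumption that the paired differences are roughly normally distributed. The symbols $d_i$, $\bar{d}$, $s_d$, and $n$ (the number of pairs) are notation introduced for this sketch only.

```latex
\[
\bar{d} \;\pm\; t_{\alpha/2,\,n-1}\,\frac{s_d}{\sqrt{n}},
\qquad
d_i = x_i - x_i', \quad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \quad
s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2
\]
```

Here $t_{\alpha/2,\,n-1}$ is the upper $\alpha/2$ critical value of the $t$ distribution with $n-1$ degrees of freedom, and the term after the $\pm$ sign is the radius of the interval.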

Using NumPy and SciPy, in Python

We’ll use NumPy and SciPy for the computations below.

import numpy as np
from scipy import stats

This example computes a 95% confidence interval, but you can choose a different level by choosing a different value for $α$.

alpha = 0.05

We have two paired samples of data, $x_{1}, x_{2}, x_{3}, \dots, x_{k}$ and $x_{1}^{'}, x_{2}^{'}, x_{3}^{'}, \dots, x_{k}^{'}$, where $x_{i}$ and $x_{i}^{'}$ form the $i$th matched pair. The data below are made up purely as an example; replace them with your real data.

sample1 = np.array([15, 10,  7, 22, 17, 14])
sample2 = np.array([ 9,  1, 11, 13,  3,  6])

And now the computations:

diff_samples = sample1 - sample2                        # differences between the samples
n = len(sample1)                                        # number of observations per sample
diff_mean = np.mean(diff_samples)                       # mean of the differences
diff_variance = np.var( diff_samples, ddof=1 )          # variance of the differences
critical_val = stats.t.ppf(q = 1-alpha/2, df = n - 1)   # critical value
radius = critical_val*np.sqrt(diff_variance)/np.sqrt(n) # radius of confidence interval
( diff_mean - radius, diff_mean + radius )              # confidence interval

(0.7033861582274517, 13.296613841772547)

Our 95% confidence interval for the mean difference is approximately $[0.7034, 13.2966]$.
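
As a cross-check on the step-by-step computation above, the same interval can be obtained more compactly from SciPy's `stats.t.interval` together with `stats.sem`. The sketch below wraps this in a helper named `paired_mean_diff_ci`; that name and its input checks are hypothetical and are not part of NumPy or SciPy. It assumes the two samples are one-dimensional, of equal length, and listed in matching order.

```python
import numpy as np
from scipy import stats


def paired_mean_diff_ci(sample1, sample2, alpha=0.05):
    """Confidence interval for the mean of the paired differences.

    Hypothetical helper for illustration; not a NumPy or SciPy function.
    """
    a = np.asarray(sample1, dtype=float)
    b = np.asarray(sample2, dtype=float)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("matched pairs need two 1-D samples of equal length")
    diffs = a - b                       # differences between the samples
    n = diffs.size                      # number of pairs
    if n < 2:
        raise ValueError("at least two pairs are needed")
    # The first positional argument is the confidence level, 1 - alpha;
    # the second is the degrees of freedom of the t distribution.
    # stats.sem uses ddof=1 by default, so it is the sample standard
    # deviation of the differences divided by sqrt(n).
    lower, upper = stats.t.interval(1 - alpha, n - 1,
                                    loc=diffs.mean(),
                                    scale=stats.sem(diffs))
    return float(lower), float(upper)


lower, upper = paired_mean_diff_ci([15, 10, 7, 22, 17, 14],
                                   [9, 1, 11, 13, 3, 6])
print(lower, upper)  # approximately 0.7034 and 13.2966
```

The confidence level is passed positionally because the keyword name of that argument has changed between SciPy releases.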

Content last modified on 24 July 2023.

Solution, in R

sample.1 <- c(15, 10, 7, 22, 17, 14)
sample.2 <- c(9, 1, 11, 13, 3, 6)

The shortest way to create the confidence interval is with R’s t.test() function. It’s just one line of code (after we choose $α$).

alpha <- 0.05       # replace with your chosen alpha (here, a 95% confidence level)
t.test(sample.1, sample.2, paired = TRUE, conf.level = 1-alpha)

	Paired t-test

data:  sample.1 and sample.2
t = 2.8577, df = 5, p-value = 0.0355
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
  0.7033862 13.2966138
sample estimates:
mean difference 
              7 

If you need the lower and upper bounds later, you can save them as variables as follows.

conf.interval <- t.test(sample.1, sample.2, paired = TRUE, conf.level = 1-alpha)
lower.bound <- conf.interval$conf.int[1]
upper.bound <- conf.interval$conf.int[2]

It’s also possible to do the computation manually, using the code below.

diff.samples <- sample.1 - sample.2                # differences between the samples
n <- length(sample.1)                              # number of observations per sample
diff.mean <- mean(diff.samples)                    # mean of the differences
diff.variance <- var( diff.samples )               # variance of the differences
critical.val <- qt(p = alpha/2, df = n - 1,
    lower.tail=FALSE)                              # critical value
radius <- critical.val*sqrt(diff.variance)/sqrt(n) # radius of confidence interval
c( diff.mean - radius, diff.mean + radius )        # confidence interval

[1]  0.7033862 13.2966138

Either method gives the same result. Our 95% confidence interval is approximately $[0.7034, 13.2966]$.
