# How to compute a confidence interval for a mean difference (matched pairs)

## Description

Suppose we have two samples that are not independent of each other because they come from a matched-pairs experiment, and we want a confidence interval for the mean difference between them. How do we construct it? Let’s assume we’ve chosen a significance level of $\alpha = 0.05$, which corresponds to a 95% confidence level.
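
For reference, the interval computed in the solutions below has the standard paired $t$ form. The symbols $d_i$, $\bar{d}$, $s_d$, and $n$ (the number of pairs) are notation introduced here for illustration, and the formula assumes the differences are approximately normally distributed.

$$
d_i = x_i - x'_i, \qquad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(d_i - \bar{d}\right)^2
$$

$$
\bar{d} \;\pm\; t_{1-\alpha/2,\; n-1} \cdot \frac{s_d}{\sqrt{n}}
$$

Here $t_{1-\alpha/2,\; n-1}$ is the $1-\alpha/2$ quantile of the $t$ distribution with $n-1$ degrees of freedom.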

Related tasks:

- How to do a hypothesis test for a mean difference (matched pairs)
- How to compute a confidence interval for a regression coefficient
- How to compute a confidence interval for a population mean
- How to compute a confidence interval for a single population variance
- How to compute a confidence interval for the difference between two means when both population variances are known
- How to compute a confidence interval for the difference between two means when population variances are unknown
- How to compute a confidence interval for the difference between two proportions
- How to compute a confidence interval for the expected value of a response variable
- How to compute a confidence interval for the population proportion
- How to compute a confidence interval for the ratio of two population variances

## Using NumPy and SciPy, in Python

We’ll use NumPy and SciPy for the computations below.

import numpy as np
from scipy import stats

This example computes a 95% confidence interval, but you can choose a different level by choosing a different value for $\alpha$.

alpha = 0.05
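
To illustrate how the choice of $\alpha$ affects the interval, the sketch below prints the two-sided $t$ critical value for a few significance levels. The value `df = 5` is an assumption made only for this illustration (it matches six pairs of observations), and `critical_values` is a name introduced here.

```python
from scipy import stats

df = 5  # assumed degrees of freedom (6 pairs minus 1), for illustration only

# Map each significance level to its two-sided t critical value.
critical_values = {a: stats.t.ppf(1 - a / 2, df) for a in (0.10, 0.05, 0.01)}

for a, crit in critical_values.items():
    print(f"alpha = {a:.2f}: {1 - a:.0%} confidence, critical value = {crit:.4f}")
```

A smaller $\alpha$ gives a larger critical value and therefore a wider interval.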

We have two samples of data, $x_1, x_2, x_3, \ldots, x_k$ and $x'_1, x'_2, x'_3, \ldots, x'_k$. We’re going to use some fake data below just as an example; replace it with your real data.

sample1 = np.array([15, 10, 7, 22, 17, 14])
sample2 = np.array([ 9, 1, 11, 13, 3, 6])

And now the computations:

diff_samples = sample1 - sample2  # differences between the samples
n = len(sample1)  # number of observations per sample
diff_mean = np.mean(diff_samples)  # mean of the differences
diff_variance = np.var(diff_samples, ddof=1)  # sample variance of the differences
critical_val = stats.t.ppf(q=1 - alpha/2, df=n - 1)  # critical value
radius = critical_val * np.sqrt(diff_variance) / np.sqrt(n)  # radius of confidence interval
(diff_mean - radius, diff_mean + radius)  # confidence interval

(0.7033861582274517, 13.296613841772547)

Our 95% confidence interval for the mean difference is $[0.7034, 13.2966]$.
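
As a cross-check, SciPy’s `stats.t.interval` can produce the same interval directly from the mean and standard error of the differences. The sketch below wraps that call in a helper named `paired_mean_diff_ci`, a hypothetical name introduced here rather than part of NumPy or SciPy. It repeats the example data so that it runs on its own.

```python
import numpy as np
from scipy import stats


def paired_mean_diff_ci(first, second, alpha=0.05):
    """Return (lower, upper) for the mean of the paired differences.

    Hypothetical helper for illustration; not part of NumPy or SciPy.
    """
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    if first.shape != second.shape:
        raise ValueError("matched pairs require samples of equal length")
    diffs = first - second
    # stats.sem uses ddof=1 by default, so the scale is s_d / sqrt(n).
    lower, upper = stats.t.interval(
        1 - alpha, len(diffs) - 1, loc=diffs.mean(), scale=stats.sem(diffs)
    )
    return float(lower), float(upper)


lower, upper = paired_mean_diff_ci([15, 10, 7, 22, 17, 14], [9, 1, 11, 13, 3, 6])
print(lower, upper)
```

The confidence level is passed positionally because the name of that parameter has differed between SciPy releases.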

Content last modified on 24 July 2023.

## Solution, in R

We have two samples of data, $x_1, x_2, x_3, \ldots, x_k$ and $x'_1, x'_2, x'_3, \ldots, x'_k$. We’re going to use some fake data below just as an example; replace it with your real data.

sample.1 <- c(15, 10, 7, 22, 17, 14)
sample.2 <- c(9, 1, 11, 13, 3, 6)

The shortest way to create the confidence interval is with R’s `t.test()` function. It’s just one line of code (after we choose $\alpha$).

alpha <- 0.05 # replace with your chosen alpha (here, a 95% confidence level)
t.test(sample.1, sample.2, paired = TRUE, conf.level = 1-alpha)

Paired t-test
data: sample.1 and sample.2
t = 2.8577, df = 5, p-value = 0.0355
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
0.7033862 13.2966138
sample estimates:
mean difference
7

If you need the lower and upper bounds later, you can save them as variables as follows.

conf.interval <- t.test(sample.1, sample.2, paired = TRUE, conf.level = 1-alpha)
lower.bound <- conf.interval$conf.int[1]
upper.bound <- conf.interval$conf.int[2]

It’s also possible to do the computation manually, using the code below.

diff.samples <- sample.1 - sample.2 # differences between the samples
n <- length(sample.1) # number of observations per sample
diff.mean <- mean(diff.samples) # mean of the differences
diff.variance <- var(diff.samples) # sample variance of the differences
critical.val <- qt(p = alpha/2, df = n - 1, lower.tail = FALSE) # critical value
radius <- critical.val*sqrt(diff.variance)/sqrt(n) # radius of confidence interval
c(diff.mean - radius, diff.mean + radius) # confidence interval

[1] 0.7033862 13.2966138

Either method gives the same result. Our 95% confidence interval for the mean difference is $[0.7034, 13.2966]$.

## Opportunities

This website does not yet contain a solution for this task in any of the following software packages.

- Excel
- Julia

If you can contribute a solution using any of these pieces of software, see our Contributing page for how to help extend this website.