How to do a hypothesis test for a mean difference (matched pairs)
Description
Say we have two sets of data that are not independent of each other and come from a matched-pairs experiment, $(x_1,x'_1),(x_2,x'_2),\ldots,(x_n,x'_n)$. We want to perform inference on the mean of the differences between these two samples, that is, the mean of $x_1-x'_1,x_2-x'_2,\ldots,x_n-x'_n$, called $\mu_D$. We want to determine if it is significantly different from, greater than, or less than zero (or any other hypothesized value). We can do so with a two-tailed, right-tailed, or left-tailed hypothesis test for matched pairs.
Related tasks:
- How to compute a confidence interval for a mean difference (matched pairs)
- How to do a hypothesis test for a population proportion
- How to do a hypothesis test for population variance
- How to do a hypothesis test for the difference between means when both population variances are known
- How to do a hypothesis test for the difference between two proportions
- How to do a hypothesis test for the mean with known standard deviation
- How to do a hypothesis test for the ratio of two population variances
- How to do a hypothesis test of a coefficient’s significance
- How to do a one-sided hypothesis test for two sample means
- How to do a two-sided hypothesis test for a sample mean
- How to do a two-sided hypothesis test for two sample means
Using SciPy, in Python
We choose a value, $0 \le \alpha \le 1$, as the Type I Error rate, and in this case we will set it to be 0.05.
We’re going to use fake data here, but you can replace it with your real data below. Because the data are matched pairs, the two samples must be the same size.
# Replace the following example data with your real data
sample1 = [15, 10, 7, 22, 17, 14]
sample2 = [ 9, 1, 11, 13, 3, 6]
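Since the test is really about the pairwise differences, it can help to compute them once and confirm that the samples line up. The following is a minimal sketch, assuming the same example data; the variable name differences is our own choice for illustration.

```python
# Same example data as above
sample1 = [15, 10, 7, 22, 17, 14]
sample2 = [ 9,  1, 11, 13,  3,  6]

# Matched pairs require exactly one observation in each sample per pair
if len(sample1) != len(sample2):
    raise ValueError("Matched-pairs samples must have the same length")

# Pairwise differences x_i - x'_i; the hypothesis test concerns their mean
differences = [x - y for x, y in zip(sample1, sample2)]
print(differences)  # [6, 9, -4, 9, 14, 8]
```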
Two-tailed test
In a two-sided hypothesis test, the null hypothesis states that the mean difference is equal to 0 (or some other hypothesized value), $H_0: \mu_D = 0$.
from scipy import stats
stats.ttest_rel(sample1, sample2, alternative = "two-sided")
TtestResult(statistic=2.8577380332470415, pvalue=0.03550038112896236, df=5)
Our $p$-value, 0.0355, is smaller than $\alpha$, so we have sufficient evidence to reject the null hypothesis and conclude that the mean difference between the two samples is significantly different from zero.
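To see where the test statistic comes from, we can compute it by hand. A paired $t$-test is a one-sample $t$-test on the differences, with statistic $t = \bar{d} / (s_D / \sqrt{n})$ and $n-1$ degrees of freedom, where $\bar{d}$ and $s_D$ are the sample mean and sample standard deviation of the differences. The sketch below uses only the Python standard library and assumes the same example data; the variable names are ours.

```python
import math
import statistics

sample1 = [15, 10, 7, 22, 17, 14]
sample2 = [ 9,  1, 11, 13,  3,  6]

# Pairwise differences between the matched observations
differences = [x - y for x, y in zip(sample1, sample2)]
n = len(differences)

mean_diff = statistics.mean(differences)   # sample mean of the differences
sd_diff = statistics.stdev(differences)    # sample standard deviation (n - 1 denominator)
standard_error = sd_diff / math.sqrt(n)    # standard error of the mean difference

# Test statistic for H0: mu_D = 0, with n - 1 degrees of freedom
t_statistic = mean_diff / standard_error
print(mean_diff, sd_diff, n - 1, t_statistic)
```

For this data the mean difference is 7 and the standard deviation is 6, so the statistic should agree with the value of about 2.8577 reported by ttest_rel above.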
Note that the function above specifically tests whether the mean of $x_i-x'_i$ is zero. If we want instead to test whether it is some other value $d\neq0$, then that’s equivalent to testing whether the mean of $(x_i-d)-x'_i$ is zero. We could do so with the code below, which uses an example value of $d$. The null hypothesis is now $H_0: \mu_D=d$.
d = 6 # as an example
stats.ttest_rel([ x - d for x in sample1 ], sample2, alternative = "two-sided")
TtestResult(statistic=0.4082482904638631, pvalue=0.6999865427788738, df=5)
The above $p$-value is greater than $\alpha=0.05$, so we could not conclude that the mean difference is significantly different from our chosen $d=6$.
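An alternative way to express the same test, shown here only as a sketch, is to run a one-sample $t$-test on the differences and pass $d$ as the hypothesized mean. This assumes a SciPy version in which stats.ttest_1samp accepts the alternative argument, and it reuses the example data and $d=6$ from above.

```python
from scipy import stats

sample1 = [15, 10, 7, 22, 17, 14]
sample2 = [ 9,  1, 11, 13,  3,  6]
d = 6  # hypothesized mean difference, as in the example above

# Pairwise differences between the matched observations
differences = [x - y for x, y in zip(sample1, sample2)]

# One-sample t-test of H0: mu_D = d on the differences
result = stats.ttest_1samp(differences, popmean=d, alternative="two-sided")
print(result.statistic, result.pvalue)
```

The statistic and $p$-value should match those from the shifted ttest_rel call above, since both test whether the mean of the differences equals $d$.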
Right-tailed test
If instead we want to test whether the mean difference is less than or equal to zero, $H_0: \mu_D\le0$, we can use a right-tailed test, as follows.
stats.ttest_rel(sample1, sample2, alternative = "greater")
TtestResult(statistic=2.8577380332470415, pvalue=0.01775019056448118, df=5)
Our $p$-value, 0.01775, is smaller than $\alpha$, so we have sufficient evidence to reject the null hypothesis and conclude that the mean difference between the two samples is significantly greater than zero.
A similar change could be made to the code above to test $H_0:\mu_D\le d$, as in the example code further above that uses $d=6$.
Left-tailed test
If instead we want to test whether the mean difference is greater than or equal to zero, $H_0: \mu_D\ge 0$, we can use a left-tailed test, as follows.
stats.ttest_rel(sample1, sample2, alternative = "less")
TtestResult(statistic=2.8577380332470415, pvalue=0.9822498094355188, df=5)
Our $p$-value, 0.98225, is larger than $\alpha$, so we do not have sufficient evidence to reject the null hypothesis; we must continue to assume that the mean difference between the two samples is greater than or equal to zero.
A similar change could be made to the code above to test $H_0:\mu_D\ge d$, as in the example code further above that uses $d=6$.
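The three $p$-values above are closely related, which explains why the left-tailed $p$-value is so large when the right-tailed one is small. Because the $t$ distribution is continuous and symmetric, the two one-tailed $p$-values sum to 1, and the two-tailed $p$-value is twice the smaller of them. The following sketch checks this numerically on the example data; the variable names are ours.

```python
from scipy import stats

sample1 = [15, 10, 7, 22, 17, 14]
sample2 = [ 9,  1, 11, 13,  3,  6]

# Run the same paired test under each of the three alternatives
two_sided = stats.ttest_rel(sample1, sample2, alternative="two-sided")
right = stats.ttest_rel(sample1, sample2, alternative="greater")
left = stats.ttest_rel(sample1, sample2, alternative="less")

# The one-tailed p-values are complementary
print(right.pvalue + left.pvalue)

# The two-tailed p-value is twice the smaller one-tailed p-value
print(2 * min(right.pvalue, left.pvalue), two_sided.pvalue)
```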
Content last modified on 24 July 2023.
Solution, in R
We choose a value, $0 \le \alpha \le 1$, as the Type I Error rate, and in this case we will set it to be 0.05.
We’re going to use fake data here, but you can replace it with your real data below. Because the data are matched pairs, the two samples must be the same size.
# Replace the following example data with your real data
sample.1 <- c(15, 10, 7, 22, 17, 14)
sample.2 <- c( 9, 1, 11, 13, 3, 6)
Two-tailed test
In a two-sided hypothesis test, the null hypothesis states that the mean difference is equal to 0 (or some other hypothesized value), $H_0: \mu_D = 0$.
alpha = 0.05
t.test(sample.1, sample.2, alternative = "two.sided",
       mu = 0, paired = TRUE, conf.level = 1-alpha)
Paired t-test
data: sample.1 and sample.2
t = 2.8577, df = 5, p-value = 0.0355
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
0.7033862 13.2966138
sample estimates:
mean difference
7
Our $p$-value, 0.0355, appears in the third line of the output. It is smaller than $\alpha$, so we have sufficient evidence to reject the null hypothesis and conclude that the mean difference between the two samples is significantly different from zero.
If we want instead to test whether it is some other value $d\neq0$, then just use that value as the mu parameter to the t.test function instead of zero.
Right-tailed test
If instead we want to test whether the mean difference is less than or equal to zero, $H_0: \mu_D\le0$, we can use a right-tailed test, as follows.
t.test(sample.1, sample.2, alternative = "greater",
       mu = 0, paired = TRUE, conf.level = 1-alpha)
Paired t-test
data: sample.1 and sample.2
t = 2.8577, df = 5, p-value = 0.01775
alternative hypothesis: true mean difference is greater than 0
95 percent confidence interval:
2.06416 Inf
sample estimates:
mean difference
7
Our $p$-value, 0.01775, is smaller than $\alpha$, so we have sufficient evidence to reject the null hypothesis and conclude that the mean difference between the two samples is significantly greater than zero.
Again, you can use another value $d\neq0$ in place of mu = 0 in the code.
Left-tailed test
If instead we want to test whether the mean difference is greater than or equal to zero, $H_0: \mu_D\ge0$, we can use a left-tailed test, as follows.
t.test(sample.1, sample.2, alternative = "less",
       mu = 0, paired = TRUE, conf.level = 1-alpha)
Paired t-test
data: sample.1 and sample.2
t = 2.8577, df = 5, p-value = 0.9822
alternative hypothesis: true mean difference is less than 0
95 percent confidence interval:
-Inf 11.93584
sample estimates:
mean difference
7
Our $p$-value, 0.9822, is larger than $\alpha$, so we do not have sufficient evidence to reject the null hypothesis; we must continue to assume that the mean difference between the two samples is greater than or equal to zero.
Again, you can use another value $d\neq0$ in place of mu = 0 in the code.