How to do a two-sided hypothesis test for two sample means

Description

If we have two samples, $x_{1}, \dots, x_{n}$ and $x_{1}^{'}, \dots, x_{m}^{'}$, and we compute the mean of each one, we might want to ask whether the two means seem approximately equal. Or more precisely, is their difference statistically significant at a given level?

Solution, in Julia

If we call the mean of the first sample ${\bar{x}}_{1}$ and the mean of the second sample ${\bar{x}}_{2}$, then this is a two-sided test with the null hypothesis $H_{0} : {\bar{x}}_{1} = {\bar{x}}_{2}$ (strictly speaking, a claim that the populations behind the two samples have equal means). We choose a value $0 \leq α \leq 1$ as the probability of a Type I error (a false positive: rejecting $H_{0}$ when it is actually true).

# Replace these first three lines with the values from your situation.
alpha = 0.10
sample1 = [ 6, 9, 7, 10, 10, 9 ]
sample2 = [ 12, 14, 10, 17, 9 ]

# Run a two-sample (Welch) t-test and print out alpha, the p value,
# and whether the comparison says to reject the null hypothesis.
using HypothesisTests
p_value = pvalue( UnequalVarianceTTest( sample1, sample2 ) )
reject_H0 = p_value < alpha
alpha, p_value, reject_H0

(0.1, 0.050972837418476996, true)

In this case, the $p$-value was less than $α$, so the samples give us enough evidence to reject the null hypothesis at the $α = 0.10$ level. The data suggest that ${\bar{x}}_{1} \neq {\bar{x}}_{2}$.

When you are using the most common value for $α$, which is $0.05$ (corresponding to a $95\%$ confidence interval), you can simply print the test object itself to get a detailed report with all the information you need, saving a few lines of code. Note that the conclusion below differs from the one above: above we chose $α = 0.10$, but the printout uses the default $α = 0.05$, and the $p$-value of about $0.051$ falls between the two.

UnequalVarianceTTest( sample1, sample2 )

Two sample t-test (unequal variance)
------------------------------------
Population details:
    parameter of interest:   Mean difference
    value under h_0:         0
    point estimate:          -3.9
    95% confidence interval: (-7.823, 0.02309)

Test summary:
    outcome with 95% confidence: fail to reject h_0
    two-sided p-value:           0.0510

Details:
    number of observations:   [6,5]
    t-statistic:              -2.4616581720814326
    degrees of freedom:       5.720083530052662
    empirical standard error: 1.584297951775486

Here we did not assume that the two samples had equal variance. If in your case they do, you can use EqualVarianceTTest() instead of UnequalVarianceTTest().
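As a rough check on the Details section of the printout, the standard Welch formulas can be worked by hand for these two samples. The symbols $s_{1}^{2}$ and $s_{2}^{2}$ (the sample variances, here $2.7$ and $10.3$) and $\nu$ (the degrees of freedom) are notation introduced only for this illustration; $n = 6$ and $m = 5$ are the sample sizes.

$$
t = \frac{{\bar{x}}_{1} - {\bar{x}}_{2}}{\sqrt{s_{1}^{2}/n + s_{2}^{2}/m}} = \frac{8.5 - 12.4}{\sqrt{2.7/6 + 10.3/5}} = \frac{-3.9}{\sqrt{2.51}} \approx -2.4617
$$

$$
\nu = \frac{\left( s_{1}^{2}/n + s_{2}^{2}/m \right)^{2}}{\dfrac{\left( s_{1}^{2}/n \right)^{2}}{n - 1} + \dfrac{\left( s_{2}^{2}/m \right)^{2}}{m - 1}} = \frac{2.51^{2}}{0.45^{2}/5 + 2.06^{2}/4} \approx 5.72
$$

The denominator $\sqrt{2.51} \approx 1.5843$ is the empirical standard error shown in the printout.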

Content last modified on 24 July 2023.

Using SciPy, in Python

from scipy import stats

# Replace these first three lines with the values from your situation.
alpha = 0.10
sample1 = [ 6, 9, 7, 10, 10, 9 ]
sample2 = [ 12, 14, 10, 17, 9 ]

# Run a two-sample (Welch) t-test. The result holds the test statistic
# and the p value, which we then compare with alpha ourselves.
stats.ttest_ind( sample1, sample2, equal_var=False )

Ttest_indResult(statistic=-2.4616581720814326, pvalue=0.05097283741847698)

The output says that the $p$-value is about $0.05097$, which is less than $α = 0.10$. In this case, the samples give us enough evidence to reject the null hypothesis at the $α = 0.10$ level. That is, the data suggest that ${\bar{x}}_{1} \neq {\bar{x}}_{2}$.
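The comparison with $α$ can also be done in code rather than by eye. The following is a minimal sketch, assuming SciPy is installed; the variable names `p_value` and `reject_H0` are chosen here for illustration and are not part of SciPy.

```python
from scipy import stats

alpha = 0.10
sample1 = [6, 9, 7, 10, 10, 9]
sample2 = [12, 14, 10, 17, 9]

# Welch's two-sample t-test (equal_var=False), the same call as above.
result = stats.ttest_ind(sample1, sample2, equal_var=False)

# Convert to a plain Python float so the printed tuple is easy to read.
p_value = float(result.pvalue)

# Reject the null hypothesis when the p value falls below alpha.
reject_H0 = p_value < alpha

print((alpha, p_value, reject_H0))
```

With the samples above, `reject_H0` comes out true, in line with the manual comparison.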

Passing equal_var=False tells SciPy not to assume that the two samples have equal variances. If in your case they do, you can omit that parameter, and it will revert to its default value of True.
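To see what the variance assumption changes, here is a rough sketch that computes both versions of the test statistic by hand, using only the Python standard library. The helper names `welch_t` and `pooled_t` are hypothetical, introduced just for this illustration; the formulas are the textbook Welch and pooled-variance ones, not a view into SciPy's internals.

```python
from math import sqrt
from statistics import mean, variance  # variance() is the sample variance

sample1 = [6, 9, 7, 10, 10, 9]
sample2 = [12, 14, 10, 17, 9]

def welch_t(a, b):
    """t-statistic and degrees of freedom without assuming equal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def pooled_t(a, b):
    """t-statistic and degrees of freedom assuming equal variances."""
    df = len(a) + len(b) - 2
    pooled_var = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return t, df

welch_stat, welch_df = welch_t(sample1, sample2)
pooled_stat, pooled_df = pooled_t(sample1, sample2)
print("Welch: ", welch_stat, welch_df)
print("pooled:", pooled_stat, pooled_df)
```

The Welch statistic should agree with the one SciPy reports above, while the pooled version uses $n + m - 2 = 9$ degrees of freedom and a different standard error, so the two tests can give different $p$-values on the same data.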

Solution, in R

# Replace these first three lines with the values from your situation.
alpha <- 0.10
sample1 <- c( 6, 9, 7, 10, 10, 9 )
sample2 <- c( 12, 14, 10, 17, 9 )

# Run a two-sample (Welch) t-test and print the full report, including
# the p value and a confidence interval at the 1-alpha level.
t.test( sample1, sample2, conf.level=1-alpha )

	Welch Two Sample t-test

data:  sample1 and sample2
t = -2.4617, df = 5.7201, p-value = 0.05097
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
 -7.0057683 -0.7942317
sample estimates:
mean of x mean of y 
      8.5      12.4 

Although we can deduce the answer to our question from the above output, by comparing the $p$ value with $α$ manually, we can also ask R to do it.

# Is there enough evidence to reject the null hypothesis?
result <- t.test( sample1, sample2, conf.level=1-alpha )
result$p.value < alpha

[1] TRUE

In this case, the samples give us enough evidence to reject the null hypothesis at the $α = 0.10$ level. The data suggest that ${\bar{x}}_{1} \neq {\bar{x}}_{2}$.

Here we did not assume that the two samples had equal variance. If in your case they do, you can pass the parameter var.equal=TRUE to t.test.
