How to do a hypothesis test for a population proportion
Description
When we have qualitative data, we’re often interested in performing inference on
population proportions, that is, the proportion (between 0.0 and 1.0) of the
population that falls in a certain category of a qualitative variable. Given a
sample proportion and the sample size, how can we test a hypothesis about the
population proportion?
Related tasks:
- How to compute a confidence interval for the population proportion
- How to do a hypothesis test for a mean difference (matched pairs)
- How to do a hypothesis test for the difference between means when both population variances are known
- How to do a hypothesis test for the difference between two proportions
- How to do a hypothesis test for the mean with known standard deviation
- How to do a hypothesis test for the ratio of two population variances
- How to do a hypothesis test of a coefficient’s significance
- How to do a one-sided hypothesis test for two sample means
- How to do a two-sided hypothesis test for a sample mean
- How to do a two-sided hypothesis test for two sample means
Using SciPy, in Python
We’re going to use fake data here for illustrative purposes, but you can replace our fake data with your real data in the code below.
Let’s say that we’ve hypothesized that about one-third of Bostonians are unhappy with the Red Sox’ performance. To test this hypothesis, we surveyed 460 Bostonians and found that 76 of them were unhappy with the Red Sox’ performance.
We summarize this situation with the following variables.
We will do a test at a chosen Type I error rate (significance level), α.
n = 460 # Number of respondents in sample
x = 76 # Number of respondents in chosen subset
sample_prop = x/n # Proportion of sample in chosen subset
population_prop = 1/3 # Hypothesized population proportion
Two-tailed test
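A two-tailed test is for the null hypothesis H0: p = 1/3, where p is the population proportion, against the alternative that p differs from 1/3. We compute the test statistic and get the p-value from the normal distribution in SciPy’s stats
module.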
import numpy as np
from scipy import stats
test_stat = ( (sample_prop - population_prop) /
np.sqrt(population_prop*(1 - population_prop)/n) )
stats.norm.sf(abs(test_stat))*2 # p-value
2.0284218907806657e-14
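The p-value is far below any conventional Type I error rate, so we reject the null hypothesis: the data are not consistent with one-third of Bostonians being unhappy with the Red Sox’ performance.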
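The decision step can also be written out in code. The following is a minimal sketch that repeats the calculation above and compares the p-value to a Type I error rate; the value alpha = 0.05 is an assumption made for illustration, not a value taken from this page.

```python
import numpy as np
from scipy import stats

n = 460                  # Number of respondents in sample
x = 76                   # Number of respondents in chosen subset
population_prop = 1/3    # Hypothesized population proportion
alpha = 0.05             # ASSUMED Type I error rate, for illustration only

sample_prop = x / n
# Standard error of the sample proportion under the null hypothesis
std_error = np.sqrt(population_prop * (1 - population_prop) / n)
test_stat = (sample_prop - population_prop) / std_error

# Two-tailed p-value from the standard normal distribution
p_value = 2 * stats.norm.sf(abs(test_stat))

reject_null = p_value < alpha
print("p-value:", p_value)
print("Reject the null hypothesis?", reject_null)
```

With a different choice of alpha, only the last comparison changes; the test statistic and p-value stay the same.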
Right-tailed test
A right-tailed test is for the null hypothesis H0: p ≤ 1/3, against the alternative that p is greater than 1/3.
import numpy as np
from scipy import stats
test_stat = ( (sample_prop - population_prop) /
np.sqrt(population_prop*(1 - population_prop)/n) )
stats.norm.sf(test_stat)
0.9999999999999899
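The p-value is essentially 1, so we fail to reject the null hypothesis: the sample gives no evidence that more than one-third of Bostonians are unhappy.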
Left-tailed test
A left-tailed test is for the null hypothesis H0: p ≥ 1/3, against the alternative that p is less than 1/3.
import numpy as np
from scipy import stats
test_stat = ( (sample_prop - population_prop) /
np.sqrt(population_prop*(1 - population_prop)/n) )
stats.norm.cdf(test_stat) # p-value: area in the left tail
1.0142109453903328e-14
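The p-value is far below any conventional Type I error rate, so we reject the null hypothesis in favor of the alternative that fewer than one-third of Bostonians are unhappy.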
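The three tests above differ only in which tail of the normal distribution supplies the p-value. As a sketch, they can be gathered into one helper; the function name proportion_z_test and its alternative argument are hypothetical names chosen here and are not part of SciPy.

```python
import numpy as np
from scipy import stats

def proportion_z_test(x, n, p0, alternative="two-sided"):
    """One-sample z-test for a proportion (normal approximation).

    Hypothetical helper for illustration; not a SciPy function.
    Returns the test statistic and the p-value.
    """
    sample_prop = x / n
    std_error = np.sqrt(p0 * (1 - p0) / n)   # standard error under H0
    z = (sample_prop - p0) / std_error
    if alternative == "two-sided":
        p_value = 2 * stats.norm.sf(abs(z))  # both tails
    elif alternative == "greater":
        p_value = stats.norm.sf(z)           # right tail
    elif alternative == "less":
        p_value = stats.norm.cdf(z)          # left tail
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p_value

# Same fake survey data as above
results = {alt: proportion_z_test(76, 460, 1/3, alt)
           for alt in ("two-sided", "greater", "less")}
for alt, (z, p) in results.items():
    print(alt, z, p)
```

The left-tailed branch uses the cumulative distribution function directly, so it gives the correct tail area whether the test statistic is negative or positive.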
Content last modified on 24 July 2023.
Solution, in R
We’re going to use fake data here for illustrative purposes, but you can replace our fake data with your real data in the code below.
Let’s say that we’ve hypothesized that about one-third of Bostonians are unhappy with the Red Sox’ performance. To test this hypothesis, we surveyed 460 Bostonians and found that 76 of them were unhappy with the Red Sox’ performance.
We summarize this situation with the following variables.
We will do a test at a chosen Type I error rate (significance level), α.
n <- 460 # Number of respondents in sample
x <- 76 # Number of respondents in chosen subset
population_prop <- 1/3 # Hypothesized population proportion
Two-tailed test
A two-tailed test is for the null hypothesis H0: p = 1/3, where p is the population proportion. We can use R’s prop.test()
function and provide it the data from above,
requesting a two-tailed test.
prop.test(x = x, n = n, p = population_prop, alternative = "two.sided")
1-sample proportions test with continuity correction
data: x out of n, null probability population_prop
X-squared = 57.75, df = 1, p-value = 2.976e-14
alternative hypothesis: true p is not equal to 0.3333333
95 percent confidence interval:
0.1330899 0.2030664
sample estimates:
p
0.1652174
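The p-value is far below any conventional Type I error rate, so we reject the null hypothesis.
R also has a binom.test()
function that takes the same arguments and performs an exact binomial test.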
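The output above says "with continuity correction", which is why its p-value is slightly larger than one computed from an uncorrected z statistic. As a rough cross-check, the sketch below recomputes the statistic by hand in Python, assuming the correction is the usual Yates correction of 0.5 applied to the difference between the observed and expected counts.

```python
from scipy import stats

n = 460                  # Number of respondents in sample
x = 76                   # Number of respondents in chosen subset
population_prop = 1/3    # Hypothesized population proportion

expected = n * population_prop             # expected count under H0
yates = min(0.5, abs(x - expected))        # assumed Yates continuity correction

# Chi-squared statistic with one degree of freedom
x_squared = ((abs(x - expected) - yates) ** 2
             / (n * population_prop * (1 - population_prop)))

# Upper-tail chi-squared probability gives the two-sided p-value
p_value = stats.chi2.sf(x_squared, 1)

print(round(x_squared, 2))   # about 57.75, as in the R output
print(p_value)
```

Dropping the correction (setting yates to 0) gives the square of the uncorrected z statistic instead.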
Right-tailed test
A right-tailed test is for the null hypothesis H0: p ≤ 1/3. We can use R’s prop.test()
function and provide it the data from above,
requesting a right-tailed test.
prop.test(x = x, n = n, p = population_prop, alternative = "greater")
1-sample proportions test with continuity correction
data: x out of n, null probability population_prop
X-squared = 57.75, df = 1, p-value = 1
alternative hypothesis: true p is greater than 0.3333333
95 percent confidence interval:
0.1377034 1.0000000
sample estimates:
p
0.1652174
The p-value is essentially 1, so we fail to reject the null hypothesis.
Again, binom.test()
takes the same arguments.
Left-tailed test
A left-tailed test is for the null hypothesis H0: p ≥ 1/3. We can use R’s prop.test()
function and provide it the data from above,
requesting a left-tailed test.
prop.test(x = x, n = n, p = population_prop, alternative = "less")
1-sample proportions test with continuity correction
data: x out of n, null probability population_prop
X-squared = 57.75, df = 1, p-value = 1.488e-14
alternative hypothesis: true p is less than 0.3333333
95 percent confidence interval:
0.0000000 0.1967951
sample estimates:
p
0.1652174
The p-value is far below any conventional Type I error rate, so we reject the null hypothesis.
Again, binom.test()
takes the same arguments.
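For comparison with the Python solution, SciPy offers an exact binomial test as well. The sketch below assumes SciPy 1.7 or later, where scipy.stats.binomtest is available, and runs all three alternatives on the same fake data.

```python
from scipy import stats

n = 460                  # Number of respondents in sample
x = 76                   # Number of respondents in chosen subset
population_prop = 1/3    # Hypothesized population proportion

# Exact binomial p-values for each alternative hypothesis
exact_p_values = {
    alt: stats.binomtest(x, n, p=population_prop, alternative=alt).pvalue
    for alt in ("two-sided", "greater", "less")
}
for alt, p in exact_p_values.items():
    print(f"{alt}: {p:.3g}")
```

An exact test does not rely on the normal approximation, so it is a reasonable check when the sample is small or the hypothesized proportion is close to 0 or 1.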