# How to do a hypothesis test for population variance

## Description

Assume we want to estimate the variability of a quantity across a population, starting from a sample of data, $x_1, x_2, x_3, \ldots x_k$. How might we test whether the population variance is equal to, greater than, or less than a hypothesized value?
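
The standard tool for this question is the chi-square test for a single variance, which assumes the data are drawn from an approximately normal population. As a sketch of the underlying formulas, write $s^2$ for the sample variance and $\sigma_0^2$ for the hypothesized variance (both symbols are notation introduced here, not taken from the text above). The test statistic is

$$\chi^2 = \frac{(k-1)\,s^2}{\sigma_0^2},$$

which follows a $\chi^2$ distribution with $k-1$ degrees of freedom when $\sigma^2 = \sigma_0^2$. If $X$ denotes a random variable with that distribution, the $p$-values are

$$p_{\text{left}} = P(X \le \chi^2), \qquad p_{\text{right}} = P(X \ge \chi^2), \qquad p_{\text{two-tailed}} = 2\min\left(p_{\text{left}},\, p_{\text{right}}\right).$$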

## Solution, in R


We’ll use R’s dataset EuStockMarkets to do an example. This dataset has information on the daily closing prices of 4 European stock indices. We’re going to look at the variability of Germany’s DAX closing prices.

Let’s load the dataset. (See how to quickly load some sample data.) If using your own data, place it into the values variable instead of using the code below.

```r
# The datasets package ships with base R, so no installation is needed
library(datasets)
EuStockMarkets <- data.frame(EuStockMarkets)
values <- EuStockMarkets$DAX
```

### Two-tailed test

We may ask whether the population variance is significantly different from a hypothesized value. Let’s test against a variance of 1,000,000. Our null hypothesis states that the population variance is equal to 1,000,000, $H_0: \sigma^2 = 1,000,000$.

We calculate the test statistic and $p$-value as follows, using a $\chi^2$ distribution. We can use any $\alpha$ between 0.0 and 1.0 as our Type I Error Rate; we will use $\alpha=0.05$ here.

```r
hyp.var <- 1000000                          # hypothesized variance
df <- length(values) - 1                    # degrees of freedom
test.statistic <- df*var(values)/hyp.var    # test statistic
# two-tailed p-value: double the smaller of the two tail probabilities
2*min(pchisq(test.statistic, df=df, lower.tail=TRUE),
      pchisq(test.statistic, df=df, lower.tail=FALSE))
```

```
[1] 3.189769e-07
```

Our $p$-value, $3.189769\times10^{-7}$, is smaller than $\alpha$, so we have sufficient evidence to reject the null hypothesis. The variance of closing prices on Germany’s DAX is significantly different from 1,000,000.

### Left-tailed test

What if we wanted to determine if the population variance were significantly less than 1,000,000? Our null hypothesis would therefore be $H_0: \sigma^2 \ge 1,000,000$.

The computations are very similar to the previous case, but with a different formula for the $p$-value. We repeat the code that’s in common, for ease of use when copying and pasting.

```r
hyp.var <- 1000000                              # hypothesized variance
df <- length(values) - 1                        # degrees of freedom
test.statistic <- df*var(values)/hyp.var        # test statistic
pchisq(test.statistic, df=df, lower.tail=TRUE)  # left-tailed p-value
```

```
[1] 0.9999998
```

Our $p$-value, 0.9999998, is greater than $\alpha$, so we do not have sufficient evidence to reject the null hypothesis. We should continue to assume that the variance of closing prices on Germany’s DAX is greater than or equal to 1,000,000.

### Right-tailed test

What if we wanted to determine if the population variance were significantly greater than 1,000,000? Our null hypothesis would therefore be $H_0: \sigma^2 \le 1,000,000$.

The computations are very similar to the previous case, but with a different formula for the $p$-value. We repeat the code that’s in common, for ease of use when copying and pasting.

```r
hyp.var <- 1000000                               # hypothesized variance
df <- length(values) - 1                         # degrees of freedom
test.statistic <- df*var(values)/hyp.var         # test statistic
pchisq(test.statistic, df=df, lower.tail=FALSE)  # right-tailed p-value
```

```
[1] 1.594884e-07
```

Our $p$-value, $1.594884\times10^{-7}$, is smaller than $\alpha$, so we have sufficient evidence to reject the null hypothesis. We conclude that the variance of closing prices on Germany’s DAX is significantly greater than 1,000,000.
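
Since the three tests share everything except the tail used for the $p$-value, they can be gathered into one function. The following is a minimal sketch, not a built-in: `variance_test` and its `alternative` argument are hypothetical names chosen here for illustration (the argument values mirror the convention used by R's `t.test`), and the small sample is made-up data. The two-tailed case doubles the smaller tail probability so that it also behaves sensibly when the test statistic falls in the lower tail.

```r
# Hypothetical helper (not part of base R): chi-square test for one variance.
# Assumes x is a numeric vector drawn from an approximately normal population.
variance_test <- function(x, hyp.var,
                          alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  df <- length(x) - 1                        # degrees of freedom
  test.statistic <- df * var(x) / hyp.var    # chi-square test statistic
  lower <- pchisq(test.statistic, df = df, lower.tail = TRUE)
  upper <- pchisq(test.statistic, df = df, lower.tail = FALSE)
  p.value <- switch(alternative,
    two.sided = min(1, 2 * min(lower, upper)),  # double the smaller tail
    less      = lower,                          # left-tailed
    greater   = upper                           # right-tailed
  )
  list(statistic = test.statistic, df = df, p.value = p.value)
}

# Example on a small made-up sample (sample variance is 32/7)
sample.values <- c(2, 4, 4, 4, 5, 5, 7, 9)
result <- variance_test(sample.values, hyp.var = 4, alternative = "greater")
result$statistic   # 7 * (32/7) / 4, i.e. 8 up to floating-point error
result$p.value
```

With the DAX data loaded as above, `variance_test(values, 1000000, "greater")$p.value` should reproduce the right-tailed result.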


## Opportunities

This website does not yet contain a solution for this task in any of the following software packages.

• Python
• Excel
• Julia

If you can contribute a solution using any of these pieces of software, see our Contributing page for how to help extend this website.