# How to compute a confidence interval for the ratio of two population variances

## Description

Let’s say we want to compute a confidence interval for the ratio of two population variances, $\sigma_1^2$ and $\sigma_2^2$. We take a sample from each population, $x_1, x_2, x_3, \ldots, x_{n_1}$ and $x'_1, x'_2, x'_3, \ldots, x'_{n_2}$, and compute their sample variances, $s_1^2$ and $s_2^2$. How do we use these to compute a confidence interval for $\frac{\sigma_1^2}{\sigma_2^2}$?
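
For reference, the solutions below appear to follow the standard F-distribution interval, which assumes the two samples are independent and drawn from normally distributed populations. The notation here is introduced for illustration and is not from the original text: $s_1^2$ and $s_2^2$ are the sample variances, $n_1$ and $n_2$ are the two sample sizes, and $F_{p;\,a,\,b}$ is the $p$-quantile of the F-distribution with $a$ and $b$ degrees of freedom. Under those assumptions, a $100(1-\alpha)\%$ confidence interval for $\frac{\sigma_1^2}{\sigma_2^2}$ is

$$\left[\ \frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{1-\alpha/2;\ n_1-1,\ n_2-1}},\quad \frac{s_1^2}{s_2^2}\cdot F_{1-\alpha/2;\ n_2-1,\ n_1-1}\ \right]$$

The upper bound uses the identity $\frac{1}{F_{\alpha/2;\ a,\ b}} = F_{1-\alpha/2;\ b,\ a}$, which is why the degrees of freedom are swapped in the second critical value.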

## Using SciPy, in Python

We’ll use R’s dataset EuStockMarkets as an example; of course you should replace this example data with your actual data when using this code. This dataset has information on the daily closing prices of 4 European stock indices. We’re going to compare the variability of Germany’s DAX and France’s CAC closing prices here. Let’s load in the dataset using the process explained in how to quickly load some sample data.

from rdatasets import data
import pandas as pd

# Load in the EuStockMarkets data and convert to a DataFrame
EuStockMarkets = data('EuStockMarkets')
df = pd.DataFrame(EuStockMarkets[['DAX', 'CAC']])

# Our two samples are its DAX and CAC columns
sample1 = df['DAX'].tolist()
sample2 = df['CAC'].tolist()


Now that we have our data loaded we can compute the confidence interval. You can change the confidence level by changing the value of $\alpha$ below.

# The degrees of freedom in each sample is its length minus 1
sample1_df = len(sample1) - 1
sample2_df = len(sample2) - 1

# Compute the ratio of the variances
import statistics
ratio = statistics.variance(sample1) / statistics.variance(sample2)

# Find the critical values from the F-distribution
from scipy import stats
alpha = 0.05       # replace with your chosen alpha (here, a 95% confidence level)
lower_critical_value = 1 / stats.f.ppf(q = 1 - alpha/2, dfn = sample1_df, dfd = sample2_df)
upper_critical_value = stats.f.ppf(q = 1 - alpha/2, dfn = sample2_df, dfd = sample1_df)

# Compute the confidence interval
lower_bound = ratio * lower_critical_value
upper_bound = ratio * upper_critical_value
lower_bound, upper_bound

(3.190589226470889, 3.827043522824141)


The 95% confidence interval for the ratio of the variances for Germany’s DAX and France’s CAC is $[3.191, 3.827]$.
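
If you need this interval for several pairs of samples, the steps above can be wrapped in a small helper. The following is a minimal sketch rather than part of the original solution: the function name `variance_ratio_ci` and the two short lists are made up for illustration, and the calculation relies on the same normality assumption as the F-distribution approach used above.

```python
import statistics
from scipy import stats

def variance_ratio_ci(sample1, sample2, alpha=0.05):
    """Confidence interval for var(population 1) / var(population 2).

    Hypothetical helper: same steps as the code above, assuming both
    samples are independent and come from normal populations.
    """
    # Degrees of freedom are each sample's length minus 1
    df1 = len(sample1) - 1
    df2 = len(sample2) - 1

    # Ratio of the sample variances
    ratio = statistics.variance(sample1) / statistics.variance(sample2)

    # Critical values from the F-distribution (note the swapped
    # degrees of freedom for the upper bound)
    lower = ratio / stats.f.ppf(1 - alpha / 2, df1, df2)
    upper = ratio * stats.f.ppf(1 - alpha / 2, df2, df1)
    return lower, upper

# Made-up illustrative data; replace with your own samples
a = [4.1, 5.3, 6.8, 5.9, 4.7, 6.2, 5.5, 7.0]
b = [5.0, 5.2, 4.9, 5.1, 5.3, 4.8, 5.0]
low, high = variance_ratio_ci(a, b)
print(low, high)
```

Swapping the two samples gives the interval for the reciprocal ratio, so `variance_ratio_ci(b, a)` should return `(1 / high, 1 / low)` up to floating-point rounding.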

## Solution, in R

We’ll use R’s dataset EuStockMarkets as an example; of course you should replace this example data with your actual data when using this code. This dataset has information on the daily closing prices of 4 European stock indices. We’re going to compare the variability of Germany’s DAX and France’s CAC closing prices here.

# The datasets package ships with R, so no installation is needed
library(datasets)

# Load in the EuStockMarkets data and convert to a data frame
EuStockMarkets <- data.frame(EuStockMarkets)

# Our two samples are its DAX and CAC columns
sample.1 <- EuStockMarkets$DAX
sample.2 <- EuStockMarkets$CAC


Now that we have our data loaded we can compute the confidence interval. You can change the confidence level by changing the value of $\alpha$ below.

# The degrees of freedom in each sample is its length minus 1
df_1 <- length(sample.1) - 1
df_2 <- length(sample.2) - 1

# Compute the ratio of the variances
test.stat.ratio <- var(sample.1)/var(sample.2)

# Find the critical values from the F-distribution
alpha <- 0.05       # replace with your chosen alpha (here, a 95% confidence level)
lower_critical_value <- 1 / qf(p = alpha/2, df1 = df_1, df2 = df_2, lower.tail = FALSE)
upper_critical_value <- qf(p = alpha/2, df1 = df_2, df2 = df_1, lower.tail = FALSE)

# Compute the confidence interval and print it out
lower_bound <- test.stat.ratio*lower_critical_value
upper_bound <- test.stat.ratio*upper_critical_value
lower_bound
upper_bound

[1] 3.190589

[1] 3.827044


The 95% confidence interval for the ratio of the variances for Germany’s DAX and France’s CAC is $[3.191, 3.827]$.