How to compute a confidence interval for the ratio of two population variances (in Python, using SciPy)

Task

Let’s say we want to compute a confidence interval for the ratio of two population variances. We take two samples of data, $x_{1}, x_{2}, x_{3}, \dots, x_{n}$ and $x_{1}^{'}, x_{2}^{'}, x_{3}^{'}, \dots, x_{m}^{'}$ (the sample sizes need not be equal), and compute their sample variances, $s_{1}^{2}$ and $s_{2}^{2}$ . How do we compute a confidence interval for the ratio of the population variances, $\frac{σ_{1}^{2}}{σ_{2}^{2}}$ ?

Solution

We’ll use R’s dataset EuStockMarkets as an example; of course you should replace this example data with your actual data when using this code. This dataset has information on the daily closing prices of 4 European stock indices. We’re going to compare the variability of Germany’s DAX and France’s CAC closing prices here. Let’s load in the dataset using the process explained in how to quickly load some sample data.

from rdatasets import data
import pandas as pd

# Load in the EuStockMarkets data and convert to a DataFrame
EuStockMarkets = data('EuStockMarkets')
df = pd.DataFrame(EuStockMarkets[['DAX', 'CAC']])

# Our two samples are its DAX and CAC columns
sample1 = df['DAX'].tolist()
sample2 = df['CAC'].tolist()

Now that we have our data loaded we can compute the confidence interval. You can change the confidence level by changing the value of $α$ below.
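For reference, the interval computed by the code can be written out as a formula. The notation here is introduced only for this sketch: $s_{1}^{2}$ and $s_{2}^{2}$ are the two sample variances, $n_{1}$ and $n_{2}$ are the sample sizes, and $F_{p;\,a,\,b}$ is the $p$-quantile of the F-distribution with $a$ numerator and $b$ denominator degrees of freedom. The formula assumes the two samples are independent and drawn from approximately normal populations.

$$\left[\ \frac{s_{1}^{2}}{s_{2}^{2}} \cdot \frac{1}{F_{1-α/2;\ n_{1}-1,\ n_{2}-1}}\ ,\ \ \frac{s_{1}^{2}}{s_{2}^{2}} \cdot F_{1-α/2;\ n_{2}-1,\ n_{1}-1}\ \right]$$

Note that the degrees of freedom swap places in the upper bound, which is why the two calls to the F-distribution quantile function in the code take their arguments in opposite order.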

# The degrees of freedom in each sample is its length minus 1
sample1_df = len(sample1) - 1
sample2_df = len(sample2) - 1

# Compute the ratio of the variances
import statistics
ratio = statistics.variance(sample1) / statistics.variance(sample2)

# Find the critical values from the F-distribution
from scipy import stats
alpha = 0.05       # replace with your chosen alpha (here, a 95% confidence level)
lower_critical_value = 1 / stats.f.ppf(q = 1 - alpha/2, dfn = sample1_df, dfd = sample2_df)
upper_critical_value = stats.f.ppf(q = 1 - alpha/2, dfn = sample2_df, dfd = sample1_df)

# Compute the confidence interval
lower_bound = ratio * lower_critical_value
upper_bound = ratio * upper_critical_value
lower_bound, upper_bound

(3.190589226470889, 3.827043522824141)

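The 95% confidence interval for the ratio of the variance of Germany’s DAX closing prices to the variance of France’s CAC closing prices is $[3.191, 3.827]$ . Because the entire interval lies above 1, the data suggest that DAX closing prices are more variable than CAC closing prices.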
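If you need this interval for several pairs of samples, the steps above can be collected into one function. The following is a minimal sketch, not part of the original solution: the function name `variance_ratio_ci` and the two short lists of numbers are made up for illustration, and the calculation assumes independent samples from approximately normal populations.

```python
import statistics
from scipy import stats

def variance_ratio_ci(sample1, sample2, alpha=0.05):
    """Confidence interval for var(population 1) / var(population 2).

    Hypothetical helper that wraps the steps shown above.
    """
    # Degrees of freedom: sample size minus 1
    df1 = len(sample1) - 1
    df2 = len(sample2) - 1
    # Ratio of the sample variances
    ratio = statistics.variance(sample1) / statistics.variance(sample2)
    # Critical values from the F-distribution (note the swapped df order)
    lower = ratio / stats.f.ppf(1 - alpha / 2, dfn=df1, dfd=df2)
    upper = ratio * stats.f.ppf(1 - alpha / 2, dfn=df2, dfd=df1)
    return lower, upper

# Made-up illustration data; replace with your own samples
a = [4.1, 5.3, 6.8, 3.9, 7.2, 5.5, 6.1, 4.7]
b = [5.0, 5.2, 4.9, 5.1, 5.3, 4.8, 5.0, 5.2]

lower, upper = variance_ratio_ci(a, b, alpha=0.05)
print(lower, upper)
```

Swapping the two samples gives the interval for the reciprocal ratio, so `variance_ratio_ci(b, a)` returns the reciprocals of these bounds in reverse order.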

Content last modified on 24 July 2023.

Contributed by Elizabeth Czarniak (CZARNIA_ELIZ@bentley.edu)