
How to compute a confidence interval for a regression coefficient (in Python, using statsmodels)


Task

Say we have a linear regression model, either simple (one explanatory variable) or multiple (several explanatory variables). How do we compute a confidence interval for the coefficient of one of the explanatory variables in the model?

Solution

We’ll assume that you have fit a single linear model to your data, as in the code below, which uses fake example data. You can replace it with your actual data.

import statsmodels.api as sm

xs = [  34,   9,  78,  60,  22,  45,  83,  59,  25 ]
ys = [ 126, 347, 298, 309, 450, 187, 266, 385, 400 ]

xs = sm.add_constant( xs )
model = sm.OLS( ys, xs )
results = model.fit()

We can use the conf_int() method of the fitted results object to find confidence intervals for the model coefficients. The alpha parameter is the significance level, so the confidence level is 1 - alpha; the default, alpha=0.05, gives 95% intervals. Note that if you have a multiple regression model, the method returns a confidence interval for every coefficient.

results.conf_int( alpha=0.05 )

array([[172.63807531, 535.52642049],
       [ -4.49196063,   2.47393542]])

Each row of the array is the 95% confidence interval for one coefficient in the model, starting with the intercept and followed by each regression coefficient in order. Accordingly, the 95% confidence interval for the regression coefficient of x is $[-4.49196063,2.47393542]$. Because this interval contains zero, the example data do not show a slope significantly different from zero at the 5% level.

Content last modified on 24 July 2023.


Contributed by Andrew Quagliaroli (aquagliaroli@falcon.bentley.edu)