How to compute the standard error of the estimate for a model

Description

One measure of the goodness of fit of a model is the standard error of its estimates. If the actual values are $y_i$ and the estimates are $\hat y_i$, the definition of this quantity is as follows, for $n$ data points.

\[\sigma_{\text{est}} = \sqrt{ \frac{ \sum (y_i-\hat y_i)^2 }{ n } }\]
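
To make the formula concrete, here is a minimal pure-Python sketch that evaluates it by hand. It assumes a simple one-predictor linear model fitted by ordinary least squares and reuses the example data from the solutions on this page; the variable names (`s_xx`, `ss_resid`, and so on) are illustrative and do not come from any library.

```python
# Example data, copied from the solutions on this page.
x = [34, 9, 78, 60, 22, 45, 83, 59, 25]
y = [126, 347, 298, 309, 450, 187, 266, 385, 400]
n = len(x)

# Closed-form least-squares fit of y = intercept + slope * x.
x_mean = sum(x) / n
y_mean = sum(y) / n
s_xx = sum((xi - x_mean) ** 2 for xi in x)
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

# Estimates (y hat) and the sum of squared residuals.
y_hat = [intercept + slope * xi for xi in x]
ss_resid = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

# The definition given above, which divides by n.
sigma_est = (ss_resid / n) ** 0.5

# A common variant divides by the residual degrees of freedom (n - 2)
# instead; R labels that variant "Residual standard error".
resid_std_error = (ss_resid / (n - 2)) ** 0.5

print(f"sigma_est (divide by n):        {sigma_est:.2f}")
print(f"residual std. error (n - 2 df): {resid_std_error:.2f}")
```

This quantity is a single number for the whole model. The "std err" and "Std. Error" columns shown in the solutions on this page are a related but different thing: they give one standard error per coefficient. If a fitted statsmodels result is at hand, its `ssr` and `nobs` attributes should supply the same sum of squared residuals and sample size without the manual fit.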

If we’ve fit a linear model, how do we compute the standard error of its estimates?

Using statsmodels, in Python

Let’s assume that you have already fit the linear model, as shown in the code below. The example uses a small amount of fake data for illustration. See also how to fit a linear model to two columns of data.

# Below is the fake data as an example. You can replace with your real data.
x = [  34,   9,  78,  60,  22,  45,  83,  59,  25 ]
y = [ 126, 347, 298, 309, 450, 187, 266, 385, 400 ]

# Use statsmodels to build a linear regression model
import statsmodels.api as sm
x = sm.add_constant( x )
model = sm.OLS( y, x ).fit()

The standard error of each coefficient estimate is shown as part of the model summary, reported by the fitted model’s built-in summary function. See the column entitled “std err” in the output below.

model.summary()
/opt/conda/lib/python3.10/site-packages/scipy/stats/_stats_py.py:1736: UserWarning: kurtosistest only valid for n>=20 ... continuing anyway, n=9
  warnings.warn("kurtosistest only valid for n>=20 ... continuing "
OLS Regression Results
Dep. Variable:      y                 R-squared:           0.063
Model:              OLS               Adj. R-squared:      -0.071
Method:             Least Squares     F-statistic:         0.4693
Date:               Mon, 24 Jul 2023  Prob (F-statistic):  0.515
Time:               20:38:01          Log-Likelihood:      -53.705
No. Observations:   9                 AIC:                 111.4
Df Residuals:       7                 BIC:                 111.8
Df Model:           1
Covariance Type:    nonrobust

           coef  std err       t  P>|t|   [0.025   0.975]
const  354.0822   76.733   4.614  0.002  172.638  535.526
x1      -1.0090    1.473  -0.685  0.515   -4.492    2.474

Omnibus:        2.324  Durbin-Watson:     1.618
Prob(Omnibus):  0.313  Jarque-Bera (JB):  1.079
Skew:          -0.832  Prob(JB):          0.583
Kurtosis:       2.674  Cond. No.          112.



Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

If we need to extract just the estimates or their standard errors, we can use code like the following.

model.params # just the model coefficients
array([354.0822479 ,  -1.00901261])
model.bse # just the standard errors of those estimates
array([76.73277161,  1.47293931])

The standard error of the estimate for the intercept is 76.73277161 and the standard error of the estimate for the slope is 1.47293931.
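
As a cross-check on where these two numbers come from, the sketch below recomputes them by hand in pure Python. It assumes the standard textbook formulas for simple (one-predictor) ordinary least squares with the nonrobust covariance type shown in the summary; the helper variable names are illustrative, not part of statsmodels.

```python
import math

# Same example data as in the code above.
x = [34, 9, 78, 60, 22, 45, 83, 59, 25]
y = [126, 347, 298, 309, 450, 187, 266, 385, 400]
n = len(x)

# Closed-form least-squares fit of y = intercept + slope * x.
x_mean = sum(x) / n
y_mean = sum(y) / n
s_xx = sum((xi - x_mean) ** 2 for xi in x)
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

# Residual variance, using n - 2 residual degrees of freedom.
ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
resid_var = ss_resid / (n - 2)

# Textbook standard errors of the two coefficient estimates.
se_slope = math.sqrt(resid_var / s_xx)
se_intercept = math.sqrt(resid_var * (1 / n + x_mean ** 2 / s_xx))

print(f"std err of intercept: {se_intercept:.5f}")
print(f"std err of slope:     {se_slope:.5f}")
```

Under those assumptions the two printed values should agree with `model.bse` up to rounding. With more than one predictor, or with a robust covariance type, these closed-form expressions no longer apply and the library output is the one to rely on.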

Content last modified on 24 July 2023.

Solution, in R

Let’s assume that you have already fit the linear model, as shown in the code below. The example uses a small amount of fake data for illustration. See also how to fit a linear model to two columns of data.

x <- c(34, 9, 78, 60, 22, 45, 83, 59, 25)
y <- c(126, 347, 298, 309, 450, 187, 266, 385, 400)
model <- lm(y ~ x)

The standard error for each estimate is shown as part of the model summary, reported by R’s built-in summary function. See the column entitled “Std. Error” in the output below.

summary(model)
Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max 
-193.776   -4.334   15.459   71.143  118.116 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)  354.082     76.733   4.614  0.00244 **
x             -1.009      1.473  -0.685  0.51536   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 107.1 on 7 degrees of freedom
Multiple R-squared:  0.06283,	Adjusted R-squared:  -0.07106 
F-statistic: 0.4693 on 1 and 7 DF,  p-value: 0.5154

If we need to extract just the model coefficients table, or even just the “Std. Error” column of it, we can use code like the following.

coef(summary(model))
coef(summary(model))[,2]
              Estimate Std. Error    t value    Pr(>|t|)
(Intercept) 354.082248  76.732772  4.6144853 0.002441995
x            -1.009013   1.472939 -0.6850334 0.515358250



(Intercept)           x 
  76.732772    1.472939 

The standard error of the estimate for the intercept is 76.733 and the standard error of the estimate for the slope is 1.473.

Content last modified on 24 July 2023.

Opportunities

This website does not yet contain a solution for this task in any of the following software packages.

  • Excel
  • Julia

If you can contribute a solution using any of these pieces of software, see our Contributing page for how to help extend this website.