How to compute adjusted R-squared
Description
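If we have fit a multivariate linear model, how can we compute the Adjusted $R^2$ for that model to measure its goodness of fit? Unlike plain $R^2$, the Adjusted $R^2$ penalizes the model for each additional predictor, so it does not automatically increase when variables are added.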
Using statsmodels, in Python
We assume you have already fit a multivariate linear model to some data, as in the code below. (If you’re unfamiliar with how to do so, see how to fit a multivariate linear model.) The data shown below is fake, and we assume you will replace it with your own real data if you use this code.
import pandas as pd
import statsmodels.api as sm

# Fake example data: three predictors (x1, x2, x3) and one response (y)
df = pd.DataFrame({
    'x1': [2, 7, 4, 3, 11, 18, 6, 15, 9, 12],
    'x2': [4, 6, 10, 1, 18, 11, 8, 20, 16, 13],
    'x3': [11, 16, 20, 6, 14, 8, 5, 23, 13, 10],
    'y': [24, 60, 32, 29, 90, 45, 130, 76, 100, 120]
})
xs = df[['x1', 'x2', 'x3']]
y = df['y']

# sm.OLS does not include an intercept by default, so add a constant column
xs = sm.add_constant(xs)
model = sm.OLS(y, xs).fit()
You can get a lot of information about your model from its summary.
model.summary()
/opt/conda/lib/python3.11/site-packages/scipy/stats/_stats_py.py:1806: UserWarning: kurtosistest only valid for n>=20 ... continuing anyway, n=10
warnings.warn("kurtosistest only valid for n>=20 ... continuing "
Dep. Variable: | y | R-squared: | 0.594 |
---|---|---|---|
Model: | OLS | Adj. R-squared: | 0.390 |
Method: | Least Squares | F-statistic: | 2.921 |
Date: | Mon, 24 Jul 2023 | Prob (F-statistic): | 0.122 |
Time: | 17:47:21 | Log-Likelihood: | -45.689 |
No. Observations: | 10 | AIC: | 99.38 |
Df Residuals: | 6 | BIC: | 100.6 |
Df Model: | 3 | | |
Covariance Type: | nonrobust | | |
 | coef | std err | t | P>|t| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
const | 77.2443 | 27.366 | 2.823 | 0.030 | 10.282 | 144.206 |
x1 | -2.7009 | 2.855 | -0.946 | 0.381 | -9.686 | 4.284 |
x2 | 7.2989 | 2.875 | 2.539 | 0.044 | 0.265 | 14.333 |
x3 | -4.8607 | 2.187 | -2.223 | 0.068 | -10.211 | 0.490 |
Omnibus: | 2.691 | Durbin-Watson: | 2.123 |
---|---|---|---|
Prob(Omnibus): | 0.260 | Jarque-Bera (JB): | 1.251 |
Skew: | 0.524 | Prob(JB): | 0.535 |
Kurtosis: | 1.620 | Cond. No. | 58.2 |
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
In particular, that printout contains the Adjusted $R^2$ value; it is the second value in the right-hand column, near the top.
You can also obtain it directly, as follows:
model.rsquared_adj
0.390392407508503
In this case, the Adjusted $R^2$ is $0.3904$.
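To see where that number comes from, here is a minimal sketch that recomputes it by hand. It assumes the usual definition for a model with an intercept, $1 - (1 - R^2)\frac{n-1}{n-p-1}$, where $n$ is the number of observations and $p$ the number of predictors. The function name `adjusted_r_squared` and its parameter names are our own, not part of statsmodels.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 for a linear model that includes an intercept.

    Hypothetical helper (not a statsmodels function), using
    1 - (1 - R^2) * (n - 1) / (n - p - 1).
    """
    resid_df = n_obs - n_predictors - 1  # residual degrees of freedom
    if resid_df <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / resid_df


# Inputs read off the summary above: R-squared 0.594, 10 observations,
# 3 predictors. Because 0.594 is already rounded, the result (about 0.391)
# differs slightly from the full-precision value reported by the model.
example = adjusted_r_squared(0.594, 10, 3)
print(round(example, 3))
```

With the fitted model from this page, passing `model.rsquared`, `int(model.nobs)`, and `int(model.df_model)` instead of the rounded values should agree with `model.rsquared_adj` up to floating-point error.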
Content last modified on 24 July 2023.
Solution, in R
We assume you have already fit a multivariate linear model to the data, as in the code below. (If you’re unfamiliar with how to do so, see how to fit a multivariate linear model.) The data shown below is fake, and we assume you will replace it with your own real data if you use this code.
x1 <- c(2, 7, 4, 3, 11, 18, 6, 15, 9, 12)
x2 <- c(4, 6, 10, 1, 18, 11, 8, 20, 16, 13)
x3 <- c(11, 16, 20, 6, 14, 8, 5, 23, 13, 10)
y <- c(24, 60, 32, 29, 90, 45, 130, 76, 100, 120)
model <- lm(y ~ x1 + x2 + x3)
You can get a lot of information about your model from its summary.
summary(model)
Call:
lm(formula = y ~ x1 + x2 + x3)
Residuals:
Min 1Q Median 3Q Max
-25.031 -20.218 -8.373 22.937 35.640
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 77.244 27.366 2.823 0.0302 *
x1 -2.701 2.855 -0.946 0.3806
x2 7.299 2.875 2.539 0.0441 *
x3 -4.861 2.187 -2.223 0.0679 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 30.13 on 6 degrees of freedom
Multiple R-squared: 0.5936, Adjusted R-squared: 0.3904
F-statistic: 2.921 on 3 and 6 DF, p-value: 0.1222
In particular, that printout contains the Adjusted $R^2$ value; it appears on the second-to-last line of the output, labeled "Adjusted R-squared".
You can also obtain it directly, as follows:
summary(model)$adj.r.squared
[1] 0.3903924
In this case, the Adjusted $R^2$ is $0.3904$.
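As a rough cross-check, the same value can be reconstructed by hand from other numbers in the printout. This assumes the usual definition of Adjusted $R^2$ for a model with an intercept; the symbols $n$ (number of observations, here 10) and $p$ (number of predictors, here 3) are our notation, not R's. The denominator $n - p - 1 = 6$ matches the residual degrees of freedom reported in the summary.

$$
R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} = 1 - (1 - 0.5936)\cdot\frac{9}{6} = 1 - 0.6096 = 0.3904
$$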
Opportunities
This website does not yet contain a solution for this task in any of the following software packages.
- Excel
- Julia
If you can contribute a solution using any of these pieces of software, see our Contributing page for how to help extend this website.