How to add a polynomial term to a model

Description

Sometimes, a simple linear model isn’t sufficient to describe the data. How can we include a higher-order term in a regression model, such as the square or cube of one of the predictors?

Using sklearn, in Python

We begin with a fabricated dataset of 20 points. You can replace the code below with code that loads your own real data.

import numpy as np
import pandas as pd

x = np.arange(0,20)                                                  # NumPy array of the integers 0 to 19
y = [3,4,5,7,9,20,31,50,70,75,80,91,101,120,135,160,179,181,190,193] # List of 20 integers

We extend our dataset with a new column (or “feature”), containing $x^2$.

from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures( degree=2, include_bias=False )
x_matrix = x.reshape( -1, 1 )                   # make x a matrix so that we can add columns
poly_features = poly.fit_transform( x_matrix )  # add a second column, so we now have x and x^2
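
To see what the transformer produced, it can help to inspect the shape and the first few rows of the new matrix. The following is a minimal sketch that repeats the setup above so that it runs on its own. With `include_bias=False`, the first column should hold $x$ and the second $x^2$.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(0, 20)                                  # same x values as above
poly = PolynomialFeatures(degree=2, include_bias=False)
poly_features = poly.fit_transform(x.reshape(-1, 1))  # columns: x, x^2

print(poly_features.shape)   # one row per data point, one column per feature
print(poly_features[:4])     # first four rows of the feature matrix
```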

Next, fit a regression model to the new features, which are $x$ and $x^2$.

from sklearn.linear_model import LinearRegression
poly_reg_model = LinearRegression()     # Our model will be linear in the features x and x^2
poly_reg_model.fit( poly_features, y )  # Use regression to create the model
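
As an alternative, the feature expansion and the regression can be chained into a single estimator. The sketch below uses scikit-learn's `make_pipeline` for this; the variable name `pipeline` is an illustrative choice, not part of the solution above. It should produce the same fit as the two-step approach.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

x = np.arange(0, 20)
y = [3,4,5,7,9,20,31,50,70,75,80,91,101,120,135,160,179,181,190,193]

# Chain "add x^2 column" and "fit linear regression" into one estimator
pipeline = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)
pipeline.fit(x.reshape(-1, 1), y)   # the pipeline applies the transform itself

# make_pipeline names each step after its lowercased class name
fitted = pipeline.named_steps["linearregression"]
print(fitted.intercept_, fitted.coef_)
```

With a pipeline, new inputs can be passed to `pipeline.predict` directly, without transforming them by hand first.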

Finally, get the coefficients and intercept of the model.

poly_reg_model.intercept_, poly_reg_model.coef_
(-8.384415584415635, array([6.28628389, 0.27420825]))

Thus the equation for our model of degree two, with coefficients rounded to two decimal places, is $\widehat{y} = -8.38 + 6.29x + 0.27x^2$.
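
As a sanity check, the fitted equation can be evaluated by hand and compared with the model's own predictions. The sketch below refits the model so that it is self-contained. The names `manual` and `x_new`, and the choice of $x = 20$ as a new input, are introduced here for illustration only. Note that new inputs must pass through the same `poly` transformer before calling `predict`.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

x = np.arange(0, 20)
y = [3,4,5,7,9,20,31,50,70,75,80,91,101,120,135,160,179,181,190,193]

poly = PolynomialFeatures(degree=2, include_bias=False)
poly_features = poly.fit_transform(x.reshape(-1, 1))
poly_reg_model = LinearRegression().fit(poly_features, y)

# Evaluate intercept + b1*x + b2*x^2 by hand, using the unrounded coefficients
b0 = poly_reg_model.intercept_
b1, b2 = poly_reg_model.coef_
manual = b0 + b1 * x + b2 * x**2
print(np.allclose(manual, poly_reg_model.predict(poly_features)))

# Predict at a new x value (hypothetical): transform it first, then predict
x_new = np.array([[20]])
prediction = poly_reg_model.predict(poly.transform(x_new))
print(prediction)
```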

Content last modified on 24 July 2023.

Solution, in R

We’re going to use the pressure dataset from R’s built-in datasets package as example data. It contains observations of temperature and the vapor pressure of mercury. You would use your own data instead.

# The pressure dataset ships with base R (package "datasets"), so no extra package is needed
data("pressure")

Let’s model temperature as the dependent variable with pressure squared as the independent variable. To place the “pressure squared” term in the model, we use R’s poly function, as shown below. It automatically includes a pressure term as well (not squared).

1
2
3
# Build the model
model <- lm(temperature ~ poly(pressure, 2), data = pressure)
summary(model)
Call:
lm(formula = temperature ~ poly(pressure, 2), data = pressure)

Residuals:
     Min       1Q   Median       3Q      Max 
-113.095  -44.543    6.157   50.459   75.791 

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)          180.00      14.31  12.581 1.03e-09 ***
poly(pressure, 2)1   361.84      62.36   5.802 2.70e-05 ***
poly(pressure, 2)2  -186.66      62.36  -2.993   0.0086 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 62.36 on 16 degrees of freedom
Multiple R-squared:  0.7271,	Adjusted R-squared:  0.693 
F-statistic: 21.31 on 2 and 16 DF,  p-value: 3.079e-05

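Now we have a model of the form $\hat t = 180 + 361.84\,P_1(p) - 186.66\,P_2(p)$, where $t$ stands for temperature, $p$ for pressure, and $P_1$ and $P_2$ are the orthogonal polynomials of degree one and two that poly constructs by default. Because of that default, the coefficients above do not multiply $p$ and $p^2$ directly. To obtain coefficients on the raw terms $p$ and $p^2$, call poly(pressure, 2, raw = TRUE) instead; the fitted values are the same either way.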

You can change the number in the poly function. For example, if we wanted to create a third-degree polynomial term then we would have specified poly(pressure, 3), and it would have included pressure, pressure squared, and pressure cubed.

Opportunities

This website does not yet contain a solution for this task in any of the following software packages.

  • Excel
  • Julia

If you can contribute a solution using any of these pieces of software, see our Contributing page for how to help extend this website.