Linear regression is a tool for modeling the dependence between two sets of data so that we can use the resulting model to make predictions. The name comes from the fact that we form a linear model (a straight line) of one set of data based on a second. In the literature, the variable that we wish to model is frequently called the response variable, and the variable that we use to model it is called the predictor variable.
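To make the idea of a straight-line model concrete, the following is a minimal sketch of the standard least-squares fit for one predictor, written with the Python standard library only. The function name fit_line and the sample values are made up for illustration and are not part of this recipe.

```python
from statistics import fmean


def fit_line(predictor, response):
    """Return (slope, intercept) of the least-squares line
    response ~ intercept + slope * predictor."""
    x_bar = fmean(predictor)
    y_bar = fmean(response)
    # Sum of cross-deviations and sum of squared deviations of the predictor
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(predictor, response))
    s_xx = sum((x - x_bar) ** 2 for x in predictor)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical sample data, roughly following response = 1 + 2 * predictor
predictor = [0.0, 1.0, 2.0, 3.0, 4.0]
response = [1.1, 2.9, 5.2, 7.1, 8.8]

slope, intercept = fit_line(predictor, response)

# Use the fitted line to predict the response at a new predictor value
prediction = intercept + slope * 5.0
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.2f}")
```

Once the slope and intercept are known, a prediction is simply the value of the line at the new predictor value, which is the sense in which the model is used "to make predictions".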
In this recipe, we'll learn how to use the statsmodels package to perform simple linear regression to model the relationship between two sets of data.
Getting ready
For this recipe, we will need the statsmodels.api module imported under the alias sm, the NumPy package imported as np, the Matplotlib pyplot module imported as plt, and an instance of the NumPy default random number generator. All of this can be achieved with the following commands:
import statsmodels.api as sm
import numpy as np
import...