Training a linear regression model with scikit-learn
Linear regression is one of the most basic ML models we can use, but it is very useful. Most people used linear regression in high school without ever calling it ML, and still use it on a regular basis within spreadsheets. In this recipe, we will explain the basics of linear regression, and then train and evaluate a linear regression model using scikit-learn on the California housing dataset.
Getting ready
Linear regression is not a complicated model, but it is still useful to understand what is under the hood to get the best out of it.
The way linear regression works is pretty straightforward. Returning to the real estate price example, if we consider a feature x such as the apartment's surface area and a label y such as the apartment's price, a natural approach is to find a slope a and an intercept b such that y = ax + b.
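As a minimal sketch of what y = ax + b means in practice, the snippet below computes a price from a surface area. The function name predict_price and the values of a and b are hypothetical, chosen only for illustration; they are not fitted to any dataset.

```python
def predict_price(surface: float, a: float, b: float) -> float:
    """Apply the linear model y = a * x + b to a single feature value."""
    return a * surface + b


# Hypothetical parameters, invented for illustration only
a = 3000.0   # assumed price increase per unit of surface
b = 20000.0  # assumed base price when the surface is zero

price = predict_price(50.0, a, b)
print(price)  # 170000.0
```

Here a controls how quickly the price grows with the surface, and b shifts the whole line up or down.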
Unfortunately, this is not so simple in real life. There is usually no single pair a and b for which this equality holds for every sample....
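A small sketch can make this concrete. The three (surface, price) samples below are made up for illustration. The line passing exactly through the first two samples misses the third one, so no a and b satisfy the equality for all three.

```python
import numpy as np

# Hypothetical (surface, price) samples, invented for illustration only
x = np.array([30.0, 50.0, 80.0])
y = np.array([110_000.0, 170_000.0, 250_000.0])

# Line that passes exactly through the first two samples
a = (y[1] - y[0]) / (x[1] - x[0])
b = y[0] - a * x[0]

# Difference between the true prices and the line's predictions
residuals = y - (a * x + b)
print(residuals)  # the first two are zero, the third is not
```

With these numbers, the third residual is -10000: the line overestimates the price of the largest apartment.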