Applying linear regression to our data
Linear regression is probably the most famous statistical model. It has been around for a long time, since the first concepts behind its development go back to the 1980. This model mainly owes its popularity to the relative ease of application and its great interpretability.
The intuition behind linear regression
When applying linear regression to a set of data, we are making the following assumption—the relationship between one (or more) explanatory variable and the response variable is known and linear. There are two points to consider:
- Known: We are assuming the existence of some kind of law ruling the level of y given the level of x. We are also usually implying that the level of x directly causes the level of y. We know from our discussion about linear correlation that this is not necessarily true and that further evidence is needed to assume causality.
- Linear: The relation between the explanatory variables and a response is assumed to be representable...