What is a linear regression?
A linear regression is a statistical model that describes the relationship between a dependent variable and one or more independent variables. The dependent variable is the one we are trying to explain; the independent variables are the ones we use to explain it. We can use such a model either to predict the dependent variable given the independent variables, or to compare the predictions for different values of the independent variables.
Before diving into the underpinnings, let’s have a look at the intuition for what we are trying to achieve with a linear regression. Let’s imagine a simple random dataset:
x ∼ N(100,50)
y = 2 * x + error
error ∼ N(10,50)
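The dataset defined above can be simulated directly. The following is a minimal sketch in Python, not a definitive implementation. It assumes the second argument of N(·,·) is a standard deviation (the text does not say whether it is a standard deviation or a variance). The names simulate and fit_line, the sample size, and the seed are hypothetical choices made for illustration.

```python
import random
from statistics import fmean


def simulate(n=10_000, seed=0):
    """Draw n (x, y) pairs from the toy dataset described above."""
    rng = random.Random(seed)
    # Assumption: N(mean, sd), i.e. the second parameter is a standard deviation.
    xs = [rng.gauss(100, 50) for _ in range(n)]
    errors = [rng.gauss(10, 50) for _ in range(n)]
    ys = [2 * x + e for x, e in zip(xs, errors)]
    return xs, ys


def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs, ys = simulate()
slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

With a large sample, the fitted slope should land close to 2. The intercept should land near 10 rather than 0, because the error term has mean 10 and that constant shift is absorbed by the intercept.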
We know the relationship is linear because we have designed the dataset to be so. A linear regression...