In the previous chapter, we worked through an example of linear regression with two variables. Here, we see how to apply regression to more than two variables, a technique called multiple linear regression, and how to extract useful information from the results.
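For reference, a multiple linear regression model with k explanatory variables is usually written in the form below. The symbols are generic notation chosen here for illustration; they are not taken from the worksheet used in this example.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon
```

Here y is the response variable, x_1 through x_k are the explanatory variables, \beta_0 is the intercept, each coefficient \beta_j measures the change in y associated with a one-unit change in x_j while the other variables are held fixed, and \varepsilon is the error term. With k = 1, this reduces to the two-variable case from the previous chapter.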
Suppose that you are asked to test whether a company has a hidden policy of gender discrimination. For example, you could be working for a law firm that is bringing a case against this company and needs data-based evidence to back up its claim.
You would start by taking a sample of the company's payroll that includes several variables describing each employee, along with the amount of their most recent salary increase. The following screenshot shows a set of values after they've been entered in an Excel worksheet:
There are four numerical features in the dataset:
- ID:...