Solution 11.1
As part of a research effort to improve metallic-oxide semiconductor sensors for the toxic gas CO (carbon monoxide), you are asked to investigate models of the sensor response for an array of sensors. You will review the data, perform some feature engineering for non-linear features, and then compare a baseline linear regression approach to a random forest model.
Perform the following steps to complete the activity:
- For this exercise, you will need the pandas and numpy libraries, three modules from sklearn, and matplotlib and seaborn. Load them in the first cell of the notebook:
    import pandas as pd
    import numpy as np
    from sklearn.linear_model import LinearRegression as OLS
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.preprocessing import StandardScaler
    import matplotlib.pyplot as plt
    import seaborn as sns
- As we have done before, create a utility function to plot a grid of histograms given the data, which variables to plot, the rows and...
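A histogram-grid helper of the kind this step describes might look like the minimal sketch below. The step text is cut off here, so the function name `plot_histogram_grid`, its parameters (`data`, `variables`, `n_rows`, `n_cols`, `bins`), and the synthetic demo DataFrame are all assumptions made for illustration, not the book's actual code.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_histogram_grid(data, variables, n_rows, n_cols, bins=20):
    """Hypothetical helper: draw one histogram per variable on an n_rows x n_cols grid."""
    # squeeze=False keeps `axes` 2-D even when the grid has a single row or column
    fig, axes = plt.subplots(
        n_rows, n_cols, figsize=(4 * n_cols, 3 * n_rows), squeeze=False
    )
    flat_axes = axes.ravel()
    # zip stops at the shorter sequence, so variables beyond the grid size are skipped
    for ax, var in zip(flat_axes, variables):
        ax.hist(data[var].dropna(), bins=bins)
        ax.set_title(var)
    # Hide any panels left over when there are fewer variables than grid cells
    for ax in flat_axes[len(variables):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig, axes


# Synthetic demo data (an assumption; not the CO sensor dataset)
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {
        "a": rng.normal(size=100),
        "b": rng.uniform(size=100),
        "c": rng.exponential(size=100),
    }
)
fig, axes = plot_histogram_grid(demo, ["a", "b", "c"], n_rows=2, n_cols=2)
```

Returning `fig` and `axes` rather than calling `plt.show()` inside the function leaves the caller free to adjust titles or save the figure; in a notebook the figure is typically displayed at the end of the cell.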