Activity 11.01 – Multiple regression with non-linear models
As part of a research effort to improve metallic-oxide semiconductor sensors for the toxic gas carbon monoxide (CO), you are asked to investigate models of the sensor response for an array of sensors. You will review the data, perform some feature engineering for non-linear features, and then compare a baseline linear regression approach to a random forest model:
- For this exercise, you will need the pandas and numpy libraries, three modules from sklearn, plus matplotlib and seaborn. Load them in the first cell of the notebook:
    import pandas as pd
    import numpy as np
    from sklearn.linear_model import LinearRegression as OLS
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.preprocessing import StandardScaler
    import matplotlib.pyplot as plt
    import seaborn as sns
- As we have done before, create a utility function that plots a grid of histograms, given the data, which variables to plot, the...