In this section, I'll describe how I created the dataset used in this chapter and provide insight into the features and the class labels we'll try to predict. The data is available on GitHub at https://github.com/datameister66/MMLR3rd/blob/master/sim_df.csv.
- Let's get our libraries and data loaded:
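> # Install the packages used in this chapter (needed only once)
> install.packages("glmnet")
> install.packages("caret")
> install.packages("classifierplots")
> install.packages("DataExplorer")
> install.packages("InformationValue")
> install.packages("Metrics")
> install.packages("ROCR")
> install.packages("tidyverse")
> # Load magrittr for the pipe operator; it is installed with the tidyverse
> library(magrittr)
> # Avoid scientific notation in printed output
> options(scipen=999)
> # Read the data from the current working directory
> sim_df <- readr::read_csv('sim_df.csv')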
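The installation calls above download every package again each time they are run. As a minimal sketch, and not part of the original workflow, the following installs only the packages that are not yet present. The names `required_pkgs`, `missing_packages`, and `install_missing` are hypothetical helpers introduced here for illustration; the only library calls used are base R's `installed.packages()`, `install.packages()`, and `setdiff()`.

```r
# Packages used in this chapter (same names as in the listing above)
required_pkgs <- c("magrittr", "glmnet", "caret", "classifierplots",
                   "DataExplorer", "InformationValue", "Metrics",
                   "ROCR", "tidyverse")

# Hypothetical helper: return the packages in `pkgs` that are not installed.
# `installed` defaults to the names of all packages found in the library paths.
missing_packages <- function(pkgs,
                             installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# Hypothetical helper: install only what is missing, using the configured
# CRAN mirror. Returns the names it attempted to install, invisibly.
install_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}
```

Calling `install_missing(required_pkgs)` once at the top of a session should then be a no-op when everything is already installed.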
The dataframe contains 10,000 observations of 17 variables, consisting of 16 input features and 1...