Let's take a look at specific groups of predictor variables that commonly pop up in healthcare data.
Preprocessing the predictor variables
Visit information
The first feature category in the ED2013 dataset describes the timing of the visit: month, day of week, and arrival time. It also includes the waiting time and the length of visit, both measured in minutes.
Month
Let's analyze the VMONTH predictor in more detail. The following code prints each distinct value of VMONTH in the training set along with its count:
print(X_train...
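As a rough illustration of this kind of tabulation, here is a minimal sketch that counts the distinct values of a column with pandas. It assumes X_train is a pandas DataFrame with a VMONTH column. The small frame below is a hypothetical stand-in, not the ED2013 data, and its two-digit string codes are placeholders for however the months are actually encoded.

```python
import pandas as pd

# Hypothetical stand-in for the training set (assumption: the real
# X_train is a pandas DataFrame that has a VMONTH column).
X_train = pd.DataFrame({"VMONTH": ["01", "02", "02", "12", "01", "02"]})

# value_counts() tallies each distinct value; sort_index() orders the
# result by month code instead of by frequency.
month_counts = X_train["VMONTH"].value_counts().sort_index()
print(month_counts)
```

Sorting by index keeps the months in calendar order, which makes gaps or unexpected codes easier to spot than the default frequency ordering.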