We will go back to the Automobile dataset, as we are going to use the bagging regressor this time. The bagging meta-estimator is very similar to random forest. It builds multiple estimators, each one trained on a random subset of the data drawn with bootstrap sampling. The key difference is that although decision trees are used as the base estimators by default, any other estimator can be plugged in as well. Out of curiosity, let's use the K-Nearest Neighbors (KNN) regressor as our base estimator. However, we first need to prepare the data to suit the new regressor's needs.
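To make the idea concrete, here is a minimal sketch of bagging with a KNN base estimator in scikit-learn. Synthetic data stands in for the Automobile dataset, and the hyperparameter values are illustrative, not tuned; note that the `estimator` parameter was called `base_estimator` in scikit-learn versions before 1.2:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the prepared Automobile features and target
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

# KNN replaces the default decision-tree base estimator
knn = KNeighborsRegressor(n_neighbors=5)

# Each of the 10 KNN estimators is trained on a bootstrap sample of the data
bagging = BaggingRegressor(
    estimator=knn,  # base_estimator in scikit-learn < 1.2
    n_estimators=10,
    random_state=42,
)

bagging.fit(X, y)
y_pred = bagging.predict(X)
```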
Preparing a mixture of numerical and categorical features
It is recommended to put all features on the same scale when using distance-based algorithms such as KNN; otherwise, features with higher magnitudes will dominate the distance metric and overshadow the rest. As we have a mixture of numerical and categorical features here...
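One common way to handle the two feature types before fitting the bagged KNN is to scale the numerical columns and one-hot encode the categorical ones in a single transformer. The sketch below assumes pandas and scikit-learn, and the column names are placeholders rather than the Automobile dataset's actual columns:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Toy frame standing in for the Automobile dataset's mixed feature types
df = pd.DataFrame({
    "curb_weight": [2548, 2823, 3055],      # numerical, large magnitude
    "num_doors": [2, 4, 4],                 # numerical, small magnitude
    "fuel_type": ["gas", "gas", "diesel"],  # categorical
})

numerical = ["curb_weight", "num_doors"]
categorical = ["fuel_type"]

# Scale numerical columns to [0, 1]; one-hot encode categorical columns,
# whose resulting 0/1 dummies are already on a comparable scale
preprocess = ColumnTransformer([
    ("scale", MinMaxScaler(), numerical),
    ("encode", OneHotEncoder(handle_unknown="ignore"), categorical),
])

X = preprocess.fit_transform(df)
print(X)
```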