Section 2 – Preprocessing, Feature Selection, and Sampling
Anyone who has log-transformed a target or scaled a feature appreciates just how critical preprocessing can be to an analysis. Raise your hand if you were ever confident that your model approximated truth, then tried a fairly obvious transformation and realized just how far from truth your original model was. Encoding, transforming, and scaling data is not a gimmick, though people sometimes have that impression. We apply preprocessing because 1) it gets us closer to capturing a real-world process, and 2) many machine learning algorithms simply work better with scaled data.
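As a minimal sketch of both ideas, here is a log transformation of a skewed target and a standardization of a feature. The arrays are hypothetical values invented purely for illustration:

```python
import numpy as np

# Hypothetical skewed target spanning several orders of magnitude.
y = np.array([1.0, 10.0, 100.0, 1000.0])
# Hypothetical feature on its raw scale.
x = np.array([2.0, 4.0, 6.0, 8.0])

# log1p handles zeros gracefully and compresses the target's range.
y_log = np.log1p(y)

# Standardize the feature: subtract the mean, divide by the standard
# deviation, giving zero mean and unit variance.
x_scaled = (x - x.mean()) / x.std()
```

After the transform, a one-unit change in `y_log` corresponds to a multiplicative change in `y`, which is often closer to how the underlying process actually behaves.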
Feature selection is equally important. A good adage: never build a model with N features when N - 1 features will do just as nicely. It is worth remembering that the issue is more nuanced than simply having "too many" features. There are times when 3 features are too many and others when 103 features are perfectly...