What does preprocessing mean?
Beyond selecting the specific set of data you want to use for a particular machine learning project, you also need to preprocess that data. This typically involves tasks such as formatting, cleaning, and sampling (or profiling). We won't delve far into the definitions of these tasks and will assume that the reader grasps their meaning and purpose. Briefly: formatting puts the data source into a form that can be easily understood and consumed within your project; cleaning is mostly concerned with removing unwanted data; and sampling is about reducing the overall size of the data for performance reasons.
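To make the three tasks concrete, here is a minimal sketch in Python using only the standard library. The data, the column names, and the helper functions (`format_rows`, `clean_rows`, `sample_rows`) are all hypothetical and invented for illustration; a real project would adapt each step to its own data source and rules.

```python
import csv
import io
import random

# Hypothetical raw data, invented purely for illustration.
RAW = """name , age
Alice, 34
Bob,
Carol, 29
Dave, n/a
Eve, 41
"""


def format_rows(text):
    """Formatting: parse raw text into dictionaries with tidy keys and values."""
    reader = csv.reader(io.StringIO(text))
    header = [h.strip() for h in next(reader)]
    return [dict(zip(header, (v.strip() for v in row))) for row in reader if row]


def clean_rows(rows):
    """Cleaning: drop records whose age is missing or not a whole number."""
    cleaned = []
    for row in rows:
        if row.get("age", "").isdigit():
            cleaned.append({"name": row["name"], "age": int(row["age"])})
    return cleaned


def sample_rows(rows, k, seed=0):
    """Sampling: keep a random subset of at most k records."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(rows, min(k, len(rows)))


formatted = format_rows(RAW)         # five records, all values still strings
cleaned = clean_rows(formatted)      # records with a usable age only
sampled = sample_rows(cleaned, k=2)  # a smaller subset of the cleaned data
```

The order matters in this sketch: cleaning relies on the tidy keys produced by formatting, and sampling is applied last so that the reduced set contains only usable records.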
Being a developer at heart, I am eager to take on these tasks by crafting a script or by perusing and selecting a function from an open source library. Instead, let...