Preprocessing Time-Series
Preprocessing is a crucial step in machine learning that is nonetheless often neglected. Many books skip it entirely or don't cover it in any depth. And when a machine learning project is presented to outsiders, their curiosity is naturally drawn to the algorithm rather than to the dataset or the preprocessing.
One reason for the relative silence on preprocessing could be that it's less glamorous than machine learning itself. It is, however, often the step that takes the most time, sometimes estimated at around 98% of the whole machine learning process. It is also often in preprocessing that relatively little effort can have a great impact on the eventual performance of the machine learning model. The quality of the data goes a long way toward determining the outcome: in the worst case, low-quality input can invalidate the machine learning work altogether (this is summarized in the adage "garbage in, garbage out...