Summary
This completes the overview of the most commonly used data filtering and smoothing techniques. There are other types of data preprocessing algorithms, such as normalization and the analysis and reduction of variance. Identifying missing values is also essential to avoid the garbage-in, garbage-out conundrum that plagues so many projects that use machine learning for regression or classification.
Scala can be used effectively to keep the code understandable and to avoid cluttering method signatures with unnecessary arguments.
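As a reminder of what this can look like in practice, the following is a minimal sketch rather than the chapter's actual code. It assumes a simple moving average whose window size is supplied through an implicit parameter; the names `Window`, `Smoothing`, and `movingAverage` are hypothetical and chosen only for illustration.

```scala
// Hypothetical names throughout; this is an illustrative sketch only.

// Configuration shared by several smoothing calls.
final case class Window(size: Int) {
  require(size > 0, "window size must be positive")
}

object Smoothing {
  // Simple moving average over a sliding window. The window is resolved
  // from implicit scope, so call sites only pass the data.
  def movingAverage(xs: Seq[Double])(implicit w: Window): Vector[Double] =
    if (xs.size < w.size) Vector.empty   // not enough points for one full window
    else xs.sliding(w.size).map(_.sum / w.size).toVector
}

object Demo extends App {
  // Declared once, then picked up by every call in this scope.
  implicit val window: Window = Window(3)

  val smoothed = Smoothing.movingAverage(Seq(1.0, 2.0, 3.0, 4.0, 5.0))
  println(smoothed)
}
```

The window can still be passed explicitly, as in `Smoothing.movingAverage(xs)(Window(5))`, when a single call needs a different setting. The trade-off is that implicit values are less visible at the call site, so this style works best for configuration that rarely changes.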
The three techniques presented in this chapter, from the simplest moving averages and Fourier transform to the more elaborate Kalman filter, go a long way toward preparing data for the concepts introduced in the next chapter: unsupervised learning and, more specifically, clustering.