This model is now ready to be used to predict things. Is this the best model? No, it's not. Finding the best model is a never ending quest. To be sure, there are indefinite ways of improving this model. One can use LASSO methods to determine the importance of variables before using them.
The model is not only the linear regression, but also the data cleaning functions and ingestion functions that come with it. This leads to a very high number of tweakable parameters. Maybe if you didn't like the way I imputed data, you can always write your own method!
Furthermore the code in this chapter can be cleaned up further. Instead of returning so many values in the clean function, a new tuple type can be created to hold the Xs and Ys—a data frame of sorts. In fact, that's what we're going to build in the upcoming chapters. Several...