Summary
In this chapter, we learned how to implement an ML pipeline that performs hyperparameter tuning on a house price prediction example. We built this workflow in five steps, each a pipeline that writes relevant files and information to a Pachyderm output repository. In our first pipeline, we performed an exploratory analysis to gain a general understanding of the dataset and built a heatmap that showed the correlations between the parameters in our dataset. In our second pipeline, we dropped columns with missing information and removed parameters that have little influence on the sale price of a house. In our third pipeline, we removed outliers, that is, values that fell outside the standard range. Our fourth pipeline split our dataset into two parts: one for training and the other for testing. Finally, our fifth pipeline performed hyperparameter tuning for the alpha parameter and found the best alpha for our use case. The last...