In this chapter, we will use the pandas, NumPy, Matplotlib, SciPy and scikit-learn Python libraries. These libraries are bundled in the free Anaconda Python distribution (https://www.anaconda.com/distribution/), which you can install as described in the Technical Requirements section of Chapter 1, Foreseeing Variable Problems when Building ML Models.
We will also use the open source Python library Feature-engine, which can be installed using pip:
pip install feature-engine
We will use the Boston House Prices dataset from scikit-learn, which contains no missing data. When trying the recipes on your own dataset, make sure you first impute any missing values with one of the techniques we covered in Chapter 2, Imputing Missing Data.
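Before applying the recipes to your own data, it can help to confirm whether any variables contain missing values. The following is a minimal sketch of one way to do this with pandas; the toy DataFrame and its column names (rooms, age, price) are hypothetical and only stand in for your own dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for your own dataset.
data = pd.DataFrame({
    "rooms": [6.5, np.nan, 7.1, 5.9],
    "age": [65.2, 78.9, np.nan, np.nan],
    "price": [24.0, 21.6, 34.7, 33.4],
})

# Fraction of missing values in each column.
missing_fraction = data.isnull().mean()

# Names of the columns that contain at least one missing value.
columns_to_impute = missing_fraction[missing_fraction > 0].index.tolist()

print(missing_fraction)
print(columns_to_impute)
```

If the list of columns is empty, the dataset has no missing values and the recipes can be applied directly; otherwise, impute the listed columns first.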