Technical requirements
In this chapter, we will use the Matplotlib, pandas, NumPy, scikit-learn, and feature-engine Python libraries. If you need to install Python, the free Anaconda Python distribution (https://www.anaconda.com/) comes with most numerical computing libraries out of the box.
Feature-engine can be installed with pip:
pip install feature-engine
If you use Anaconda, you can install feature-engine with conda:
conda install -c conda-forge feature_engine
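After installing, it can help to confirm that each library is visible to the Python interpreter you plan to use. The following is a minimal sketch using only the standard library. The helper name installed_versions and the PACKAGES list are illustrative and not part of any of these libraries. The names are the distribution names as published on PyPI, which can differ from the import names (for example, scikit-learn is imported as sklearn).

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI, not import names.
PACKAGES = ["matplotlib", "pandas", "numpy", "scikit-learn", "feature-engine"]


def installed_versions(packages):
    """Map each distribution name to its version string, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Report what is available in the current environment.
for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any library reported as NOT INSTALLED can be added with the pip or conda commands shown above.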
We will use the Credit Approval dataset from the UCI Machine Learning Repository (https://archive.ics.uci.edu/). To prepare the dataset, follow these steps:
- Visit https://archive.ics.uci.edu/ml/machine-learning-databases/credit-screening/.
- Click on crx.data to download the data.
Figure 1.1 – Screenshot of the dataset download page
- Save crx.data to the folder where you will run the following commands.
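Before continuing, you may want to check that crx.data ended up in the current working directory. The sketch below is one way to do that with the standard library. The helper preview_file is a hypothetical name introduced here for illustration, and the sketch assumes the file is plain text saved under the name crx.data.

```python
from pathlib import Path


def preview_file(path, n_lines=3):
    """Return the first n_lines of a text file, or None if the file is missing."""
    path = Path(path)
    if not path.is_file():
        return None
    with path.open(encoding="utf-8") as handle:
        # zip stops after n_lines, or earlier if the file is shorter.
        return [line.rstrip("\n") for _, line in zip(range(n_lines), handle)]


preview = preview_file("crx.data")
if preview is None:
    print(f"crx.data not found in {Path.cwd()}")
else:
    print("\n".join(preview))
```

If the file is reported as not found, move it into the printed directory or start the notebook from the folder where the file was saved.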
Open a Jupyter notebook and run the following...