Preparations
You will find the code for this example here: https://github.com/PacktPublishing/Interpretable-Machine-Learning-with-Python/blob/master/Chapter02/CVD.ipynb.
Loading the libraries
To run this example, you need to install the following libraries:
mldatasets to load the dataset
pandas and numpy to manipulate it
statsmodels to fit the logistic regression model
sklearn (scikit-learn) to split the data
matplotlib to visualize the interpretations
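Before loading anything, it can help to confirm that these libraries are actually available in your environment. The following is a minimal sketch, not part of the book's notebook: the helper name missing_modules and the REQUIRED list are hypothetical, and the check relies only on the standard library's importlib.util.find_spec, which returns None when a top-level module cannot be located.

```python
from importlib.util import find_spec

# Import names taken from the list above (sklearn is the import name for scikit-learn).
# REQUIRED and missing_modules are illustrative names, not part of the book's code.
REQUIRED = ["mldatasets", "pandas", "numpy", "statsmodels", "sklearn", "matplotlib"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```

If anything is reported as not installed, install it before running the rest of the example.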
You should load all of them first:
import math
import mldatasets
import pandas as pd
import numpy as np
import statsmodels.api as sm
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
Understanding and preparing the data
The data to be used in this example should then be loaded into a DataFrame we call cvd_df:
cvd_df = mldatasets.load("cardiovascular-disease")
This should give you 70,000 records and 12 columns. We can take a peek...