Introduction
In the previous chapter, we worked with a credit card dataset to predict whether customers would default. We applied data analysis techniques such as univariate and bivariate analysis to understand and process customers' payment histories and to identify relationships between features.
In this chapter, we will work with a dataset from the medical industry: the Heart Disease dataset, published in the UCI Machine Learning Repository. The dataset originally contained 75 attributes, but published experiments have used only 14 of them, so we will use the same subset for our analysis. The dataset uses medical terminology that may be unfamiliar to you, but each feature is explained in the exercises so that you know what you are analyzing.
We will be checking for outliers, missing values, and the trends and...