Now we will split our dataset into training and testing datasets. We're going to use sklearn's train_test_split function to hold out 20% of the total data as a testing dataset, leaving the remaining 80% as the training dataset. The class values in this dataset cover multiple types of heart disease, with values ranging from 0 (healthy) to 4 (severe heart disease). Because this gives us five classes rather than a simple yes/no outcome, we will convert our class data into categorical labels.
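Before working with the real data, here is a minimal sketch of both ideas on synthetic stand-in data. The array sizes, the random seed, the random_state value, and the one_hot helper are assumptions made purely for illustration; the helper uses plain NumPy to one-hot encode the labels, which is one common way to produce categorical labels and may differ from the utility used later in this walkthrough.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the heart disease dataset):
# 100 rows, 13 arbitrary numeric features, class labels from 0 to 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 13))
y = rng.integers(0, 5, size=100)

# Hold out 20% of the rows for testing; the other 80% is used for training.
# random_state is fixed only so that the split is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


def one_hot(labels, num_classes=5):
    """Hypothetical helper: turn integer labels into one-hot rows."""
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes)[labels]


Y_train = one_hot(y_train)
Y_test = one_hot(y_test)

print(X_train.shape, X_test.shape)
print(Y_train.shape, Y_test.shape)
```

Each row of Y_train and Y_test contains a single 1 in the column that matches the original class value, so the integer labels can always be recovered with np.argmax along axis 1.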
Let's create the X and y datasets for training. First, we want to split the class label off into its own y value. We will import the model_selection package from sklearn and convert the DataFrame to a NumPy array, X, taking every column except the class attribute. Likewise, for y, we will convert the DataFrame to a NumPy array, but here we will take only the class attribute. Then...