Using the Breast Cancer Wisconsin (Diagnostic) Data Set
Scikit-learn ships with the Breast Cancer Wisconsin (Diagnostic) Data Set, a classic dataset that is often used to illustrate binary classification. It contains 30 features computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. Each sample carries a binary label: M for malignant or B for benign. Interested readers can find more information at https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic).
Examining the Relationship Between Features
You can load the Breast Cancer dataset by first importing the load_breast_cancer() function from the sklearn.datasets module and then calling it, as follows:
from sklearn.datasets import load_breast_cancer
cancer = load_breast_cancer()
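Before going further, it can help to confirm what the loaded object contains. The following is a minimal sketch, assuming the default return value of load_breast_cancer(), which exposes the attributes data, feature_names, and target_names:

```python
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# data is the feature matrix: one row per sample, one column per feature
print(cancer.data.shape)          # (569, 30)

# feature_names lists the 30 feature labels; show the first three
print(cancer.feature_names[:3])

# target_names gives the class labels in the order of their numeric codes
print(cancer.target_names)        # ['malignant' 'benign']
```

Note that scikit-learn stores the labels as integers rather than as the letters M and B: in cancer.target, 0 stands for malignant and 1 for benign.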
Now that the Breast Cancer dataset has been loaded, it is useful to examine the relationships between some of its features.
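One quick numerical way to examine such a relationship is the Pearson correlation coefficient between two feature columns. The sketch below is only an illustration: the choice of the features "mean radius" and "mean perimeter" is an assumption made for this example, not something prescribed by the dataset.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()
names = list(cancer.feature_names)

# Pick two feature columns by name (an arbitrary choice for illustration)
radius = cancer.data[:, names.index("mean radius")]
perimeter = cancer.data[:, names.index("mean perimeter")]

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
# entry is the correlation between the two features
r = np.corrcoef(radius, perimeter)[0, 1]
print(f"Correlation between mean radius and mean perimeter: {r:.3f}")
```

A value close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. These two features should come out strongly positively correlated, since the perimeter of a shape grows with its radius.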
Plotting the Features in 2D
For...