KNN classifier with breast cancer Wisconsin data example
Breast cancer data from the UCI machine learning repository http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29 is used for illustration purposes. The task is to predict whether a cancer is malignant or benign from the collected features, such as clump thickness, using the KNN classifier:
# KNN Classifier - Breast Cancer
>>> import numpy as np
>>> import pandas as pd
>>> from sklearn.metrics import accuracy_score, classification_report
>>> breast_cancer = pd.read_csv("Breast_Cancer_Wisconsin.csv")
The following are the first few rows, to show what the data looks like. The Class variable takes the values 2 and 4, which represent the benign and malignant classes, respectively. All the other variables take values between 1 and 10 and are very much categorical in nature:
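The properties just described (two class codes, features confined to 1 through 10) can be checked with a few lines of pandas. The following is only a minimal sketch on a small hand-made frame: the toy rows are invented for illustration, and the column name Clump_Thickness is an assumption, since only Class and Bare_Nuclei are named in the text.

```python
import pandas as pd

# Hypothetical toy rows standing in for the real CSV. The values are invented,
# and the column name Clump_Thickness is an assumption.
toy = pd.DataFrame({
    "Clump_Thickness": [5, 3, 8, 1],
    "Bare_Nuclei": [1, 2, 10, 1],
    "Class": [2, 2, 4, 2],
})

# Map the numeric class codes to readable labels (2 = benign, 4 = malignant).
toy["Class_Label"] = toy["Class"].map({2: "benign", 4: "malignant"})

# Count how many rows fall into each class code.
class_counts = toy["Class"].value_counts().to_dict()

# Check that every feature value lies in the expected range of 1 to 10.
feature_cols = ["Clump_Thickness", "Bare_Nuclei"]
features = toy[feature_cols]
in_range = bool(((features >= 1) & (features <= 10)).all().all())

print(toy.head())
print(class_counts, in_range)
```

On the real data, the same checks would be run against the breast_cancer frame loaded earlier, with feature_cols replaced by the actual column names in the file.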
Only the Bare_Nuclei variable has some missing values; here we are replacing...