Our first learning system is the k-nearest neighbors (kNN) classifier. I will describe how the classifier makes predictions, its important hyperparameters, and the problems it faces. Throughout, I will use the classifier to predict the species of iris flowers. So, let's start a Jupyter Notebook for this classifier:
- First, we import the dataset loader and the other functions we need, as follows:
The iris dataset ships with scikit-learn (sklearn). It is one of the library's example datasets, and is well known.
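A minimal sketch of the import step. The exact set of imports is an assumption: it supposes the notebook needs scikit-learn's dataset loader, its train/test splitting helper, and its kNN classifier.

```python
# Assumed imports for this walkthrough (the exact list is a guess).
from sklearn.datasets import load_iris                # loader for the iris example dataset
from sklearn.model_selection import train_test_split  # splits arrays into train and test parts
from sklearn.neighbors import KNeighborsClassifier    # the kNN classifier itself
```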
- Then, we will load an object that contains the iris data and store its contents in Python variables:
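The loading step might look like the following sketch. The variable names `iris_obj`, `data`, and `species` are illustrative assumptions, not necessarily the ones used in the notebook.

```python
from sklearn.datasets import load_iris

# load_iris() returns a dictionary-like object holding the dataset.
iris_obj = load_iris()

# Features (one row per flower) and integer class labels (one per flower).
data, species = iris_obj.data, iris_obj.target

print(data.shape)     # (150, 4): 150 flowers, 4 measurements each
print(species.shape)  # (150,)
```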
- Then, we will divide the dataset into training and test data by using the following lines of code:
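One way to perform the split is with `train_test_split`, as sketched below. The 10% test fraction, the fixed `random_state`, and all variable names are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris_obj = load_iris()
data, species = iris_obj.data, iris_obj.target

# Hold out 10% of the flowers for testing (an assumed fraction);
# random_state is fixed only so that the split is reproducible.
data_train, data_test, species_train, species_test = train_test_split(
    data, species, test_size=0.1, random_state=0
)

print(len(data_train), len(data_test))
```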
Here are the first five rows of the training data:
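To inspect those rows yourself, slicing the training array is enough. This sketch repeats the assumed split from the previous step so that it runs on its own; the rows you see depend on the split parameters, so they may differ from the ones in the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris_obj = load_iris()
data_train, data_test, species_train, species_test = train_test_split(
    iris_obj.data, iris_obj.target, test_size=0.1, random_state=0  # assumed settings
)

# Each row holds one flower's four measurements, in the order given by feature_names.
print(iris_obj.feature_names)
print(data_train[:5])
```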
Here are the first five labels...