Creating our first classifier
Let us start with the simple and beautiful nearest neighbor method from the previous chapter. Although it is not as advanced as other methods, it is very powerful: because it is not model-based, it can fit nearly any data. This beauty, however, comes with a clear disadvantage, which we will discover very soon.
Starting with the k-nearest neighbor (kNN) algorithm
This time, we won't implement it ourselves, but rather take it from the sklearn toolkit, where the classifier resides in sklearn.neighbors. Let us start with a simple 2-nearest neighbor classifier:
>>> from sklearn import neighbors
>>> knn = neighbors.KNeighborsClassifier(n_neighbors=2)
>>> print(knn)
KNeighborsClassifier(algorithm='auto', leaf_size=30, n_neighbors=2, p=2,
           warn_on_equidistant=True, weights='uniform')
It provides the same interface as all the other estimators in sklearn: we train it using fit(), after which we can predict the classes of new data instances using predict().
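As a minimal sketch of this workflow, the following uses a tiny made-up two-feature dataset (the points and labels are purely illustrative, not from our real data):

```python
from sklearn import neighbors

# Illustrative toy data: four instances with two features each,
# belonging to two classes (0 and 1)
X_train = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
y_train = [0, 0, 1, 1]

knn = neighbors.KNeighborsClassifier(n_neighbors=2)
knn.fit(X_train, y_train)  # train on the labeled instances

# New instances are classified by the majority class of their
# two nearest training points
print(knn.predict([[0.0, 0.1]]))  # near the class-0 points -> [0]
print(knn.predict([[1.0, 0.9]]))  # near the class-1 points -> [1]
```

Note that fit() returns the classifier itself, so training and prediction can also be chained in one expression.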