Learning to recognize handwritten digits with a K-nearest neighbors classifier
In this recipe, we will see how to recognize handwritten digits with a K-nearest neighbors (K-NN) classifier. This classifier is a simple but powerful model, well-adapted to complex, highly nonlinear datasets such as images. We will explain how it works later in this recipe.
How to do it...
We import the modules:
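In [1]: import numpy as np
        import sklearn
        import sklearn.datasets as ds
        # sklearn.cross_validation was removed in scikit-learn 0.20;
        # sklearn.model_selection provides the same utilities.
        import sklearn.model_selection as cv
        import sklearn.neighbors as nb
        import matplotlib.pyplot as plt
        %matplotlib inline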
Let's load the digits dataset, part of the datasets module of scikit-learn. This dataset contains handwritten digits that have been manually labeled:
In [2]: digits = ds.load_digits()
        X = digits.data
        y = digits.target
        print(X.min(), X.max())
        print(X.shape)
0.0 16.0
(1797, 64)
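As a quick sanity check on the loaded data, the sketch below (not part of the original recipe) compares digits.data with digits.images, the same pixels stored as 2D arrays, and lists the labels found in digits.target. The variable names images, flattened, same_pixels, and labels are illustrative choices, not names used by the recipe.

```python
import numpy as np
import sklearn.datasets as ds

digits = ds.load_digits()
X = digits.data          # one flattened image per row
images = digits.images   # the same pixels, kept as 2D arrays
y = digits.target        # one integer label per image

# Flattening every 2D image should reproduce the rows of X.
flattened = images.reshape(len(images), -1)
same_pixels = np.array_equal(flattened, X)

# The distinct labels are the digit classes present in the dataset.
labels = np.unique(y).tolist()

print(X.shape, images.shape, y.shape)
print(same_pixels)
print(labels)
```

If the comparison prints True, X can be used directly as a feature matrix, and any row can be turned back into an image for display with a reshape.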
In the matrix X, each row contains 8 * 8 = 64 pixels (in grayscale...