A more complex dataset and a more complex classifier
We will now look at a slightly more complex dataset. This will motivate the introduction of a new classification algorithm and a few other ideas.
Learning about the Seeds dataset
The Seeds dataset is another agricultural dataset; it is still small, but already too big to plot exhaustively as we did with Iris. It consists of measurements of wheat seeds. Seven features are present, as follows:
Area (A)
Perimeter (P)
Compactness (C = 4πA/P²)
Length of kernel
Width of kernel
Asymmetry coefficient
Length of kernel groove
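Compactness is the only feature in this list that is derived from two others: it is defined as C = 4πA/P², so it equals 1 for a perfect circle and shrinks as the shape becomes more elongated. The following is a minimal sketch of that computation; the function name `compactness` and the sample shapes are illustrative assumptions, not part of the dataset or its tooling.

```python
import math

def compactness(area, perimeter):
    """Return C = 4*pi*A / P**2 (hypothetical helper, for illustration only)."""
    return 4.0 * math.pi * area / perimeter ** 2

# A circle of radius 2: A = pi * r**2, P = 2 * pi * r
r = 2.0
circle = compactness(math.pi * r ** 2, 2 * math.pi * r)

# A 1 x 4 rectangle: A = 4, P = 10
rect = compactness(4.0, 10.0)

# The elongated rectangle scores lower than the circle
print(circle, rect)
```

Because compactness depends only on area and perimeter, it adds no new measurement; it re-expresses the first two features as a scale-free measure of roundness.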
There are three classes, corresponding to three wheat varieties: Canadian, Kama, and Rosa. As before, the goal is to classify the variety based on these morphological measurements.
Unlike the Iris dataset, which was collected in the 1930s, this is a very recent dataset, and its features were automatically computed from digital images.
This is how image pattern recognition can be implemented: you can take images in...