We will now look at a slightly more complex dataset, which lets us introduce a new classification algorithm and a few other ideas.
A more complex dataset and the nearest-neighbor classifier
Learning about the seeds dataset
We now look at another agricultural dataset. It is still small, but already too large to plot exhaustively on a page as we did with the Iris dataset. The dataset consists of measurements of wheat seeds, with seven features per seed:
- Area A
- Perimeter P
- Compactness C = 4πA/P²
- Length of kernel
- Width of kernel
- Asymmetry coefficient
- Length of kernel groove
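Unlike the other six features, compactness is not measured directly: it is computed from the area and perimeter. The following is a minimal sketch of that formula; the function name `compactness` and the sample shapes are illustrative assumptions, not part of the dataset or of any library.

```python
import math


def compactness(area, perimeter):
    """Return C = 4*pi*A / P**2 for a shape with the given area and perimeter."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2


# Hypothetical shapes used only to sanity-check the formula.
r = 2.0  # circle radius
circle = compactness(math.pi * r ** 2, 2 * math.pi * r)

s = 3.0  # square side length
square = compactness(s ** 2, 4 * s)

print("circle:", circle)
print("square:", square)
```

A circle gives a compactness of 1 (up to floating-point rounding), which is the largest value any shape can reach, while a square gives π/4, roughly 0.785. The value therefore says how close a kernel's outline is to a circle, independent of its size.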
There are three classes, corresponding to three wheat varieties: Canadian, Kama, and Rosa...