The UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/index.php) maintains over 350 datasets as a service to the machine learning community. These datasets can be used to experiment with various models and algorithms. A typical dataset is arranged in columns: a number of features (inputs) and the desired output, accompanied by a description of what each column means.
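To make the column layout concrete, here is a minimal sketch of separating the feature columns from the output column. The data, the column names (`feature_a`, `feature_b`, `label`), and the helper `split_features_and_target` are all hypothetical, made up for illustration; they do not come from any UCI dataset.

```python
import csv
import io

# Hypothetical toy data in CSV form: two feature columns and one output column.
# These values are invented for illustration only.
RAW = """feature_a,feature_b,label
1,0,yes
0,1,no
1,1,yes
"""


def split_features_and_target(text, target_column):
    """Parse CSV text and separate the input columns from the output column."""
    reader = csv.DictReader(io.StringIO(text))
    features, targets = [], []
    for row in reader:
        # Remove the output value from the row; what remains are the features.
        targets.append(row.pop(target_column))
        features.append(row)
    return features, targets


features, targets = split_features_and_target(RAW, "label")
print(features[0])
print(targets)
```

Values are kept as strings here; a real workflow would usually convert them to numeric or Boolean types before training a model.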
In this section, we will use the UCI Zoo dataset (https://archive.ics.uci.edu/ml/datasets/zoo). This dataset describes 101 different animals using the following 18 features:
| No. | Feature Name | Data Type |
| --- | --- | --- |
| 1 | animal name | Unique for each instance |
| 2 | hair | Boolean |
| 3 | feathers | Boolean |
| 4 | eggs | Boolean |
| 5 | milk | Boolean |
| 6 | airborne | Boolean |
| 7 | aquatic | Boolean |
| 8 | predator | Boolean |