Downloading and understanding the famous Iris data for unsupervised classification
In this recipe, we download and inspect the well-known Iris dataset in preparation for the upcoming streaming KMeans recipe, which lets you watch classification/clustering happen in real time.
The data is hosted in the UCI Machine Learning Repository, a great source of datasets for prototyping algorithms. You will notice that R bloggers tend to love this dataset.
How to do it...
- You can start by downloading the dataset using either of the following commands:
wget https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data
You can also use the following command:
curl https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data -o iris.data
You can also download the file directly from the following URL:
https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data
- Now we begin our first step of data exploration by examining how the data in iris.data is formatted:
head -5 iris.data
5.1,3.5,1.4,0.2,Iris-setosa
...
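The row shown by head suggests that each record is a comma-separated line with four numeric measurements followed by a species label. As a minimal sketch of how such a line could be parsed, the Python snippet below splits one row into typed fields. The `IrisRecord` type, the `parse_iris_line` helper, and the field names (sepal and petal length and width, following the dataset's usual description) are assumptions introduced for illustration; they are not part of this recipe.

```python
from typing import NamedTuple, Optional


class IrisRecord(NamedTuple):
    # Field names are illustrative assumptions based on the dataset's
    # usual column description; the file itself has no header row.
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float
    species: str


def parse_iris_line(line: str) -> Optional[IrisRecord]:
    """Parse one comma-separated line; return None for a blank line."""
    line = line.strip()
    if not line:
        return None  # skip blank lines in case the file contains any
    parts = line.split(",")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    *features, species = parts
    return IrisRecord(*map(float, features), species)


# The sample row is the one displayed by `head -5 iris.data` above.
record = parse_iris_line("5.1,3.5,1.4,0.2,Iris-setosa")
print(record)
```

To process the downloaded file, the same helper could be applied to each line of iris.data, keeping only the results that are not None.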