In this recipe, we download and inspect the wine quality dataset from the UCI Machine Learning Repository to prepare data for the streaming linear regression algorithm in Spark's MLlib.
Downloading wine quality data for streaming regression
How to do it...
You will need one of the following command-line tools, curl or wget, to retrieve the data:
- You can start by downloading the dataset using any of the following three commands. The first one is as follows:
wget http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-white.csv
You can also use the following command:
curl http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-white.csv -o winequality-white.csv...
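The wget and curl commands above can be wrapped in a small script that uses whichever tool is installed. The following is a minimal sketch, not part of the original recipe: the function names pick_downloader and download_with and the variables URL and OUT are hypothetical, and the extra flags (-f and -L for curl, -O for wget) are assumptions added so that HTTP errors fail loudly, redirects are followed if the server issues any, and both tools write to the same filename.

```shell
#!/bin/sh
# URL and output filename are taken from the commands in this recipe.
URL="http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-white.csv"
OUT="winequality-white.csv"

# Print the name of the first available download tool (hypothetical helper).
# Returns non-zero if neither curl nor wget is on the PATH.
pick_downloader() {
    if command -v curl >/dev/null 2>&1; then
        echo curl
    elif command -v wget >/dev/null 2>&1; then
        echo wget
    else
        echo "neither curl nor wget is installed" >&2
        return 1
    fi
}

# Download the dataset with the named tool (hypothetical helper).
# Nothing is downloaded until this function is called.
download_with() {
    case "$1" in
        curl) curl -fL "$URL" -o "$OUT" ;;   # -f: fail on HTTP errors, -L: follow redirects
        wget) wget -O "$OUT" "$URL" ;;       # -O: write to the given filename
        *)    echo "unsupported tool: $1" >&2
              return 2 ;;
    esac
}
```

With these definitions loaded, running download_with "$(pick_downloader)" should fetch the file, and head -n 3 winequality-white.csv gives a quick look at the first lines before the data is handed to Spark.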