Implementing outlier detection algorithms
The first thing you'll do is implement what you've just studied in Python.
Implementing outlier detection in Python
In this section, we will use the Wine Quality dataset created by Paulo Cortez et al. (https://archive.ics.uci.edu/ml/datasets/wine+quality) to show how to detect outliers in Python. Each observation in the dataset is a different red wine, described by variables that measure its organoleptic properties. The exception is the quality variable, which rates the product on a discrete scale from 1 to 10.
You'll find the code used in this section in the 01-detect-outliers-in-python.py file in the Chapter12\Python folder.
Once you have loaded the data from the winequality-red.csv file directly from the web into the df variable, let's start by examining the sulphates variable. Let's check if it contains any outliers by displaying its boxplot...
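The loading and boxplot steps just described could be sketched as follows. This is a minimal illustration, not the code from the chapter's script: the DATA_URL value is an assumed location of the file in the UCI repository, and the helper names load_red_wine, tukey_fences, find_outliers, and show_boxplot are hypothetical. The fences use the common 1.5 × IQR rule, which is also the default that matplotlib boxplots use to decide which points to draw individually.

```python
import pandas as pd

# Assumption: the CSV is served at this UCI path and is semicolon-separated.
DATA_URL = (
    "https://archive.ics.uci.edu/ml/machine-learning-databases/"
    "wine-quality/winequality-red.csv"
)


def load_red_wine(url=DATA_URL):
    """Load the red wine data into a DataFrame (needs network access)."""
    return pd.read_csv(url, sep=";")


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) limits beyond which a boxplot flags points."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values lying outside the Tukey fences."""
    lower, upper = tukey_fences(values, k)
    return values[(values < lower) | (values > upper)]


def show_boxplot(df, column="sulphates"):
    """Draw the boxplot of one column; outliers appear as separate points."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    ax = df.boxplot(column=column)
    ax.set_title(f"Boxplot of {column}")
    plt.show()


# Example usage (uncomment to run; downloads the data):
# df = load_red_wine()
# show_boxplot(df, "sulphates")
# print(find_outliers(df["sulphates"]).describe())
```

Separating the fence calculation from the plot makes it possible to list the flagged values numerically as well as see them in the figure.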