Implementing outlier detection algorithms
The first thing you’ll do is implement what you’ve just learned in Python.
Implementing outlier detection in Python
In this section, we will use the Wine Quality dataset created by Paulo Cortez et al. (https://archive.ics.uci.edu/ml/datasets/wine+quality) to show how to detect outliers in Python. Each observation in the dataset is a different red wine, described by the organoleptic properties that the variables measure. The exception is the quality variable, which grades the product on a discrete scale from 1 to 10.
You’ll find the code used in this section in the Python\01-detect-outliers-in-python.py file in the Chapter 16 folder.
Once you have loaded the data from the winequality-red.csv file directly from the web into the df variable, start by examining the sulphates variable. Let’s check if it contains any outliers...
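As a rough illustration of this first check, here is a minimal sketch that loads the file and flags candidate outliers in sulphates using the interquartile range (IQR) rule. The download URL, the helper names load_red_wine and iqr_outliers, and the choice of the IQR rule with k = 1.5 are all assumptions made for this example. The chapter's own script may load the data and test for outliers differently.

```python
import pandas as pd

# Assumed download location on the UCI repository; the chapter's script
# may point to a different URL.
WINE_URL = (
    "https://archive.ics.uci.edu/ml/machine-learning-databases/"
    "wine-quality/winequality-red.csv"
)


def load_red_wine(url: str = WINE_URL) -> pd.DataFrame:
    """Read the red wine CSV (hypothetical helper).

    The UCI file separates its columns with semicolons, not commas.
    """
    return pd.read_csv(url, sep=";")


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper: the IQR rule is one common choice, assumed here
    for illustration only.
    """
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Keep only the observations that fall outside the two fences.
    return values[(values < lower) | (values > upper)]
```

With the data loaded through df = load_red_wine(), calling iqr_outliers(df["sulphates"]) would return the sulphates values that fall outside the fences, with their original row index preserved so they can be traced back to the full observations.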