Visualizing outliers with boxplots
In this recipe, we will identify outliers using boxplots. A boxplot draws a box that encloses the observations between the 25th and 75th quantiles, that is, within the Inter-Quartile Range (IQR). The IQR is given by the following equation:
IQR = 75th quantile - 25th quantile
According to the IQR proximity rule, a value is an outlier if it falls outside the following boundaries:
Upper boundary = 75th quantile + (IQR * 1.5)
Lower boundary = 25th quantile - (IQR * 1.5)
In a boxplot, the whiskers extend to the most extreme observations that still lie within these boundaries. Thus, values beyond the whiskers are considered outliers and are drawn as individual markers.
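The proximity rule can be sketched in a few lines of pandas. This is a minimal illustration rather than part of the recipe: the function name iqr_boundaries, the fold parameter, and the toy data are hypothetical, and the multiplier of 1.5 is assumed as the conventional default.

```python
import pandas as pd


def iqr_boundaries(values: pd.Series, fold: float = 1.5):
    """Return the (lower, upper) limits of the IQR proximity rule.

    `fold` is the IQR multiplier; 1.5 is assumed here as the usual default.
    """
    q1 = values.quantile(0.25)  # 25th quantile
    q3 = values.quantile(0.75)  # 75th quantile
    iqr = q3 - q1               # inter-quartile range
    return q1 - fold * iqr, q3 + fold * iqr


# Toy data (hypothetical): nine ordinary values and one extreme value.
data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])

lower, upper = iqr_boundaries(data)

# Observations outside the boundaries are flagged as outliers.
outliers = data[(data < lower) | (data > upper)]
print(lower, upper)
print(outliers.tolist())
```

Values selected this way are those that fall outside the boundaries, which in a boxplot with the same multiplier would appear beyond the whiskers.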
How to do it...
We will create the boxplots using the Seaborn library. Let’s begin the recipe by importing the Python libraries and loading the dataset:
- Import the required Python libraries as follows:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="darkgrid")
from sklearn.datasets import fetch_california_housing
...