Performing univariate analysis using a boxplot
Just like the histogram, the boxplot (also known as the box-and-whisker plot) is a good candidate for visualizing a single continuous variable within our dataset. Boxplots give us a sense of the underlying distribution of that variable through five key metrics: the minimum, first quartile, median, third quartile, and maximum values.
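To make the five metrics concrete, here is a minimal sketch that computes them with NumPy. The function name `five_number_summary` and the sample values are hypothetical and used only for illustration. Quartiles are taken with NumPy's default linear interpolation, so other tools may report slightly different values on small samples.

```python
import numpy as np


def five_number_summary(values):
    """Return the minimum, 1st quartile, median, 3rd quartile, and maximum."""
    data = np.asarray(values, dtype=float)
    # The 0th, 25th, 50th, 75th, and 100th percentiles give the five metrics.
    minimum, q1, median, q3, maximum = np.percentile(data, [0, 25, 50, 75, 100])
    return {
        "min": float(minimum),
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "max": float(maximum),
    }


# Hypothetical sample values, chosen only for illustration
sample = [2, 4, 4, 5, 7, 8, 9, 11, 12]
summary = five_number_summary(sample)
print(summary)
```

The same values can be read off a boxplot once it is drawn; computing them directly is a quick way to check that the plot matches our expectations.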
Figure 4.3: Boxplot illustration
In the preceding figure, we can see the following components of a boxplot:
- The box: This spans the interquartile range, from the 1st quartile (25th percentile) to the 3rd quartile (75th percentile). The line within the box marks the median, which is also referred to as the 50th percentile.
- The whisker limits: The upper and lower whisker limits mark the range of values in our dataset that are not outliers. The position of the whiskers is calculated from the interquartile range (IQR), the 1st quartile, and the 3rd quartile. This is represented...
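As a rough illustration of how whisker limits can be derived from the quartiles and the IQR, the sketch below assumes the widely used 1.5 × IQR convention. The multiplier `k`, the function name `whisker_limits`, and the sample values are assumptions introduced for this example and may differ from the exact rule used elsewhere in this chapter. Each whisker is drawn at the most extreme data point that still falls inside the computed fences, and anything outside the fences is treated as an outlier.

```python
import numpy as np


def whisker_limits(values, k=1.5):
    """Return (lower whisker, upper whisker, outliers) using a k * IQR rule.

    k=1.5 is an assumed, commonly used multiplier.
    """
    data = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1

    # Fences: values beyond these bounds are flagged as outliers.
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr

    inside = data[(data >= lower_fence) & (data <= upper_fence)]
    outliers = data[(data < lower_fence) | (data > upper_fence)]

    # Whiskers end at the most extreme observations within the fences.
    return float(inside.min()), float(inside.max()), outliers.tolist()


# Hypothetical sample with one unusually large value
sample = [2, 4, 4, 5, 7, 8, 9, 11, 30]
low, high, outliers = whisker_limits(sample)
print(low, high, outliers)
```

With this sample, the quartiles are 4 and 9, so the IQR is 5 and the fences fall at -3.5 and 16.5. The value 30 lies outside the upper fence and would be drawn as a separate point beyond the whisker.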