Essential summary statistics
We saw the mean and variance as summary statistics in the Discrete distributions and Continuous distributions sections of Chapter 1, Data Characteristics. These measures are useful, but they have a drawback: they are very sensitive to outliers, in the sense that a single extreme observation can completely distort the picture.
In this section, we will discuss some exploratory analysis metrics that are intuitive and more robust than the mean and variance. We will learn about the following metrics:
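To make this sensitivity concrete, here is a minimal sketch in Python. The sample values are made up for illustration and do not come from any dataset in this book, and `summarize` is a hypothetical helper. It compares the mean and variance before and after one observation is replaced by an extreme value, and reports the median alongside them for contrast.

```python
import statistics

# Hypothetical sample, invented purely for illustration.
clean = [10, 11, 12, 12, 13, 14, 15]

# Same sample, with the largest value replaced by an extreme observation.
with_outlier = clean[:-1] + [1500]


def summarize(values):
    """Return the mean, sample variance, and median of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "median": statistics.median(values),
    }


summary_clean = summarize(clean)
summary_outlier = summarize(with_outlier)

for label, summary in (("clean", summary_clean), ("with outlier", summary_outlier)):
    print(label, summary)
```

In this sketch, a single changed value inflates the mean and the variance considerably, while the median stays where it was.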
Percentiles
Quantiles
Median
Hinges
Interquartile range
Percentiles, quantiles, and median
For a given dataset and a number 0 < k < 1, the 100k-th percentile divides the dataset into two partitions, with 100k% of the values below it and 100(1-k)% of the values above it. For example, with k = 0.25, the 25th percentile has 25% of the values below it and 75% of the values above it. The fraction k is referred to as a quantile. In statistics, quantiles are used more often than percentiles. The difference is that the quantiles...