Learning about variance, standard deviation, quartiles, percentiles, and skewness
In the previous section, we studied the mean, median, and mode. All three describe, to some degree, where the center of a dataset lies. In this section, we will learn how to describe the spread of the data around that center.
Variance
With the same notation, variance for the population is defined as follows:
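For reference, the standard textbook form of the population variance is written out here. The symbol names are an assumption, since the earlier notation is not repeated in this section: $x_1, \dots, x_N$ are the $N$ elements of the population and $\mu$ is their mean.

$$
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2, \qquad \mu = \frac{1}{N} \sum_{i=1}^{N} x_i
$$

Each term $(x_i - \mu)^2$ is the squared distance of one element from the mean, so the variance is the average squared distance.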
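Intuitively, the further away the elements are from the mean, the larger the variance. To illustrate, I plotted the histograms of two datasets with different variances. The samples are drawn from normal distributions with standard deviations of 0.3 and 0.1, and the variance is the square of the standard deviation, so the one in the left subplot has a variance of 0.09 and the one in the right subplot has a variance of 0.01, 9 times smaller.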
The following code snippet generates samples from the two distributions and plots them:
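import random
import matplotlib.pyplot as plt
# 10,000 samples each from normal distributions with mean 0.5 and standard deviations 0.3 and 0.1
r1 = [random.normalvariate(0.5, 0.3) for _ in range(10000)]
r2 = [random.normalvariate(0.5, 0.1) for _ in range(10000)]
# One row with two side-by-side subplots
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
axes[0].hist(r1,bins...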
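To connect the histograms back to the definition, the variance of the two samples can also be computed directly. The following is a minimal sketch, not part of the original snippet: the helper name `population_variance` and the fixed seed are assumptions introduced for illustration, and the standard library's `statistics.pvariance` is used only as a cross-check.

```python
import random
import statistics

# Fixed seed so that repeated runs give the same samples
# (an assumption for reproducibility, not in the original snippet).
random.seed(0)


def population_variance(values):
    """Average squared distance of the values from their mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


# Same two distributions as in the plotting snippet:
# mean 0.5, standard deviations 0.3 and 0.1.
r1 = [random.normalvariate(0.5, 0.3) for _ in range(10000)]
r2 = [random.normalvariate(0.5, 0.1) for _ in range(10000)]

var1 = population_variance(r1)
var2 = population_variance(r2)

# Cross-check the hand-written helper against the standard library.
check1 = statistics.pvariance(r1)
check2 = statistics.pvariance(r2)

print(f"variance of r1: {var1:.4f} (statistics.pvariance: {check1:.4f})")
print(f"variance of r2: {var2:.4f} (statistics.pvariance: {check2:.4f})")
print(f"ratio var1 / var2: {var1 / var2:.2f}")
```

Because the samples are random, the printed values will not match the theoretical variances exactly, but with 10,000 samples they should land close to 0.3² = 0.09 and 0.1² = 0.01.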