Outlier analysis of the average De Bilt temperature
Outliers are values in a dataset that are considered extreme. They can be caused by measurement or other types of errors, or by a genuine natural phenomenon. There are several definitions of outliers; in this example, we use the definition of mild outliers, which depends on the positions of the first and third quartiles. A quarter of the items in the dataset are smaller than the first quartile, and three quarters are smaller than the third quartile. The difference between these two quartiles is called the inter-quartile range. It is a robust measure of dispersion that plays a role similar to that of the standard deviation. Mild outliers are values that lie more than 1.5 inter-quartile ranges below the first quartile or above the third quartile. We can study the temperature outliers as follows:
Find the first quartile with a function from SciPy:
from scipy.stats import scoreatpercentile
q1 = scoreatpercentile(temp, 25)
Find the third quartile:
q3 = scoreatpercentile(temp, 75)
Find the indices...
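The mild-outlier rule described in the introduction can be sketched end to end on a small synthetic array. This is only an illustration under stated assumptions: the helper name mild_outlier_indices and the sample values are hypothetical and are not the De Bilt data, and NumPy's percentile function is swapped in for SciPy's scoreatpercentile so that the block depends on NumPy alone.

```python
import numpy as np


def mild_outlier_indices(values):
    """Return indices of values lying more than 1.5 IQR outside the quartiles.

    Hypothetical helper for illustration; np.percentile stands in for
    scipy.stats.scoreatpercentile here.
    """
    values = np.asarray(values, dtype=float)
    q1 = np.percentile(values, 25)   # first quartile
    q3 = np.percentile(values, 75)   # third quartile
    iqr = q3 - q1                    # inter-quartile range
    lower = q1 - 1.5 * iqr           # lower fence
    upper = q3 + 1.5 * iqr           # upper fence
    # np.where returns a tuple of index arrays; take the one for axis 0.
    return np.where((values < lower) | (values > upper))[0]


# Synthetic sample (an assumption, not real temperature measurements).
sample = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 40.0, -20.0])
outlier_idx = mild_outlier_indices(sample)
print(outlier_idx, sample[outlier_idx])
```

For this sample the quartiles are 11 and 15, so the inter-quartile range is 4 and the fences sit at 5 and 21; only the values 40 and -20 fall outside them. Applying the same function to the temp array would give the indices of the temperature outliers under this definition.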