Detecting outliers using the Tukey method
This recipe will extend on the previous recipe, Detecting outliers using visualizations. In Figure 8.5, the box plot showed the quartiles with whiskers extending to the upper and lower fences. These boundaries or fences were calculated using the Tukey method.
Let's expand on Figure 8.5 with additional information of other components:
Visualizations are great to give you a high-level perspective on the data you are dealing with, such as the overall distribution and potential outliers. Ultimately you want to identify these outliers programmatically so you can isolate these data points for further investigation and analysis. This recipe will teach you how to calculate IQR and define points that fall outside the lower and upper Tukey fences.
How to do it...
Most statistical methods allow you to spot extreme values beyond a certain threshold. For example, this could be the mean...