We'll look at a common statistical decision, which is described in detail at http://www.itl.nist.gov/div898/handbook/prc/section4/prc45.htm.
This is a chi-squared decision on whether or not the data is distributed randomly. To make this decision, we'll need to compute an expected distribution and compare the observed data to our expectations. A significant difference means there's something that needs further investigation. An insignificant difference means we cannot reject the null hypothesis: there's nothing more to study, and the differences are simply random variation.
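The comparison of observed and expected values can be sketched in a few lines. This is only a minimal illustration, and it rests on assumptions: that the data is a two-way table of counts, and that the expected value for each cell is derived from the row and column totals. The function names `expected_counts` and `chi_squared` and the sample counts are hypothetical and chosen only for this example.

```python
from fractions import Fraction

def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [
        [Fraction(r * c, total) for c in col_totals]
        for r in row_totals
    ]

def chi_squared(observed, expected):
    """Sum of (observed - expected)**2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Made-up counts, used only to exercise the functions.
observed = [
    [10, 20],
    [30, 40],
]
expected = expected_counts(observed)
chi2 = chi_squared(observed, expected)

# Degrees of freedom for a table of r rows and c columns: (r - 1) * (c - 1).
dof = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"chi-squared = {float(chi2):.4f} with {dof} degree(s) of freedom")
```

The resulting statistic is compared against a critical value from a chi-squared table for the chosen significance level and degrees of freedom. A statistic above the critical value counts as a significant difference. Using `Fraction` keeps the arithmetic exact until the final conversion to `float`.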
We'll show how we can process the data with Python. We'll start with some backstory: details that are not part of the case study, but often feature in an Exploratory Data Analysis (EDA) application. We need to gather the...