Both t-tests and ANOVA have a major drawback: the population that is being sampled must follow a normal distribution. In many applications, this is not too restrictive because many real-world population values follow a normal distribution, or some rules, such as the central limit theorem, allow us to analyze some related data. However, it is simply not true that all possible population values follow a normal distribution in any reasonable way. For these (thankfully, rare) cases, we need some alternative test statistics to use as replacements for t-tests and ANOVA.
In this recipe, we will use a Wilcoxon rank-sum test and the Kruskal-Wallis test to test for differences between two (or more, in the latter case) populations.
Getting ready
For this recipe, we will need the pandas package imported as pd, the SciPy stats module, and a default random number generator instance created using the following commands:
...