Statistical Testing for Data Science
In the previous chapter, we laid the groundwork for understanding probability and statistics. Now we will use that understanding to perform statistical hypothesis tests. We will cover the following statistical tests in this chapter:
- The t-test, z-test, and bootstrapping for comparing the means of data (for example, A/B testing)
- The ANOVA test for comparing the means of multiple groups
- Testing whether data comes from a particular distribution (for example, a Gaussian distribution)
- Testing for outliers with the scikit-posthocs package
- Tests for relationships between variables (Pearson and chi-squared tests)
These are only a few of the many statistical tests available, but they are ones we can use for practical tasks. Some of these tests are also used in other data science methods, such as linear and logistic regression, which we will cover in Chapter 11, Machine Learning for...