In the previous chapter, you learned about descriptive statistics methods for getting information about the distribution of discrete and continuous variables. In a data science project, the typical next step is to check for associations between pairs of variables.
For any pair of variables, there are three possibilities:
- Both variables are discrete
- Both variables are continuous
- There is one discrete and one continuous variable
In addition to methods that involve exactly two variables, this section also introduces linear regression, one of the most important statistical methods, in which you model a single response (or dependent) variable with a regression formula that includes one or more predictor (or independent) variables.
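To make the idea of a regression formula concrete, here is a minimal sketch of the simplest case: one predictor, fitted by ordinary least squares. The data and the `fit_line` helper are hypothetical and exist only for illustration; the toy values follow y = 2x + 1 exactly, so the result is easy to verify by hand.

```python
# Hypothetical toy data: y is exactly 2 * x + 1, so the fit is easy to check.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # predictor (independent) variable
ys = [3.0, 5.0, 7.0, 9.0, 11.0]  # response (dependent) variable


def fit_line(x, y):
    """Ordinary least squares for the formula y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```

With more than one predictor, the formula gains one coefficient per predictor, and in practice you would typically rely on a statistics library rather than hand-written arithmetic.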
Altogether, you will learn about the following in this section:
- Chi-squared test of the independence...