Correlation between categorical and numeric variables
As we have seen, when both variables are numeric, you can get a sense of the association between them by looking at their scatterplot. This strategy cannot be used when one or both variables are categorical. A variable is categorical (or qualitative) when it takes on values that are names or labels, such as smartphone operating systems (iOS, Android, Linux, and so on). When those labels have no natural order, the variable is also called nominal.
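To make the distinction concrete, here is a minimal sketch in plain Python. The records are made up for illustration (the `plan` variable and all values are hypothetical, not taken from any dataset used in this chapter). Because labels have no numeric position on an axis, the natural summary of two categorical variables is a table of how often each pair of labels occurs together, rather than a cloud of points.

```python
from collections import Counter

# Hypothetical toy records: (operating system, plan type).
# Both variables are categorical: their values are labels, not numbers.
records = [
    ("iOS", "prepaid"),
    ("Android", "contract"),
    ("Android", "prepaid"),
    ("iOS", "contract"),
    ("Android", "contract"),
    ("Linux", "prepaid"),
]

# Count how often each pair of labels occurs together.
pair_counts = Counter(records)

# Collect the distinct labels (levels) of each variable.
os_levels = sorted({os_name for os_name, _ in records})
plan_levels = sorted({plan for _, plan in records})

# Arrange the counts as a two-way table: rows = OS, columns = plan.
# Counter returns 0 for pairs that never occur.
table = {
    os_name: {plan: pair_counts[(os_name, plan)] for plan in plan_levels}
    for os_name in os_levels
}

for os_name in os_levels:
    print(os_name, table[os_name])
```

A two-way table of counts like this one is the kind of summary that graphical tools for categorical data are typically built on.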
Let's see how to analyze the case of two categorical variables.
Considering both variables categorical
So, is there a graphical representation that helps us understand whether two categorical variables are significantly associated? Yes, there is: the mosaic plot. In this section, we will use the Titanic disaster dataset as our reference dataset. To get an idea of what a mosaic plot looks like, let's consider the variables Survived
(which takes values...