Correlation between non-numeric and numeric variables
To represent graphically the association between a numeric variable and a categorical (non-numeric) variable, use a boxplot or a violin plot. If you have ever had to show the distribution of a variable while highlighting its key statistics, you should already be familiar with the boxplot:
Figure 15.31: Graphical explanation of a boxplot
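To make the key statistics concrete, here is a minimal sketch of how they can be computed with NumPy. It assumes the common Tukey convention, in which the whiskers extend to the most extreme points within 1.5 times the interquartile range of the box; the figure may follow a different convention. The function name `boxplot_stats` and the sample values are hypothetical, chosen only for illustration.

```python
import numpy as np

def boxplot_stats(values, whisker=1.5):
    """Return the statistics drawn by a Tukey-style boxplot (assumed convention)."""
    x = np.sort(np.asarray(values, dtype=float))
    # The box spans the first to the third quartile; the line inside is the median.
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1
    # Fences: points beyond them are drawn individually as outliers.
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    inside = x[(x >= lower_fence) & (x <= upper_fence)]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers stop at the most extreme observations inside the fences.
        "whisker_low": inside.min(),
        "whisker_high": inside.max(),
        "outliers": x[(x < lower_fence) | (x > upper_fence)].tolist(),
    }

# Made-up sample with one obvious outlier (50).
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
print(stats)
```

With this sample, the value 50 lies beyond the upper fence and is reported as an outlier, while the upper whisker stops at 9.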
A violin plot combines a distribution plot (a histogram or, more commonly, a smoothed density curve) with a boxplot of the same variable:
Figure 15.32: Graphical explanation of a violin plot
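The combination can be sketched with Matplotlib by drawing a density-based violin and overlaying a narrow boxplot at the same position. This is only an illustration under stated assumptions: the data is synthetic (a made-up bimodal sample, not taken from any dataset in this chapter), and the styling choices are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to display a window
import matplotlib.pyplot as plt
import numpy as np

# Synthetic bimodal sample (assumption, for illustration only).
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(25, 4, 300), rng.normal(50, 6, 200)])

fig, ax = plt.subplots(figsize=(4, 5))

# The violin body is a kernel density estimate mirrored around the axis.
parts = ax.violinplot([values], showextrema=False)

# A narrow boxplot drawn at the same position (1 by default) adds the
# median, the quartiles, and the whiskers on top of the density shape.
box = ax.boxplot([values], widths=0.1, showfliers=False)

ax.set_xticks([1])
ax.set_xticklabels(["synthetic variable"])
ax.set_ylabel("value")
ax.set_title("Violin plot = density + boxplot")
```

Call `fig.savefig(...)` or `plt.show()` to see the result. In this sample the density shape reveals two modes that the boxplot alone would hide, which is the main reason to prefer a violin plot when the distribution may not be unimodal.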
See the References section for more details about boxplots and violin plots.
If you need to relate a numeric variable to a categorical variable, you can create a violin plot for each category (level) of the categorical variable. Returning to the example of the Titanic disaster dataset, given the Pclass (categorical) and Age (numeric...