Chapter 6: Creating Statistical Plots
Creating statistical plots is a standard data analysis task, especially during data exploration. It is an essential part of data visualization, helping us build meaningful visual representations of our data. This matters because, in many cases, we learn more from our data by looking at it than by analyzing its summary statistics alone. Anscombe’s quartet is a classic example: its four datasets have nearly identical descriptive statistics, but plotting them reveals very different distributions. Figure 6.1 shows these datasets, each with a Pearson correlation coefficient, r, of 0.82 but a different joint distribution.
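To make the point about summary statistics concrete, the following minimal sketch computes the means and the Pearson correlation coefficient for each of the four datasets using only the Python standard library. The data values are the ones published in Anscombe's 1973 paper; the names quartet, pearson_r, and summary are illustrative choices made for this sketch, not something defined elsewhere in the chapter.

```python
from math import sqrt
from statistics import mean

# Anscombe's quartet, with values as published in Anscombe (1973).
# Datasets I-III share the same x values; dataset IV has its own.
x_shared = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x_shared,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x_shared,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x_shared,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Per dataset: (mean of x, mean of y, Pearson r), each rounded to two decimals.
summary = {
    name: (round(mean(xs), 2), round(mean(ys), 2), round(pearson_r(xs, ys), 2))
    for name, (xs, ys) in quartet.items()
}

for name, (mean_x, mean_y, r) in summary.items():
    print(f"Dataset {name}: mean x = {mean_x}, mean y = {mean_y}, r = {r}")
```

Rounded to two decimals, the four rows come out the same, so nothing in this table hints at how differently the points are arranged. Only a plot of each dataset reveals that.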
Also, we can rely on statistical plots to effectively communicate our findings to the world – a common data visualization task. Some visualizations, such as histograms, are easily understood by people from many backgrounds. Others, such as boxplots, are better suited for a statistically versed audience: