In this chapter, we introduce the main techniques of exploratory data analysis (EDA). We start by explaining the general goal of this stage of the predictive analytics process and how we accomplish it.
A natural and common way to classify EDA techniques is by the number of variables involved in the analysis: one, two, or more than two. Hence, this chapter has sections on univariate, bivariate, and multivariate analysis. Within univariate and bivariate analysis, the appropriate numerical and graphical techniques depend on the type of feature (numerical or categorical) we are working with.
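To make this classification concrete, here is a minimal sketch that uses only the Python standard library. The toy records and the helper names `univariate_summary` and `bivariate_summary` are hypothetical, invented for illustration; they are not the diamond prices dataset and not the tooling used later in the chapter. The sketch shows how the choice of summary depends on the number of features and on their type.

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Hypothetical toy records for illustration only (NOT the diamond prices dataset).
rows = [
    {"carat": 0.3, "cut": "Ideal", "price": 500},
    {"carat": 0.5, "cut": "Good", "price": 1200},
    {"carat": 0.7, "cut": "Ideal", "price": 2500},
    {"carat": 1.0, "cut": "Good", "price": 4000},
]


def univariate_summary(values):
    """Summarize one feature; the technique depends on the feature type."""
    if all(isinstance(v, (int, float)) for v in values):
        # Numerical feature: location and range statistics.
        return {
            "mean": mean(values),
            "median": median(values),
            "min": min(values),
            "max": max(values),
        }
    # Categorical feature: frequency of each level.
    return dict(Counter(values))


def bivariate_summary(categories, numbers):
    """Mean of a numerical feature within each level of a categorical feature."""
    groups = defaultdict(list)
    for level, value in zip(categories, numbers):
        groups[level].append(value)
    return {level: mean(vals) for level, vals in groups.items()}


prices = [r["price"] for r in rows]
cuts = [r["cut"] for r in rows]

price_summary = univariate_summary(prices)     # one numerical feature
cut_counts = univariate_summary(cuts)          # one categorical feature
price_by_cut = bivariate_summary(cuts, prices)  # categorical vs. numerical

print(price_summary)
print(cut_counts)
print(price_by_cut)
```

In practice a data analysis library would replace these helpers; the point is only that the same question ("what does this feature look like?") calls for different summaries depending on how many features are involved and what type they are.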
Throughout, we use the diamond prices dataset to illustrate the main techniques of univariate and bivariate EDA. We will provide examples of how to produce the main visualizations used in analytics...