Summary
In this chapter, we introduced basic techniques for conducting exploratory data analysis (EDA). We started with the common approaches to analyzing and summarizing categorical data, including frequency counts and bar charts. We then introduced marginal distributions and faceted bar charts for working with multiple categorical variables.
Next, we switched to analyzing numerical variables and covered measures of central tendency and variation: the mean and variance, which are sensitive to outliers, and their robust counterparts, the median and interquartile range (IQR). Several types of charts are available for visualizing a numerical variable, including histograms, density plots, and box plots, all of which can be combined with a categorical variable to compare groups.
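As a quick reminder of why the distinction between sensitive and robust measures matters, the following is a minimal sketch in Python using only the standard library. The data values and the helper name `summarize` are made up for illustration and do not come from the chapter. The sketch computes the same four summaries on a small sample before and after a single extreme value is appended.

```python
import statistics


def summarize(values):
    """Return sensitive (mean, variance) and robust (median, IQR) summaries."""
    # quantiles(n=4) returns the three quartile cut points: Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),  # sample variance
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }


# Hypothetical sample, and the same sample with one extreme value added.
clean = [10, 12, 13, 15, 16, 18, 20]
with_outlier = clean + [200]

for label, data in [("clean", clean), ("with outlier", with_outlier)]:
    print(label, summarize(data))
```

Comparing the two printed summaries, the mean and variance should shift substantially once the extreme value is added, while the median and IQR move only slightly.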
Finally, we went through a case study using stock price data. We downloaded real data from Yahoo! Finance and applied the EDA techniques covered in the chapter to analyze it, then created a correlation plot to show the strength of covariation between each pair of variables...