Data visualizations play a central role in data science. As part of Exploratory Data Analysis (EDA), they help you familiarize yourself with the data, examine the distributions of variables, identify outliers, and guide data cleaning and analysis. They are also used to communicate results to a variety of audiences, from other data scientists to customers.
EDA is the general name for the process of using numerical summaries, plots, and aggregation methods to explore a dataset and become familiar with its contents. It will almost certainly involve examining the distributions of the variables in the dataset, checking for missing values, deciding whether there are any outliers or errors, and generally getting a feel for what your data contains.
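As a rough illustration of these steps, here is a minimal sketch of a first EDA pass in base R. The choice of the built-in `airquality` dataset and of the `Ozone` column is an assumption made for this example only; any data frame could be substituted.

```r
# airquality ships with base R (datasets package); it is used here
# only as a stand-in for whatever dataset you are exploring.
data(airquality)

# Numerical summary of every column (min, quartiles, mean, max, NA count)
summary(airquality)

# Missingness: number of NA values in each column
na_counts <- colSums(is.na(airquality))
na_counts

# Distribution of a single variable
hist(airquality$Ozone,
     main = "Distribution of Ozone",
     xlab = "Ozone (ppb)")

# Candidate outliers: points beyond the boxplot whiskers (1.5 * IQR rule)
ozone_outliers <- boxplot.stats(airquality$Ozone)$out
ozone_outliers
```

Values flagged by the whisker rule are only candidates for closer inspection. Whether they are genuine errors depends on what you know about how the data were collected.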
In this chapter, you'll learn about base plots and ggplot2, and will be briefly introduced...