Visualizing data
In the previous section, we introduced a number of functions for importing data, which is the first step in most data analysis. It is usually good practice to look at the data before feeding it into a model, so that is what we will do next. The reason is simple: different models have different strengths, and no model is the best choice in every case, because each rests on its own set of assumptions. Applying a model arbitrarily, without checking the data against its assumptions, usually leads to misleading conclusions.
A first way to choose a model and perform such checks is simply to examine the data visually, looking at its boundaries and patterns. In other words, we need to visualize the data first. In this section, you will learn the basic graphics functions for producing simple charts that visualize a given dataset.
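As a rough illustration of what checking boundaries and patterns can look like, the following sketch uses only base R and simulated numbers rather than any dataset from this book. The variables x and y and their values are made up for the example.

```r
# Simulated data standing in for a real dataset (hypothetical values).
set.seed(42)
x <- rnorm(200, mean = 10, sd = 2)
y <- 3 + 0.5 * x + rnorm(200, sd = 1)

# Boundaries: minimum, quartiles, mean, and maximum of x; extremes of y.
summary(x)
range(y)

# Patterns: the distribution of one variable,
# and the relationship between two variables.
hist(x, main = "Distribution of x")
plot(x, y, main = "y against x")
```

The numeric summaries show where the values start and stop, while the histogram and scatter plot make it easier to spot skewness, outliers, or a roughly linear relationship before any model is fitted.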
We will use the datasets in the nycflights13 and babynames packages. If you don't have them installed, run the following code:
install.packages(c("nycflights13...