Summary
We began Chapter 2 by learning how to acquire data, either from R's built-in datasets or by loading it from the popular CSV format, and how to customize the dataset during import, for example by choosing the number of rows to load. We then briefly explained the difference between a data frame and a tibble. Both serve the same purpose and behave in much the same way, but tibbles bring some enhancements and work better with the tidyverse collection of packages in R.
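As a quick reminder of those ideas, here is a minimal sketch, not the exact code from the chapter. It assumes the built-in `mtcars` dataset, a temporary CSV file created only for illustration, and an arbitrary limit of 10 rows. The tibble conversion runs only if the tibble package is installed.

```r
# Write a built-in dataset to a temporary CSV so the example is self-contained
# (the file path and the choice of mtcars are illustrative assumptions)
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)

# Load only the first 10 rows at import time
first_rows <- read.csv(csv_path, nrows = 10)
print(dim(first_rows))

# Convert the data frame to a tibble, if the tibble package is available
if (requireNamespace("tibble", quietly = TRUE)) {
  first_rows_tbl <- tibble::as_tibble(first_rows)
  print(class(first_rows_tbl))
}
```

The `nrows` argument limits how much of the file is read, which helps when you only need a preview of a large file.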
Next, we moved on to more sophisticated ways of bringing data into your R session: web scraping and retrieving datasets from a public API. Because many businesses and salespeople work with Microsoft Excel, it is important to know how to save an Excel file as a CSV, which this chapter also covered.
To close the chapter, we went over the basic steps of exploratory data analysis (EDA): loading and viewing data, calculating descriptive statistics, handling missing values and outliers...
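The steps named above can be sketched in a few lines of base R. This is an illustrative outline under stated assumptions rather than the chapter's own code: it uses the built-in `airquality` dataset, fills missing values with the median, and flags outliers with the 1.5 × IQR rule. The imputation method and the outlier rule are both choices made here for the example.

```r
# Load and view the data (airquality is a built-in dataset with missing values)
data <- airquality
print(head(data))

# Descriptive statistics for every column
print(summary(data))

# Count missing values per column
missing_per_column <- colSums(is.na(data))
print(missing_per_column)

# Handle missing values: median imputation for Ozone (an assumed strategy)
ozone_filled <- data$Ozone
ozone_filled[is.na(ozone_filled)] <- median(data$Ozone, na.rm = TRUE)

# Flag outliers with the 1.5 * IQR rule (an assumed rule of thumb)
quartiles <- unname(quantile(ozone_filled, c(0.25, 0.75)))
iqr <- quartiles[2] - quartiles[1]
is_outlier <- ozone_filled < quartiles[1] - 1.5 * iqr |
  ozone_filled > quartiles[2] + 1.5 * iqr
print(sum(is_outlier))
```

Whether to impute, drop, or keep such values depends on the dataset and the question being asked, so treat the choices here as placeholders.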