In this section, we will look at some basic characteristics of data and at how to read, preprocess, and clean datasets. We covered some of these aspects in Chapter 3, Data Wrangling with R; this section provides a high-level view of the relevant topics.
Preparing data for analysis
Data categories
Data can be broadly categorized into two types:
- Discrete (or categorical): A discrete variable takes values that denote categories. Examples include fruits, colors, school grades, countries, and genders.
- Continuous (or quantitative): A continuous variable holds numerical quantities on which you can perform arithmetic operations. This includes variables such...
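The distinction between the two categories can be illustrated with a minimal R sketch. The data frame `fruit_data`, its column names, and its values are hypothetical and serve only as an illustration; the sketch assumes that a discrete variable is stored as a factor and a continuous one as a numeric vector, which is a common but not the only convention in R.

```r
# Hypothetical example data; the names and values are illustrative only.
fruit_data <- data.frame(
  fruit  = factor(c("apple", "banana", "apple", "cherry")),  # discrete (categorical)
  weight = c(150.5, 120.0, 162.3, 8.2)                       # continuous (quantitative)
)

# Inspect how R stores each column: a factor and a numeric vector.
sapply(fruit_data, class)

# A factor holds a fixed set of category labels (its levels).
levels(fruit_data$fruit)

# Counting occurrences is a natural summary for a discrete variable.
table(fruit_data$fruit)

# Arithmetic, such as taking the mean, is meaningful for a continuous variable.
mean(fruit_data$weight)
```

Calling `mean()` on the `fruit` column would instead return `NA` with a warning, which reflects that arithmetic is not defined for category labels.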