Like any professional field, data analysis is full of buzzwords, and newcomers often find the lingo hard to follow; the topic of this chapter is no exception. When we perform data wrangling, we take our input data from its original state to a format in which we can perform meaningful analysis on it. Data manipulation is another name for this process. There is no set list or order of operations; the only goal is that the data is more useful to us after wrangling than it was when we started.
In practice, there are three common tasks involved in the data wrangling process:
- Data cleaning
- Data transformation
- Data enrichment
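
To make the three tasks concrete, here is a minimal sketch in pandas. The data, the column names, and the choice of which operation counts as which task are assumptions made for illustration; in real work the boundaries between the tasks are often blurrier than this.

```python
import pandas as pd

# Hypothetical raw input: temperature readings in long format,
# with inconsistent column casing, string values, and a duplicate row.
raw = pd.DataFrame({
    "Date": ["2018-10-01", "2018-10-01", "2018-10-02", "2018-10-02", "2018-10-02"],
    "Type": ["TMAX", "TMIN", "TMAX", "TMIN", "TMIN"],
    "Value": ["21.1", "8.9", "23.9", "13.9", "13.9"],
})

# Data cleaning: normalize column names, drop the duplicate row,
# and convert strings to proper datetime and float types.
clean = (
    raw.rename(columns=str.lower)
    .drop_duplicates()
    .assign(
        date=lambda df: pd.to_datetime(df["date"]),
        value=lambda df: df["value"].astype(float),
    )
)

# Data transformation: reshape from long format (one row per reading)
# to wide format (one row per date, one column per reading type).
wide = clean.pivot(index="date", columns="type", values="value")

# Data enrichment: add a new column derived from the existing ones.
enriched = wide.assign(temp_range=lambda df: df["TMAX"] - df["TMIN"])

print(enriched)
```

Here the steps happen to run in the order cleaning, transformation, enrichment, but only because this toy dataset allows it.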
These tasks have no inherent order, and we will most likely perform each of them many times throughout our data wrangling. This idea brings up an interesting conundrum...