Binding
Binding data is the last of the main transformations listed at the beginning of this chapter. It is common to find yourself with two or more datasets that you need to put together for analysis. There are two ways to do that, binding by rows or binding by columns, as follows:
Figure 7.16 – Types of data binding
Assume that our Census Income dataset has only 10 rows. After some research, the internal team found another 10 observations and gave them to the data science team. Since the 10 new observations have the same variables as the original dataset, they can be appended to it. Let’s see that in action:
# Creating datasets A and B
A <- df[1:10, ]
B <- df[11:20, ]
# Append / bind rows
AB <- rbind(A, B)
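The requirement that both datasets share the same variables can be checked with a small, self-contained sketch. The data frames A, B, and C below hold made-up values chosen only for illustration; they are not taken from the Census Income data.

```r
# Toy data frames with invented values (hypothetical, for illustration only)
A <- data.frame(age = c(39, 50), workclass = c("State-gov", "Private"))
# Same variables as A, but listed in a different order
B <- data.frame(workclass = c("Private", "Self-emp"), age = c(28, 45))

# For data frames, rbind() matches columns by name, so the order in B is fine
AB <- rbind(A, B)
dim(AB)

# C has a variable called "job" instead of "workclass", so it cannot be appended
C <- data.frame(age = 31, job = "Private")
result <- tryCatch(rbind(A, C), error = function(e) "error")
result
```

In this sketch, the rows of B are appended under the columns of A, while the attempt to append C is caught as an error because its variable names differ.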
To illustrate the other scenario, that is, binding columns, imagine that the original data has only three variables: age, workclass, and fnlwgt. Then, the team was able to collect more information about the taxpayers, adding education grade and occupation....
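A minimal sketch of the column-binding scenario might look like the following. The object names base_vars, new_vars, and full, the column names education_grade and occupation, and all values are assumptions made up for this example rather than taken from the actual dataset.

```r
# Original data with three variables (invented values, for illustration only)
base_vars <- data.frame(
  age = c(39, 50, 38),
  workclass = c("State-gov", "Self-emp", "Private"),
  fnlwgt = c(77516, 83311, 215646)
)

# Newly collected information about the same three taxpayers, in the same order
new_vars <- data.frame(
  education_grade = c(13, 13, 9),
  occupation = c("Adm-clerical", "Exec-managerial", "Handlers-cleaners")
)

# cbind() pairs rows by position, so both pieces must have the same number of rows
stopifnot(nrow(base_vars) == nrow(new_vars))

# Bind columns
full <- cbind(base_vars, new_vars)
dim(full)
```

Because cbind() aligns rows purely by position, this approach only makes sense when both pieces describe the same observations in the same order.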