Merging large datasets with a data.table
In previous recipes, we demonstrated how to manipulate and aggregate data with a data.table. Beyond manipulating a single table, we often need to bring in additional features or correlate data from other data sources, which we can do by joining two or more tables into one. In this recipe, we introduce some methods that we can use to merge two data.table objects.
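Before turning to the purchase data, it may help to see what a merge looks like on a small example. The following is a minimal sketch, assuming the data.table package is installed; the tables buys and views and their columns are hypothetical and are not part of this recipe's data. It uses merge(), which performs an inner join on data.table objects by default and a left join when all.x = TRUE is set:

```r
library(data.table)

# Hypothetical toy tables; names and values are for illustration only
buys  <- data.table(Product = c("A", "B", "C"), Buy  = c(3L, 1L, 2L))
views <- data.table(Product = c("A", "B", "D"), View = c(10L, 4L, 7L))

# Inner join: keep only the products that appear in both tables
inner <- merge(buys, views, by = "Product")

# Left join: keep every row of buys; View is NA where no match exists
left <- merge(buys, views, by = "Product", all.x = TRUE)

print(inner)
print(left)
```

Here, product C appears only in buys, so it is dropped from the inner join and kept with a missing View value in the left join.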
Getting ready
Ensure that you have completed the Enhancing a data.frame with a data.table recipe to load purchase_view.tab and purchase_order.tab as both data.frame and data.table into your R environment.
How to do it…
Perform the following steps to merge two data.table objects:
First, we generate a product.dt data table by calculating the number of purchased items:
> product.dt <- order.dt[,.(Buy = length(Action)),by=Product]
> head(product.dt[order(-Buy)])
       Product Buy
1: P0005772981 821
2: P0024239865 729
3: P0004607050 584
4: P0003425855 552
5: P0014252066 438
...
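The per-product count in this step can also be written with data.table's special symbol .N, which holds the number of rows in each group. The following is a small sketch on a hypothetical stand-in table, toy.dt, which assumes one row per ordered item as in order.dt; it is not the recipe's data and only illustrates that the two forms give the same counts:

```r
library(data.table)

# Hypothetical stand-in for order.dt with only the two columns used here
toy.dt <- data.table(
  Product = c("P1", "P2", "P1", "P3", "P1", "P2"),
  Action  = "order"  # recycled to every row
)

# Count rows per product by taking the length of a column, as in the step above
by.length <- toy.dt[, .(Buy = length(Action)), by = Product]

# Equivalent count using .N, the number of rows in the current group
by.N <- toy.dt[, .(Buy = .N), by = Product]

# Sort by descending count, mirroring product.dt[order(-Buy)]
print(by.N[order(-Buy)])
```

Using .N avoids naming a particular column, which can be convenient when the column being counted carries no meaning of its own.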