Performing join operations on the DynamoDB data using AWS EMR
In the previous recipe, we saw how to use EMR to access and query DynamoDB data. In this recipe, we will see how to join two DynamoDB tables to get a combined view.
Getting ready
To perform this recipe, you should have completed the previous recipe and still have your EMR cluster running.
How to do it…
Here, we will use two tables: the Customer table and the Orders table. The Customer table contains detailed information about each customer, while the Orders table contains the details of each order, along with a customerId attribute that links the two tables. We now want to run queries that need information from both tables. DynamoDB has no native join operation, so this cannot be achieved with DynamoDB alone, and we use EMR instead.
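To make the goal concrete, the following Python sketch shows what such a combined view contains, using small in-memory stand-ins for the two tables. It is only an illustration of inner-join semantics on customerId, not how Hive or EMR executes the query. Every attribute other than customerId, and all sample values, are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample items; only the customerId attribute comes from the recipe.
customers = [
    {"customerId": "c1", "name": "Alice"},
    {"customerId": "c2", "name": "Bob"},
]
orders = [
    {"orderId": "o1", "customerId": "c1", "amount": 250},
    {"orderId": "o2", "customerId": "c1", "amount": 75},
    {"orderId": "o3", "customerId": "c3", "amount": 40},
]


def inner_join(customers, orders, key="customerId"):
    """Return one combined row per (customer, order) pair sharing the same key."""
    # Index customers by the join key so each order needs only a dictionary lookup.
    by_key = defaultdict(list)
    for customer in customers:
        by_key[customer[key]].append(customer)

    joined = []
    for order in orders:
        # Orders whose key matches no customer are skipped (inner-join behavior).
        for customer in by_key.get(order[key], []):
            joined.append({**customer, **order})
    return joined


combined = inner_join(customers, orders)
for row in combined:
    print(row)
```

Customers without orders, and orders whose customerId matches no customer, do not appear in the result, which is the behavior of an inner join.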
- To get started, we need to make sure that both DynamoDB tables have been created, as mentioned earlier. We will then connect to the EMR cluster and create two Hive tables corresponding...