Technical requirements
In this chapter, we will implement a batch ETL pipeline using AWS services. Before getting started, make sure you have the following:
- An AWS account with access to create Amazon S3, AWS Lambda, Amazon EMR, Amazon Athena, and AWS Glue Data Catalog resources
- An IAM user with permission to create the IAM roles that will be used to trigger or execute jobs
- Access to the GitHub repository:
https://github.com/PacktPublishing/Simplify-Big-Data-Analytics-with-Amazon-EMR-/tree/main/chapter_09
Now let's dive deep into the use case and hands-on implementation steps.
Check out the following video to see the Code in Action: https://bit.ly/3LtLZGX