Technical requirements
In this chapter, we will showcase interactive development using an EMR notebook with the Apache Spark and Apache Hudi frameworks. Before getting started, make sure you have the following:
- An AWS account with the ability to create Amazon S3, Amazon EMR, Amazon Athena, and AWS Glue Data Catalog resources
- An IAM user with permission to create the IAM roles that will be used to trigger or execute jobs
- Access to the Jupyter notebook that is available in our GitHub repository here: https://github.com/PacktPublishing/Simplify-Big-Data-Analytics-with-Amazon-EMR-/tree/main/chapter_11
Now, let's dive into the use case and the hands-on implementation steps, starting with an overview of Apache Hudi.
Check out the following video to see the Code in Action at https://bit.ly/3svY3i9