Creating and updating Hudi tables using Glue
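Apache Hudi is an open-source data management tool that was initially developed by Uber. Its superpower is enabling incremental data processing in a data lake: it applies record-level inserts, updates, and deletes to tables, so a job can process only the records that changed instead of rewriting whole datasets. The Apache Hudi format is supported by a wide range of AWS services, such as AWS Glue, Amazon Redshift, Amazon Athena, and Amazon EMR.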
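To make the idea of incremental processing more concrete, here is a minimal sketch of how an initial load and an incremental load might differ in their Hudi write options. The option keys are standard Hudi datasource settings. Everything else is an assumption made for illustration: the helper names hudi_write_options and table_path, the table name employees, the field names id and updated_at, the bucket name, and the S3 path layout. It is not the code used by the jobs in this chapter.

```python
from typing import Dict

# Write operations that this sketch accepts; Hudi supports others as well.
ALLOWED_OPERATIONS = {"bulk_insert", "insert", "upsert"}


def hudi_write_options(table_name: str, operation: str) -> Dict[str, str]:
    """Build a dictionary of Hudi datasource write options.

    The record key and precombine fields below are hypothetical column
    names; a real table would use its own key and ordering columns.
    """
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"unsupported Hudi write operation: {operation}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.table.type": "COPY_ON_WRITE",
        # Column that uniquely identifies a record (assumed name).
        "hoodie.datasource.write.recordkey.field": "id",
        # Column used to pick the latest version of a record (assumed name).
        "hoodie.datasource.write.precombine.field": "updated_at",
        "hoodie.datasource.write.operation": operation,
    }


def table_path(target_bucket: str, table_name: str) -> str:
    """Return an S3 location for the table (the path layout is an assumption)."""
    return f"s3://{target_bucket}/hudi/{table_name}/"


# An initial load writes everything once; an incremental load upserts changes.
init_options = hudi_write_options("employees", "bulk_insert")
incremental_options = hudi_write_options("employees", "upsert")

print(table_path("example-target-bucket", "employees"))
print(init_options["hoodie.datasource.write.operation"])
print(incremental_options["hoodie.datasource.write.operation"])
```

In a Spark job, a dictionary like this would typically be passed to the DataFrame writer, for example df.write.format("hudi").options(**incremental_options).mode("append").save(path), with overwrite mode commonly used for the first load. The upsert operation is what lets a later run merge changed records into the existing table by record key.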
The CloudFormation template for this chapter creates two Hudi batch jobs: 02 - Hudi Init load for Data Analysis Chapter and 03 - Hudi Incremental load for Data Analysis Chapter. Both jobs use the Hudi connection created in the Creating the Marketplace connections section. Both also accept the target bucket as an input parameter, which the CloudFormation template prepopulates. Navigate to the job details page of the 02 - Hudi Init load for Data Analysis Chapter job (https://console.aws.amazon.com/gluestudio/home?#/editor/job/02%20-%20Hudi%20Init%20load%20for%20Data%20Analysis%20Chapter/details...