Summary
In this chapter, we introduced the AWS Glue microservices: the Glue Data Catalog, crawlers, classifiers, connections, ETL jobs, development endpoints, the schema registry, and triggers. We also discussed the key features of each microservice to understand how it aids the different stages of data integration.
Then, we explored the structure of the Glue Data Catalog and Glue connections, along with the mechanisms that crawlers and classifiers use for data discovery. We also covered the classes and APIs available in AWS Glue ETL for data preparation and transformation. After this, we briefly explored development endpoints and interactive sessions, which make it easier for data engineers and developers to write and test ETL jobs. Finally, we explored AWS Glue triggers and saw how they help orchestrate complex ETL workflows by letting Glue users chain crawlers and ETL jobs based on specific conditions or a schedule.
In...