Network
Even though AWS Glue is a serverless service, understanding its network infrastructure and how it connects to resources is critical to protecting your data and keeping your organization compliant. By default, Glue always attempts to route network traffic over the least public path available. However, you still need to understand how this routing works to avoid public calls that could compromise your information.
Glue network architecture
As with other AWS services, all AWS Glue resources are stored and executed in internal AWS accounts that are neither accessible from nor part of any public infrastructure. This includes your Data Catalog, crawlers, ETL jobs, development endpoints, triggers, and workflows. This is shown in the following diagram:
Figure 8.1 – AWS resources within the AWS cloud
If a Glue resource needs access to an S3 location, this communication happens privately and internally through the AWS infrastructure...