Data federation using Amazon Athena
Amazon Athena is primarily used to query data in S3 data lakes. To query data across heterogeneous sources, however, Athena provides a feature called Federated Query. This feature enables different personas, such as data analysts, data engineers, and data scientists, to run queries across disparate data sources from within Athena itself. The main differentiator of Federated Query is that the data is queried in place: the relevant parts of each query are executed inside the systems that store the data.
Athena executes these federated queries using connectors, which it provides for a variety of source systems. Through a connector, Athena can pass to the source system the portions of the query that need to be executed there. This execution is assisted by AWS Lambda functions, which optimize the query’s execution and gather the data received from the underlying systems. Since Lambda functions are serverless and scalable, this allows Athena to query larger datasets...
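As a rough illustration of what a federated query can look like from the client side, the following Python sketch joins a data lake table with a table exposed through a connector. It assumes that a connector has already been deployed and registered in Athena as a data source; the catalog name `mysql_catalog`, the table names, the join key, and the helper functions `build_federated_join` and `run_query` are all hypothetical and chosen only for this example. The `run_query` helper uses the boto3 Athena client and needs AWS credentials, so it is defined here but not called.

```python
def build_federated_join(lake_table: str, source_catalog: str,
                         source_table: str, key: str) -> str:
    """Build SQL that joins an S3-backed table with a connector-backed table.

    source_catalog is the (hypothetical) name under which the connector
    was registered as an Athena data source.
    """
    return (
        f"SELECT l.{key}, s.* "
        f"FROM {lake_table} AS l "
        f'JOIN "{source_catalog}".{source_table} AS s '
        f"ON l.{key} = s.{key}"
    )


def run_query(sql: str, output_location: str, region: str = "us-east-1") -> str:
    """Submit the query to Athena and return its execution ID.

    Requires boto3 and valid AWS credentials; output_location is an
    S3 URI where Athena writes the query results.
    """
    import boto3  # imported lazily so the SQL builder works without boto3

    athena = boto3.client("athena", region_name=region)
    response = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical names: orders live in the S3 data lake, customers in a
# relational database reached through the connector.
sql = build_federated_join(
    "sales_lake.orders", "mysql_catalog", "crm.customers", "customer_id"
)
print(sql)
```

From the analyst's point of view, the connector-backed table is addressed like any other table, qualified by its catalog name; the connector and its Lambda function handle the interaction with the source system behind the scenes.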