Hands-on – triggering an AWS Lambda function when a new file arrives in an S3 bucket
In the hands-on portion of this chapter, we're going to configure an S3 bucket to automatically trigger a Lambda function whenever a new file is written to the bucket. In the Lambda function, we're going to make use of an open-source Python library called AWS Data Wrangler, created by AWS Professional Services to simplify common ETL tasks when working in an AWS environment. We'll use the AWS Data Wrangler library to convert a CSV file into Parquet format, and then update the AWS Glue Data Catalog.
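Before we start, it may help to see the rough shape of the function we are working toward. The following is a minimal sketch only, not the final code for this exercise: the output path, database name, and table name are hypothetical placeholders, and the sketch assumes that the Glue database already exists and that the AWS Data Wrangler layer is attached to the function. The helper reads the bucket name and object key from the S3 event notification, and the handler uses the library's `wr.s3.read_csv` and `wr.s3.to_parquet` calls to write Parquet and register the table in the Glue Data Catalog.

```python
from urllib.parse import unquote_plus

# Hypothetical placeholders for illustration only; substitute your own values.
OUTPUT_PATH = "s3://example-clean-zone-bucket/example_table/"
GLUE_DATABASE = "example_db"
GLUE_TABLE = "example_table"


def extract_s3_objects(event):
    """Return a list of (bucket, key) pairs from an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the event (a space becomes '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    # Imported here so the helper above can be run without the layer installed.
    import awswrangler as wr

    for bucket, key in extract_s3_objects(event):
        # Read the newly arrived CSV file into a pandas DataFrame.
        df = wr.s3.read_csv(f"s3://{bucket}/{key}")
        # Write Parquet and register/update the table in the Glue Data Catalog.
        # Assumes GLUE_DATABASE has already been created.
        wr.s3.to_parquet(
            df=df,
            path=OUTPUT_PATH,
            dataset=True,
            database=GLUE_DATABASE,
            table=GLUE_TABLE,
            mode="append",
        )
```

Decoding the key with `unquote_plus` matters because a file uploaded as `my file.csv` is reported in the event as `my+file.csv`, and reading the undecoded key would fail.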
Creating a Lambda layer containing the AWS Data Wrangler library
Lambda layers allow your Lambda function to bring in additional code, packaged as a .zip file. In our use case, the Lambda layer is going to contain the AWS Data Wrangler Python library, which we can then attach to any Lambda function where we want to use the library.
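To make the idea of a layer archive more concrete, here is a small sketch of how such a .zip file could be assembled for a Python library. For Python runtimes, Lambda looks for layer libraries under a top-level `python/` directory inside the archive. The function name `build_layer_zip` and its arguments are hypothetical, and the sketch assumes the library has already been installed into a local directory; it only illustrates the archive layout and is not a required step.

```python
import zipfile
from pathlib import Path


def build_layer_zip(package_dir, zip_path):
    """Zip the contents of package_dir under a top-level python/ directory.

    Assumes zip_path is located outside package_dir.
    """
    package_dir = Path(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(package_dir.rglob("*")):
            if path.is_file():
                # Store each file as python/<relative path> inside the archive.
                arcname = (Path("python") / path.relative_to(package_dir)).as_posix()
                archive.write(path, arcname)
    return zip_path
```

A call such as `build_layer_zip("build/site-packages", "layer.zip")` (both paths are placeholders) would produce an archive whose entries all begin with `python/`.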
To create a Lambda layer, do the following...