Hands-on – orchestrating a data pipeline using AWS Step Functions
In this section, we will get hands-on with the AWS Step Functions service, which can be used to orchestrate data pipelines. The pipeline we’re going to orchestrate is relatively simple, but Step Functions can also be used to orchestrate far more complex pipelines with many steps. To keep things simple, we will only use Lambda functions to process our data, but you could replace Lambda functions with Glue jobs in production pipelines that need to process large amounts of data.
For our Step Functions state machine, let’s start by running a Lambda function that checks the extension of an incoming file to determine the type of file. Once determined, we’ll pass that information on to the next state, which is a CHOICE state. If it is a file type we support, we’ll call a Lambda function to process the file, but if it’s not, we’ll send out a notification, indicating that...
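To make the flow described above more concrete, here is a minimal sketch in Python of what the first Lambda function and the CHOICE state might look like. It is not the implementation built later in this section. The input key file_name, the output key file_extension, the supported extensions, and the state names ProcessFile and NotifyUnsupportedFile are all hypothetical names chosen for illustration.

```python
import os

# Hypothetical set of supported file types; the text does not name them.
SUPPORTED_EXTENSIONS = {".csv", ".json"}


def lambda_handler(event, context):
    """Return the lowercase extension of the incoming file.

    Assumes the state machine input carries the file name under a
    hypothetical "file_name" key.
    """
    file_name = event["file_name"]
    _, extension = os.path.splitext(file_name)
    # The returned dict becomes the input of the next state.
    return {"file_name": file_name, "file_extension": extension.lower()}


# Sketch of a Choice state in Amazon States Language, written as a Python
# dict. Each rule routes a supported extension to a (hypothetical)
# ProcessFile state; anything else falls through to the Default state.
CHOICE_STATE = {
    "Type": "Choice",
    "Choices": [
        {
            "Variable": "$.file_extension",
            "StringEquals": ext,
            "Next": "ProcessFile",
        }
        for ext in sorted(SUPPORTED_EXTENSIONS)
    ],
    "Default": "NotifyUnsupportedFile",
}
```

Returning the extension as part of the function's output is what lets the CHOICE state branch on it: the state reads the value at $.file_extension and picks the next state without running any code of its own.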