Test your knowledge
Before moving on to the next chapter, test your knowledge with the following questions:
- Assume you are designing a data pipeline that needs to process two input files as two parallel steps and then invoke a common ETL process to aggregate the output of these parallel steps. You have decided to leverage AWS Step Functions to orchestrate the pipeline. Which Task types will you be integrating and how?
- Assume you have a few Hadoop workloads running on-premises and a few Spark ETL jobs running in Amazon EMR. To simplify orchestration and monitoring, you are looking for an orchestration tool. While comparing different options, you found that AWS Step Functions and MWAA are the two best options. Which of them is better suited to your workload and why?