Chapter 9. Orchestrating Data using AWS Data Pipeline
In the previous chapter, we explored the AWS analytics suite of services by taking a deep dive into Amazon EMR and Amazon Redshift.
In this chapter, we will continue in the same vein and learn about a versatile and powerful data orchestration and transformation service called AWS Data Pipeline.
Let's have a quick look at the various topics that we will be covering in this chapter:
- Introducing AWS Data Pipeline, along with a quick look at some of its concepts and terminology
- Getting started with Data Pipeline using a simple Hello World example
- Working with the Data Pipeline definition file
- Executing scripts and commands on remote EC2 instances using a data pipeline
- Backing up data from one S3 bucket to another using a simple, parameterized data pipeline
- Building pipelines using the AWS CLI
So, without further ado, let's get started!