Chapter 9. Orchestrating Data using AWS Data Pipeline
In the previous chapter, we explored the AWS analytics suite by taking a deep dive into Amazon EMR and Amazon Redshift.
In this chapter, we continue that trend and learn about AWS Data Pipeline, an extremely versatile and powerful data orchestration and transformation service.
Let's have a quick look at the various topics that we will be covering in this chapter:
- Introducing AWS Data Pipeline, along with a quick look at some of its concepts and terminology
- Getting started with Data Pipeline using a simple Hello World example
- Working with the Data Pipeline definition file
- Executing scripts and commands on remote EC2 instances using a data pipeline
- Backing up data from one S3 bucket to another using a simple, parameterized data pipeline
- Building pipelines using the AWS CLI
So without any further ado, let's get started right away!