Introducing AWS Data Pipeline
AWS Data Pipeline is a versatile web service that lets you move data between various AWS services, as well as on-premises data sources. The service is designed to provide a fault-tolerant, highly available platform on which you can define and build your own custom data migration workflows. AWS Data Pipeline also provides built-in features such as scheduling, dependency tracking, and error handling, so you do not have to spend time and effort writing them yourself. This ease of use and flexibility, together with low operating costs, makes AWS Data Pipeline well suited to use cases such as:
- Migrating data on a periodic basis from an Amazon EMR cluster over to Amazon Redshift for data warehousing
- Incrementally loading data from files stored in Amazon S3 directly into an Amazon RDS database
- Copying data from an Amazon RDS MySQL database into an Amazon Redshift cluster...