AWS big data tools for ETL pipelines
Several AWS tools can be used to create ETL pipelines in the cloud. This section focuses on the most commonly used ones for building cost-effective and scalable ETL workflows.
AWS Data Pipeline
AWS Data Pipeline (https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/what-is-datapipeline.html) is a web service for orchestrating data workflows across various AWS services and on-premises systems. It provides a visual pipeline designer for laying out and defining workflows, along with pre-built connectors for popular data sources and destinations, scheduling, error handling, and monitoring. Data Pipeline works with a wide range of data stores and compute resources, including relational databases, NoSQL databases, and Hadoop clusters.
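To make the ideas of scheduling, error handling, and source and destination connectors more concrete, the following is a minimal sketch of what a pipeline definition could look like when assembled in Python. The helper names (build_pipeline_definition, unresolved_refs), the object ids, and the S3 paths are hypothetical and chosen only for illustration. The field names follow the pipeline definition file syntax as we understand it from the Developer Guide, so verify them against the current documentation before relying on them. The sketch only builds and checks the definition locally; it does not call AWS.

```python
import json


def build_pipeline_definition(source_dir, target_dir, start, period="1 day"):
    """Build a dict shaped like a Data Pipeline definition file.

    All ids and paths are illustrative assumptions, not values from the book.
    """
    return {
        "objects": [
            {
                # Defaults inherited by every other object in the pipeline.
                "id": "Default",
                "name": "Default",
                "scheduleType": "cron",
                "schedule": {"ref": "DailySchedule"},
                # Error handling: propagate a failure to dependent objects.
                "failureAndRerunMode": "CASCADE",
                "role": "DataPipelineDefaultRole",
                "resourceRole": "DataPipelineDefaultResourceRole",
            },
            {
                # Scheduling: how often the pipeline runs and when it starts.
                "id": "DailySchedule",
                "name": "DailySchedule",
                "type": "Schedule",
                "period": period,
                "startDateTime": start,
            },
            {
                # Source connector: an S3 location to read from.
                "id": "SourceData",
                "name": "SourceData",
                "type": "S3DataNode",
                "directoryPath": source_dir,
            },
            {
                # Destination connector: an S3 location to write to.
                "id": "TargetData",
                "name": "TargetData",
                "type": "S3DataNode",
                "directoryPath": target_dir,
            },
            {
                # Compute resource that performs the work.
                "id": "CopyWorker",
                "name": "CopyWorker",
                "type": "Ec2Resource",
                "terminateAfter": "1 Hour",
            },
            {
                # The activity itself: copy from source to destination,
                # retrying up to two times on failure.
                "id": "CopyStep",
                "name": "CopyStep",
                "type": "CopyActivity",
                "input": {"ref": "SourceData"},
                "output": {"ref": "TargetData"},
                "runsOn": {"ref": "CopyWorker"},
                "maximumRetries": "2",
            },
        ]
    }


def unresolved_refs(definition):
    """Return the set of referenced ids that no object in the definition declares."""
    declared = {obj["id"] for obj in definition["objects"]}
    referenced = {
        value["ref"]
        for obj in definition["objects"]
        for value in obj.values()
        if isinstance(value, dict) and "ref" in value
    }
    return referenced - declared


if __name__ == "__main__":
    definition = build_pipeline_definition(
        "s3://example-bucket/raw/",        # hypothetical source path
        "s3://example-bucket/processed/",  # hypothetical destination path
        "2024-01-01T00:00:00",
    )
    # Catch dangling references locally before uploading the definition.
    assert not unresolved_refs(definition)
    print(json.dumps(definition, indent=2))
```

Checking references locally is a small design choice that surfaces typos in object ids before the definition is ever submitted to the service.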
Amazon Kinesis
Amazon Kinesis (https://aws.amazon.com/kinesis/) is a managed big data platform specifically designed for processing large datasets (we’re talking...