Section 2: Architecting and Implementing Data Lakes and Data Lake Houses
In this section of the book, we examine an approach for architecting a high-level data pipeline and then dive into the specifics of data ingestion and transformation. We also examine different types of data consumers, learn about the important role of data marts and data warehouses, and finally put it all together by orchestrating data pipelines. Along the way, we get hands-on with various AWS services for data ingestion (Amazon Kinesis and AWS DMS), transformation (AWS Glue Studio), consumption (AWS Glue DataBrew), and pipeline orchestration (AWS Step Functions).
This section comprises the following chapters:
- Chapter 5, Architecting Data Engineering Pipelines
- Chapter 6, Ingesting Batch and Streaming Data
- Chapter 7, Transforming Data to Optimize for Analytics
- Chapter 8, Identifying and Enabling Data Consumers
- Chapter 9, Loading Data into a Data Mart
- Chapter 10, Orchestrating the Data Pipeline