Architecting and Implementing Data Lakes and Data Lake Houses
In this section of the book, we examine an approach for architecting a high-level data pipeline and then dive into the specifics of data ingestion and transformation. We then look at different types of data consumers, learn about the important role of data marts and data warehouses, and finally put it all together by orchestrating our own data pipelines. Along the way, we get hands-on with various AWS services for data ingestion (Amazon Kinesis and AWS DMS), transformation (AWS Glue Studio), consumption (AWS Glue DataBrew), and pipeline orchestration (AWS Step Functions).
This section comprises the following chapters:
- Chapter 5, Architecting Data Engineering Pipelines
- Chapter 6, Ingesting Batch and Streaming Data
- Chapter 7, Transforming Data to Optimize for Analytics
- Chapter 8, Identifying and Enabling Data Consumers
- Chapter 9, A Deeper Dive into Data Marts and Amazon Redshift
- Chapter 10, Orchestrating...