Big data architecture best practices
You learned about various big data technologies and architecture patterns in the previous sections. Let’s look at the following reference architecture diagram, which shows the different layers of a data lake architecture, to learn best practices.
Figure 12.11: Data lake reference architecture
The preceding diagram depicts an end-to-end data pipeline in a data lake architecture using the AWS cloud platform with the following components:
- AWS Direct Connect provides a dedicated, high-speed network connection between the on-premises data center and AWS for migrating data. If you have large volumes of archive data, it is better to move it offline using the AWS Snow family.
- A data ingestion layer with various components to ingest streaming data using Amazon Kinesis, relational data using AWS Database Migration Service (DMS), secure file transfer using AWS Transfer for Secure Shell File Transfer Protocol (SFTP), and AWS DataSync to update data files between...