In this chapter, you learned about big data architecture and the components of big data pipeline design. You learned about data ingestion and the technology choices available to collect batch and stream data for processing. Because the cloud now plays a central role in storing the vast amounts of data produced today, you also learned about the services available to ingest data in the AWS cloud ecosystem.
Data storage is central to handling big data. You learned about various kinds of data stores, including structured and unstructured data stores, NoSQL databases, and data warehouses, along with the technology choices associated with each. You learned about data lake architecture and its benefits.
Once you collect and store data, you need to transform it to gain insight and to visualize it in line with your business requirements. You learned about data processing architecture along with technology choices across open source and cloud-based...