Summary
In this chapter, we covered data management considerations for ML and what an enterprise data management platform for ML could look like. You should now understand where data management intersects with the ML life cycle and how to design a data lake architecture on AWS. To put this learning into practice, you built a data lake using AWS Lake Formation and practiced data ingestion, processing, and cataloging for data discovery, querying, and downstream ML tasks. Along the way, you developed hands-on skills with AWS data management tools, including AWS Lake Formation, AWS Glue, AWS Lambda, and Amazon Athena.
In the next chapter, we will begin covering the architecture and technologies for building data science environments using open-source technologies.