Hands-on exercise – data management for ML
In this hands-on exercise, you will build a data management platform on AWS for a fictitious retail bank to support an ML workflow. If you don't have an AWS account, you can create one by following the instructions at https://aws.amazon.com/console/.
The data management platform we create will have the following key components:
- A data lake environment for data management
- A data ingestion component for ingesting files to the data lake
- A data discovery and query component
- A data processing component
The following diagram shows the data management architecture we will build in this exercise:
Let's get started with building out this architecture on AWS.
Creating a data lake using Lake Formation
We will build the data lake...