Visual data preparation with Data Wrangler
Let's start small with our 1-month dataset. Working with a small dataset is a good way to get familiar with the data before diving into more scalable techniques. SageMaker Data Wrangler gives us an easy way to construct a data flow: a series of data preparation steps that we build through a visual interface.
In the rest of this section, we'll use Data Wrangler to inspect and transform data, and then export the Data Wrangler steps into a reusable flow.
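To make the idea of a data flow concrete, here is a minimal conceptual sketch in plain Python: an ordered list of preparation steps applied one after another to a small dataset. This is only an illustration of the concept and not how Data Wrangler is implemented. The column names (location, value), the sample rows, and the function names are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


def drop_missing_value(rows: List[Row]) -> List[Row]:
    """Keep only rows whose hypothetical 'value' column is present."""
    return [r for r in rows if r.get("value") is not None]


def cast_value_to_float(rows: List[Row]) -> List[Row]:
    """Return new rows with 'value' converted from string to float."""
    return [{**r, "value": float(r["value"])} for r in rows]


def run_flow(rows: List[Row], steps: List[Step]) -> List[Row]:
    """Apply each preparation step in order, feeding each output to the next step."""
    for step in steps:
        rows = step(rows)
    return rows


# Hypothetical sample data standing in for a small dataset.
sample = [
    {"location": "A", "value": "1.5"},
    {"location": "B", "value": None},
    {"location": "C", "value": "3.0"},
]

# The "flow" is just the ordered list of steps, so it can be reused on other data.
flow = [drop_missing_value, cast_value_to_float]
prepared = run_flow(sample, flow)
print(prepared)
```

Because the flow is defined separately from the data, the same list of steps can be applied to a different or larger dataset later, which is the property that makes an exported flow reusable.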
Data inspection
We'll begin with data inspection, where we look at the properties of our data and determine how to prepare it for model training. Start by adding a new flow in SageMaker Studio: go to the File menu, then New, then Flow. After the flow starts up and connects to Data Wrangler, we need to import our data. The following screenshot shows the data import step in Data Wrangler: