Exploring data with visualization
Exploratory data analysis (EDA) provides insight into the data at hand and helps us plan data transformations so that the resulting ML model performs as well as possible. Analyzing and visualizing data programmatically is robust and scalable, but it requires a lot of coding and development. With SageMaker Data Wrangler, you can create charts and figures directly in the UI. Currently, SageMaker Data Wrangler supports the following types of charts and analyses that do not require coding: histogram, scatter plot, bias report, multicollinearity, quick model, target leakage, and table summary. Let's look at how each of them works, one by one.
Understanding the frequency distribution with a histogram
A histogram shows the frequency distribution of a variable as a bar graph, with the variable's values bucketed into discrete intervals. We can use the histogram function in SageMaker Data Wrangler to see, for example, how long callers spend making calls...
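To make the bucketing idea concrete, here is a minimal sketch in plain Python of what a histogram computes behind the scenes. It is not how SageMaker Data Wrangler implements its histogram. The helper name `histogram_counts`, the bin width, and the call durations are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count how many values fall into each fixed-width bin.

    Bins are half-open intervals [k * bin_width, (k + 1) * bin_width).
    Bins that contain no values are omitted from the result.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    # Floor division maps each value to the index of the bin it belongs to.
    counts = Counter(int(v // bin_width) for v in values)
    return {
        (k * bin_width, (k + 1) * bin_width): counts[k]
        for k in sorted(counts)
    }


# Hypothetical call durations in minutes (made-up values, not real data).
call_minutes = [1.5, 3.0, 4.2, 7.8, 8.1, 9.9, 12.0, 14.5, 2.2]

bins = histogram_counts(call_minutes, bin_width=5)

# Draw one text bar per bin; the bar length is the bin's frequency.
for (low, high), count in bins.items():
    print(f"[{low:>2}, {high:>2}): {'#' * count}")
```

Each printed bar corresponds to one bar in the graph: the interval is the bucket and the bar length is the number of values that fall into it. Choosing a smaller bin width gives a finer-grained view of the distribution at the cost of noisier counts.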