Step 4 – adding transformations
During data analysis, you might have noticed elements of your dataset that you want to change or transform. The goal of data transformation is to make data more suitable for modeling, to improve the performance of machine learning algorithms, or to handle missing or corrupted values. Data transformations for machine learning include normalization, standardization, data encoding, and binning. Not all datasets are alike, and not every transformation applies to every dataset, so one purpose of data analysis is to identify the specific transformations your dataset needs. Although data transformation is typically applied early in the machine learning pipeline, before the data is used to train a model, real-world machine learning projects continually monitor model performance and apply further transformations as necessary. After you have imported and inspected your dataset in Data Wrangler, you can start adding transformations to your data flow...
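To make the transformation types named above more concrete, the following is a minimal sketch in plain Python of what each one does to a column of values. It is not Data Wrangler's API or implementation: the function names, the sample values, and the choice of mean imputation for missing values are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# All function names and sample values here are hypothetical illustrations.

def min_max_normalize(values):
    """Normalization: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # a constant column has no spread to rescale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Standardization: shift to zero mean and scale to unit variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def one_hot_encode(labels):
    """Encoding: turn each category into a 0/1 indicator vector."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]

def bin_values(values, edges):
    """Binning: map each value to the index of the interval it falls in."""
    return [bisect_right(edges, v) for v in values]

def fill_missing(values):
    """Missing values: replace None with the mean of the observed values
    (mean imputation is one common choice, assumed here)."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

ages = [20, 30, 40, 50]
normalized = min_max_normalize(ages)                # first value 0.0, last value 1.0
standardized = standardize(ages)                    # mean of the result is 0
encoded = one_hot_encode(["red", "blue", "red"])    # [[0, 1], [1, 0], [0, 1]]
binned = bin_values([5, 18, 40, 70], [18, 65])      # [0, 1, 1, 2]
filled = fill_missing([1.0, None, 3.0])             # [1.0, 2.0, 3.0]
```

Each function takes a single column and returns a new list instead of modifying its input, which loosely mirrors how a transformation step in a data flow produces a new version of the data and leaves the source untouched.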