Working with Data in AMLS
In machine learning (ML), regardless of the use case or the algorithm, data is the one component that is always required. Without data, you cannot build machine learning models, and the quality of that data is critical to building performant ones. Complex models such as deep neural networks typically require far more data than simpler models. Data in an ML workflow often comes from a variety of sources and requires different methods for processing, cleansing, and feature selection. During this feature engineering process, your Azure Machine Learning workspace lets you work with your data collaboratively. It provides secure connectivity to a variety of data sources and lets you register your datasets for use in training, testing, and validation.
As an example of steps within this workflow, we may be required to take raw data, join with an additional dataset, cleanse data...
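The join and cleanse steps mentioned above can be sketched locally with pandas before any Azure Machine Learning features come into play. This is only a minimal illustration under stated assumptions: the two small DataFrames, their column names (customer_id, amount, region), and the prepare helper are all hypothetical and do not come from the workspace or from any real dataset.

```python
import pandas as pd

# Hypothetical raw data: contains one duplicate row and one missing amount.
raw = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "amount": [10.0, 25.5, 25.5, None, 40.0],
})

# Hypothetical additional dataset to join with the raw data.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "region": ["east", "west", "east"],
})


def prepare(raw_df: pd.DataFrame, extra_df: pd.DataFrame) -> pd.DataFrame:
    """Join the raw data with the additional dataset, then cleanse it."""
    # Left join keeps every raw row, even when no matching customer exists.
    joined = raw_df.merge(extra_df, on="customer_id", how="left")
    # Cleanse: drop exact duplicate rows, then rows missing a required value.
    cleaned = (
        joined.drop_duplicates()
        .dropna(subset=["amount", "region"])
        .reset_index(drop=True)
    )
    return cleaned


prepared = prepare(raw, customers)
print(prepared)
```

A left join is used here so that unmatched raw rows remain visible and are removed explicitly in the cleansing step. Whether such rows should be dropped, imputed, or kept depends on the use case.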