Tools like ChatGPT have been making headlines lately. ChatGPT and other LLMs are transforming the way people study, work, and do almost everything else. However, these systems are built for everyday users: they can help engineers and data scientists, but they are not designed to be engineering or analytics tools. There is a tool aimed squarely at that audience, and it is AutoML for Azure. This article explores AutoML and how engineers and data scientists can use it to create machine-learning models.
AutoML is an Azure tool that searches for the best-performing model it can find for a given dataset. In many senses, it can be thought of as a ChatGPT-like system for engineers: it lets them quickly produce a well-performing machine-learning model with little to no technical input. Where ChatGPT and similar systems answer general questions about anything, AutoML is designed specifically to produce machine-learning models.
Though AutoML is designed to produce machine-learning models, at its core it is a search process rather than a generative AI system like ChatGPT. The key to AutoML is parallel pipelines. A pipeline can be thought of as the end-to-end logic behind a machine-learning model: cleaning the data, splitting it into training and validation sets, fitting an algorithm to it, and so on.
When a person uses AutoML, it creates a series of parallel pipelines, each with a different combination of algorithm and parameters. Each pipeline is scored on how well its model fits the data, and when the search ends, the pipeline whose model fits best is chosen. Essentially, AutoML in Azure is a quick and easy way to cut out the skilled, time-consuming development work that can easily hinder less experienced data scientists or engineers. To demonstrate how AutoML in Azure works, let’s build a model using the tool.
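Before opening the Azure portal, it may help to see the parallel-pipeline idea in miniature. The following is only a conceptual sketch in plain Python, not how Azure implements AutoML: the three candidate "algorithms", the hold-out split, and every function name here are assumptions made for illustration. Each pipeline splits the data, fits its own model, and reports a validation error, and the lowest error wins.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# (hours, story_points) pairs: the 14 distinct rows of the tutorial dataset.
DATA = [(16, 13), (15, 12), (15, 11), (13, 4), (22, 8), (28, 18), (30, 19),
        (10, 3), (21, 14), (11, 7), (12, 9), (25, 19), (24, 17), (23, 15)]


def split(rows):
    """Hold out every fourth row for validation; train on the rest."""
    train = [r for i, r in enumerate(rows) if i % 4 != 3]
    valid = [r for i, r in enumerate(rows) if i % 4 == 3]
    return train, valid


def fit_mean(train):
    """Baseline: always predict the average target value."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_line(train):
    """Ordinary least-squares line with an intercept."""
    mx = mean(x for x, _ in train)
    my = mean(y for _, y in train)
    slope = (sum((x - mx) * (y - my) for x, y in train)
             / sum((x - mx) ** 2 for x, _ in train))
    return lambda x: slope * x + (my - slope * mx)


def fit_through_origin(train):
    """Least-squares line forced through the origin."""
    slope = sum(x * y for x, y in train) / sum(x * x for x, _ in train)
    return lambda x: slope * x


# Hypothetical candidate list; a real AutoML run tries far more combinations.
CANDIDATES = {
    "mean_baseline": fit_mean,
    "linear": fit_line,
    "linear_no_intercept": fit_through_origin,
}


def run_pipeline(name):
    """One pipeline: split the data, fit the model, score it on held-out rows."""
    train, valid = split(DATA)
    predict = CANDIDATES[name](train)
    rmse = mean((predict(x) - y) ** 2 for x, y in valid) ** 0.5
    return name, rmse


def select_best():
    """Run every pipeline side by side and pick the lowest validation error."""
    with ThreadPoolExecutor() as pool:
        scores = dict(pool.map(run_pipeline, CANDIDATES))
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    best, scores = select_best()
    print("chosen pipeline:", best)
    print(scores)
```

The design point to notice is that no pipeline knows about the others. That independence is what makes it possible to run them in parallel and compare them on a single score afterwards.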
Azure’s AutoML takes a little technical knowledge to get up and running, especially if you’re using a custom dataset. For the most part, you’ll need to know approximately what type of analysis you’re going to perform. You’ll also need to know how to create a dataset. This may seem like a daunting task, but it is relatively easy.
To use AutoML in Azure, you’ll need to set up a few things. The first is an ML workspace. To create one, log in to Azure and search for ML, as in Figure 1:
Figure 1
From there, click Azure Machine Learning. Once on the Azure Machine Learning page, click the Create button and then New Workspace, as in Figure 2:
Figure 2
Once there, fill out the form; all you need to do is select a resource group and give the workspace a name. You can use any name you want, but this tutorial uses the name Article 1. You’ll then be prompted to click Create. Once you click that button, Azure will start to deploy the workspace, which may take a few minutes. When the deployment is done, click Go to resource and then click Launch studio, as in Figure 3.
Figure 3
At this point, the workspace has been generated and we can move to the next step in the process, using AutoML to create a new model.
Now that the workspace has been created, click Launch studio and you should be met with the page in Figure 4, which is Azure Machine Learning Studio. From here you can navigate to AutoML by clicking the link on the left sidebar:
Figure 4
Once you click the AutoML link, you should be redirected to the page in Figure 5:
Figure 5
Once you see something akin to Figure 5, click the New Automated ML Job button, which should redirect you to a screen that prompts you to select a dataset. This step is one of the more in-depth parts of the process. You can opt to use a predefined dataset that Azure provides for test purposes. However, for a real-world application, you’ll probably want a custom dataset that was engineered for your task. This tutorial uses the following custom dataset:
Hours | Story Points |
16 | 13 |
15 | 12 |
15 | 11 |
13 | 4 |
22 | 8 |
28 | 18 |
30 | 19 |
10 | 3 |
21 | 14 |
11 | 7 |
12 | 9 |
25 | 19 |
24 | 17 |
23 | 15 |
16 | 13 |
15 | 12 |
15 | 11 |
13 | 4 |
22 | 8 |
28 | 18 |
30 | 19 |
10 | 3 |
21 | 14 |
11 | 7 |
12 | 9 |
25 | 19 |
24 | 17 |
23 | 15 |
16 | 13 |
15 | 12 |
15 | 11 |
13 | 4 |
22 | 8 |
28 | 18 |
30 | 19 |
10 | 3 |
21 | 14 |
11 | 7 |
12 | 9 |
25 | 19 |
24 | 17 |
23 | 15 |
16 | 13 |
15 | 12 |
15 | 11 |
13 | 4 |
22 | 8 |
28 | 18 |
30 | 19 |
10 | 3 |
21 | 14 |
11 | 7 |
12 | 9 |
25 | 19 |
24 | 17 |
23 | 15 |
To use this dataset, copy it into a CSV file, replacing the | separators with commas. Then choose the option to create the data from a file and follow the wizard. Note that for custom datasets you’ll need at least 50 data points.
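If you would rather not paste the rows by hand, a short script can generate the CSV file. This is a minimal sketch using only the Python standard library; the file name story_points.csv and the function names are assumptions made for illustration. The table above is the same 14 rows repeated four times, so the script stores them once and repeats them to produce all 56 rows.

```python
import csv
import io

# The 14 distinct (Hours, Story Points) rows from the table above.
BASE_ROWS = [(16, 13), (15, 12), (15, 11), (13, 4), (22, 8), (28, 18), (30, 19),
             (10, 3), (21, 14), (11, 7), (12, 9), (25, 19), (24, 17), (23, 15)]

# The table lists these rows four times in a row: 56 data points in total.
ROWS = BASE_ROWS * 4


def dataset_csv_text():
    """Return the dataset as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Hours", "Story Points"])
    writer.writerows(ROWS)
    return buffer.getvalue()


def write_dataset(path="story_points.csv"):
    """Write the CSV text to disk; the default file name is hypothetical."""
    with open(path, "w", newline="") as handle:
        handle.write(dataset_csv_text())
    return path


if __name__ == "__main__":
    print("wrote", write_dataset())
```

The resulting file has a header line followed by one comma-separated row per data point, which is the shape the upload wizard expects from a delimited file.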
Continue to follow the wizard and give the experiment a name, for example, E1. You will also have to select a Target Column, which is the value the model will learn to predict. For this tutorial, select Story Points. If you do not already have a compute instance available, click the New button at the bottom and follow the wizard to set one up. Once that step is complete, you should be directed to a page like the one in Figure 6:
Figure 6
This is where you select the general type of analysis to be done on the dataset. For this tutorial, select Regression, click the Next button in Figure 6, and then click Finish. This will start the process.
The whole process can take 20 minutes or so, depending on which compute instance you use. Once it is done, you can see the metrics by clicking the Models tab, which lists all the models that were tried. From here you can explore each model and its associated statistics.
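To make those statistics less abstract, the sketch below computes three common regression scores by hand: mean absolute error, root mean squared error, and R². It is a plain-Python illustration rather than Azure’s own implementation, and the toy_model rule is a hypothetical hand-picked stand-in for a trained model, not a result produced by AutoML.

```python
from statistics import mean

# The 14 distinct rows of the tutorial dataset, split into two columns.
HOURS = [16, 15, 15, 13, 22, 28, 30, 10, 21, 11, 12, 25, 24, 23]
STORY_POINTS = [13, 12, 11, 4, 8, 18, 19, 3, 14, 7, 9, 19, 17, 15]


def regression_metrics(actual, predicted):
    """Score predictions against actual values with three common metrics."""
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = mean(abs(e) for e in errors)            # average size of a miss
    rmse = mean(e * e for e in errors) ** 0.5     # penalizes large misses more
    baseline = mean(actual)
    ss_res = sum(e * e for e in errors)           # error left after the model
    ss_tot = sum((a - baseline) ** 2 for a in actual)  # error of "predict the mean"
    return {"mae": mae, "rmse": rmse, "r2": 1 - ss_res / ss_tot}


def toy_model(hours):
    """Hypothetical hand-picked rule standing in for a trained model."""
    return 0.7 * hours - 1.0


if __name__ == "__main__":
    predictions = [toy_model(h) for h in HOURS]
    for name, value in regression_metrics(STORY_POINTS, predictions).items():
        print(f"{name}: {value:.3f}")
```

Lower is better for the two error metrics, while an R² closer to 1 means the model explains more of the variation in Story Points than simply predicting the average would.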
In all, Azure’s AutoML is a tool that helps engineers quickly produce a well-performing model. Though the two are not the same, engineers can use it much as everyday users use ChatGPT and similar systems. The main drawback to AutoML is that, unlike with ChatGPT, a user needs a rough idea of what they’re doing. However, once a person has a rough idea of the basic types of machine-learning analysis, they should be able to use this tool to great effect.
M.T. White has been programming since the age of 12. His fascination with robotics flourished when he was a child programming microcontrollers such as Arduino. M.T. holds an undergraduate degree in mathematics and a master's degree in software engineering, and is working on an MBA in IT project management. He currently works as a software developer for a major US defense contractor and is an adjunct CIS instructor at ECPI University. His background mostly stems from the automation industry, where he programmed PLCs and HMIs for many different types of applications. M.T. has programmed many different brands of PLCs over the years and has developed HMIs using many different tools.
Author of the book: Mastering PLC Programming