
ChatGPT and Azure Low Code Machine Learning

  • 12 min read
  • 22 Aug 2023


Introduction

ChatGPT can do many amazing things: it can troubleshoot code, generate source code, and much more. However, software development, and by extension data engineering, comprises much more than text-based programming. For example, Azure offers a low/no-code tool that can generate machine learning models without countless hours of coding. There is a caveat to this service, however: a person has to know what they are doing to use it, and for many, building a machine learning system is a complex task. This is where ChatGPT comes into play. ChatGPT can easily guide a user through building the pipeline needed to meet their goals. What is especially interesting about this notion is that we can use AI to help build other AI! To demonstrate this, we’re going to use ChatGPT as a guiding tool to build our own linear regression model in Azure.

Expectations

At the time of writing, ChatGPT cannot produce actual diagrams; what it can do is guide you. This means that you still have to be familiar with the overall gist of the technology, but you don’t have to be a data science expert to build a model from scratch. ChatGPT is an excellent tool to get you up and running quickly, but as many other articles around the internet have stated, it is not a replacement for developers. On a side note, this tutorial is meant to be interactive. ChatGPT will only generate a series of steps that must be interpreted, which means that if you don’t know how to carry out a task, there are plenty of opportunities to create your own prompts. With that, let’s start!

Setup

This tutorial assumes that you have a basic knowledge of Azure and the Azure Machine Learning services. However, we’re going to use ChatGPT as a guide from start to finish, so if you’re not sure what you’re doing, you can always modify the prompts or create new ones to help you along. As an exercise, if you do get stuck, try writing some prompts on your own to get a feel for ChatGPT. With that, let’s start by having ChatGPT guide us in creating the necessary environment for our machine learning pipeline.

Environment

To begin, let’s get our environment set up. To figure out how to do this, use the following prompt:

What infrastructure will I need to build a no-code Azure machine learning pipeline?

Once you run this prompt you should be met with something akin to the following:

To build a no-code Azure Machine Learning pipeline, you will need the following infrastructure components:

1.     Azure Subscription: You need an active Azure subscription to access and utilize Azure Machine Learning services.
2.     Azure Machine Learning Workspace: Create an Azure Machine Learning workspace, which serves as the top-level resource for organizing and managing your machine learning assets. The workspace provides a centralized location to store your pipelines, datasets, models, and other resources.
3.     Compute Resources: Azure Machine Learning provides various compute options for running your pipeline. Depending on your workload and requirements, you can choose from options such as Azure Machine Learning Compute, Azure Databricks, or remote VMs. These resources will execute the pipeline steps and perform the required computations.
4.     Data Storage: Azure Machine Learning supports multiple data storage options, such as Azure Blob Storage, Azure Data Lake Storage, or Azure SQL Database. You need a suitable data storage solution to store your input datasets, intermediate data, and any other data required for your pipeline.
5.     Networking and Security: Ensure that your Azure subscription and associated resources are properly secured. Azure provides various security features, such as Virtual Network (VNet) integration, network security groups (NSGs), role-based access control (RBAC), and Azure Active Directory (AAD) authentication, to protect your resources and control access.
6.     Monitoring and Logging: Azure Machine Learning offers monitoring and logging capabilities to track the execution of your pipelines, monitor performance metrics, and troubleshoot issues. You can use Azure Monitor, Azure Log Analytics, or other monitoring tools to gain insights into pipeline performance and health.
7.     Integration with DevOps: If you have a DevOps workflow, you may want to integrate your no-code pipeline with your existing CI/CD processes. Azure Machine Learning can be integrated with Azure DevOps, GitHub Actions, or other tools to automate the deployment and management of your pipelines.

Remember to evaluate your specific workload requirements and choose the appropriate infrastructure components accordingly. The exact infrastructure setup will vary depending on factors such as data size, computational needs, security requirements, and integration with existing systems.

This prompt generated a lot of the infrastructure components necessary for a production-level pipeline. However, our pipeline will only be used for practice, so all we need to do is follow steps 1-3. In other words, all we need for this tutorial is an Azure subscription, an Azure Machine Learning workspace, and a compute instance. If you can create these by yourself, do so; if you can’t, try writing your own prompt. For example, try engineering a prompt that gives you the steps necessary to set up a compute instance. You could try something like the following:

How do I set up a compute instance in Azure?

If this prompt doesn’t provide you with the necessary information, try rewording it and giving it more detail until you can complete the setup with the provided steps.

Pipeline

The next step is building out the pipeline. The pipeline is the logic for the model; in other words, it is the algorithm that will load the data, clean it, train the model, and evaluate it if necessary. In short, the best way to think of the pipeline is as a graphical machine learning program. As stated before, ChatGPT is mostly limited to text-based responses. However, we can still get step-by-step directions with the following prompt:

Build a minimal linear regression model using Azure Machine Learning Studio

This will output the following response:

As an AI language model, I don't have direct access to external tools or platforms like Azure Machine Learning Studio, and my capabilities are limited to providing instructions and guidance. However, I can walk you through the steps to build a minimal linear regression model using Azure Machine Learning Studio.

Please note that the steps might slightly vary based on updates to the platform since my last update in September 2021. Make sure to check the official Azure Machine Learning documentation for any changes.

Here's a general outline of the steps:

1.     Set up Azure Machine Learning Studio: If you don't have an Azure Machine Learning account, sign up for one. Once you have access to Azure Machine Learning Studio, create a new project or workspace.
2.     Create a new Experiment: In Azure Machine Learning Studio, create a new experiment by clicking on "Experiments" and then "New." Give your experiment a descriptive name.
3.     Data Ingestion: Upload your dataset to Azure Machine Learning Studio. You can do this by clicking on "Datasets" and then "New" to upload your dataset.
4.     Data Cleaning and Preparation: Prepare your data by cleaning and transforming it if needed. This might include removing missing values, encoding categorical variables, and splitting the data into training and testing sets.
5.     Add Linear Regression Module: Drag and drop the "Linear Regression" module from the left panel to the canvas in your experiment.
6.     Connect Data to the Model: Connect the dataset you uploaded earlier to the "Linear Regression" module by clicking and dragging from the dataset to the input of the linear regression module.
7.     Train the Model: Run the experiment to train the linear regression model on your dataset.
8.     Evaluate the Model: After the training is complete, you can use evaluation modules like "Score Model" and "Evaluate Model" to assess the performance of the trained linear regression model.
9.     Deploy the Model (Optional): If you want to use the trained model for predictions in real-world scenarios, you can deploy the model as a web service in Azure Machine Learning Studio.
10. Test the Deployed Model (Optional): Test the deployed model to ensure it is working correctly and providing accurate predictions.

Remember, this is just a minimal example of building a linear regression model using Azure Machine Learning Studio. Depending on your dataset and specific use case, you may need to perform additional steps like feature engineering, hyperparameter tuning, and more thorough model evaluation.
Always refer to the official documentation and resources from Azure Machine Learning Studio for the most up-to-date and detailed instructions.

Analyzing the output, the first two steps are self-explanatory.  The third step will require some preparation since we’re going to use a custom dataset for this tutorial. 

This tutorial is going to use a dataset that consists of 14 points. Normally, you’ll want at least 30 points so that the sampling distribution can reasonably be assumed to be normal; however, for this example, we’re going to assume the data is normally distributed. With that, to train the model for this tutorial, copy the following data into either a plain text file or an Excel spreadsheet and save it as a .csv file:

chatgpt-and-azure-low-code-machine-learning-img-0
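The actual data values live in the image above, so they are not reproduced here. If you just want a stand-in file to practice with, the following Python sketch writes a small CSV of hypothetical x/y pairs (the values are illustrative, not the ones from the original figure):

```python
import csv

# Hypothetical (x, y) pairs roughly following y = 2x + 1 -- these are
# illustrative stand-in values, NOT the dataset from the original figure.
rows = [
    (1, 3.1), (2, 5.0), (3, 7.2), (4, 8.9), (5, 11.1),
    (6, 13.0), (7, 15.2), (8, 16.8), (9, 19.1), (10, 21.0),
    (11, 23.2), (12, 24.9), (13, 27.1), (14, 29.0),
]

with open("practice_data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["x", "y"])  # header row so the columns are named in Azure
    writer.writerows(rows)
```

Either way, the file you upload should be a plain .csv with a header row naming the columns.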

Once you have the data saved, you’ll need to create a data asset and upload the file to Azure. If you don’t know the steps for uploading the dataset, try writing another prompt for directions, such as:

How do I upload a custom dataset to Azure Machine Learning Studio?

Moving on to the rest of the instructions, we can construct a pipeline that resembles the pipeline in Figure 1:

chatgpt-and-azure-low-code-machine-learning-img-1

Figure 1 – Completed Pipeline

This pipeline is, for the most part, a one-to-one representation of the instructions. However, it expands on Step 4: we added a Select Columns in Dataset module to ensure we include only the data the model needs, and from there we fed that data into a Split Data component, as ChatGPT suggested. Other than that, the pipeline is exactly as described, with the exception of the last two steps (deployment and testing), which ChatGPT labeled as “optional.”
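In code terms, these two components map onto a column selection followed by a shuffled split. The following pure-Python sketch illustrates the idea; the column names, sample values, and 70/30 split ratio are all assumptions for illustration:

```python
import random

# Rows as dictionaries, as if loaded from the CSV; the extra columns stand in
# for data the model does not need.
rows = [
    {"id": i, "x": i, "y": 2 * i + 1, "note": "unused"} for i in range(1, 15)
]

# "Select Columns in Dataset": keep only the columns the model uses.
selected = [{"x": r["x"], "y": r["y"]} for r in rows]

# "Split Data": shuffle, then cut into a 70/30 train/test split.
random.seed(42)            # fixed seed so the split is repeatable
shuffled = selected[:]
random.shuffle(shuffled)
cut = int(len(shuffled) * 0.7)
train, test = shuffled[:cut], shuffled[cut:]
```

In the designer you configure the same things graphically: which columns to keep, the split fraction, and a random seed.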

For this tutorial, build the pipeline as shown in Figure 1 and run it. After the run completes, you can see how well the model performed by clicking the Evaluate Model component. There is a lot of information to unpack, but you can see the scores by navigating to the Metrics tab. If you used the same dataset, your numbers should be around the values in Figure 2.

chatgpt-and-azure-low-code-machine-learning-img-2

Figure 2 – Linear Regression Outputs

At this point, ChatGPT has guided us in building a linear regression model. Overall, it is a very simple model that, all things considered, is fairly accurate.
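Under the hood, the designer is doing what a few lines of code would do. As a rough sketch, here is a pure-Python version of fitting a simple linear regression with the closed-form least-squares formulas and scoring it with the same kinds of metrics the Evaluate Model component reports (MAE, RMSE, R²); the data values are illustrative, not the original dataset:

```python
import math

# Illustrative training data roughly following y = 2x + 1.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [3.1, 5.0, 7.2, 8.9, 11.1, 13.0, 15.2, 16.8, 19.1, 21.0]

# Closed-form least squares for y = slope * x + intercept.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

# Score the fit with the metrics Evaluate Model typically reports.
preds = [slope * x + intercept for x in xs]
mae = sum(abs(p - y) for p, y in zip(preds, ys)) / n
rmse = math.sqrt(sum((p - y) ** 2 for p, y in zip(preds, ys)) / n)
ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r2 = 1 - ss_res / ss_tot
```

On data this close to a straight line, the fitted slope and intercept land near 2 and 1 and R² comes out close to 1, which is the same pattern a well-fitting model shows in the designer’s Metrics tab.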

Summary

This tutorial has been a crash course on how ChatGPT can be used to build no-code solutions in Azure Machine Learning Studio. What’s incredible about this tutorial is that we used AI to help build another AI system. However, as we saw, ChatGPT was only a guide. For graphical systems, ChatGPT can, at best, point the way, which means that for tools like Azure Machine Learning Studio, a basic understanding of the system is still required. In other words, for graphical systems, ChatGPT is best utilized by people who know the system but need guidance for the task at hand. For example, if this were a real project, the ideal engineer would be a person who knows how to use Azure Machine Learning Studio but needs help creating the pipeline logic.

In terms of graphical programming, ChatGPT is a little ironic. When it comes to text-based programming in Java, Python, or any other language, ChatGPT can generate working code. However, because ChatGPT currently cannot generate graphical programs for systems like Azure Machine Learning Studio, a person needs more in-depth knowledge of the system itself. This may change as ChatGPT matures, but for now, it is best to have a knowledgeable engineer driving ChatGPT and implementing its solutions. Overall, ChatGPT is an excellent assistant, but it requires a person who knows the technology being used.

Author Bio

M.T. White has been programming since the age of 12. His fascination with robotics flourished when he was a child programming microcontrollers such as Arduino. M.T. currently holds an undergraduate degree in mathematics, and a master's degree in software engineering, and is currently working on an MBA in IT project management. M.T. is currently working as a software developer for a major US defense contractor and is an adjunct CIS instructor at ECPI University. His background mostly stems from the automation industry where he programmed PLCs and HMIs for many different types of applications. M.T. has programmed many different brands of PLCs over the years and has developed HMIs using many different tools.

Author of the book: Mastering PLC Programming