The field of machine learning (ML) and generative AI has rapidly evolved from its foundational concepts, such as Alan Turing's pioneering work on intelligence, to the sophisticated models and applications we see today. While Turing’s ideas centered on defining and detecting intelligence, modern applications stretch the definition and utility of intelligence in the realm of artificial neural networks, language models, and generative adversarial networks. For software engineers, this evolution presents both opportunities and challenges, from creating intelligent models to harnessing tools that streamline development and deployment processes. This article explores the best practices in machine learning, insights on deploying generative AI in real-world applications, and the emerging tools that software engineers can utilize to maximize efficiency and innovation.
When Alan Turing developed his famous Turing test for intelligence, computers and software were completely different from what we are used to now. I am certain that Turing did not think about Large Language Models (LLMs), Generative AI (GenAI), Generative Adversarial Networks, or Diffusers. Yet, this test for intelligence is as useful today as it was when it was developed. Perhaps our understanding of intelligence has evolved since then; we now consider intelligence on different levels – for example, the philosophical level and the computer science level.
At the philosophical level, we still try to understand what intelligence really is, how to measure it, and how to replicate it. At the computer science level, we develop new algorithms that can tackle increasingly complex problems, utilize increasingly complex datasets, and provide more complex output. In the following figure, we can see two different solutions to the same problem. On the left-hand side, the Fibonacci problem is solved with good old-fashioned programming, where the programmer translates the solution into a program. On the right-hand side, we see a machine learning solution – the programmer provides example data, and an algorithm finds the pattern in that data so that the model can replicate it later.
Figure 1. Fibonacci problem solved with a traditional algorithm (left-hand side) and machine learning’s linear regression (right-hand side). Although the traditional way is slow, it can be mathematically proven correct for all numbers, whereas the machine learning model is fast, but we do not know whether it produces correct results for all numbers.
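The contrast in Figure 1 can be sketched in a few lines of Python. The regression below is only an illustrative stand-in for the model in the figure: since Fibonacci numbers grow roughly like φⁿ/√5, fitting a line to the logarithm of the values makes linear regression workable.

```python
# Two ways to "solve" Fibonacci: a classical algorithm vs. a learned model.
# The regression is an illustrative sketch, not the exact model in Figure 1.
import numpy as np

def fib(n: int) -> int:
    """Traditional iterative algorithm: provably correct for every n >= 0."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# "Machine learning" alternative: fit a line to log(F(n)) for n = 1..20,
# exploiting the fact that F(n) grows roughly like phi**n / sqrt(5).
ns = np.arange(1, 21)
ys = np.log([fib(int(n)) for n in ns])
slope, intercept = np.polyfit(ns, ys, 1)  # ordinary least squares

def fib_predicted(n: int) -> int:
    """Fast prediction, but with no correctness guarantee for unseen n."""
    return round(np.exp(slope * n + intercept))

print(fib(10), fib_predicted(10))  # the prediction lands close to 55
```

The trade-off from the caption is visible directly: `fib` is provably correct for every input, while `fib_predicted` is only approximately right, and nothing guarantees it stays right outside the range it was fitted on.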
Although the above is a simple example, it illustrates that the difference between a model and an algorithm is not that great. Essentially, the machine learning model on the right is a complex function that takes an input and produces an output. The same is true for the generative AI models.
Generative AI is much more complex than the algorithms used for Fibonacci, but it works in the same way – based on data, it creates new output. Instead of predicting the next Fibonacci number, LLMs predict the next token, and diffusers predict the values of new pixels. Whether that is intelligence, I am not qualified to judge. What I am qualified to say is how to use these kinds of models in modern software engineering.
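The "predict the next token" idea can be illustrated with a deliberately tiny sketch: a bigram model counted from a handful of words. The corpus and whitespace tokenization here are illustrative assumptions – a real LLM learns billions of parameters – but the interface is the same: given context, emit the most likely next token.

```python
# Toy next-token predictor: a bigram model counted from example text.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count, for every token, which tokens follow it and how often.
nxt = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    nxt[a][b] += 1

def predict_next(token: str) -> str:
    """Return the most frequent follower of `token` in the training data."""
    return nxt[token].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```

Swap the word counts for a neural network over a vocabulary of tens of thousands of tokens and you have, in spirit, the prediction step of an LLM.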
When I wrote the book Machine Learning Infrastructure and Best Practices for Software Engineers [1], we could already see how powerful ChatGPT 3.5 was. In my profession, software engineers use it to write programs, debug them, and even improve their performance. I call this being a superprogrammer. When software engineers get these tools, they suddenly become team leaders for the bots that support them – these bots are the copilots of the software engineers. But using these tools and models is just the beginning.
Neural Processing Units (NPUs) have started to become more common in modern computers, which addresses the challenge of running language models locally, without access to the internet. Local execution reduces latency and lowers the security risk of information being intercepted in transit between the client and the model. However, NPUs are significantly less powerful than data centers, so we can only use them with small language models – so-called mini-LLMs. An example of such a model is the Phi-3-mini model developed by Microsoft [2].
In addition to NPUs, frameworks like ONNX have appeared, which make it possible to quickly interchange models between GPUs and NPUs – thanks to these frameworks, you can train a model on a powerful GPU and run it on a small NPU.
Since AI takes up so much of modern hardware and software, GeekbenchAI [3] is a benchmark suite that allows us to quantify and compare the AI capabilities of modern hardware. I strongly recommend taking it for a spin to check what we can do with the hardware we have at hand. Now, hardware is only as good as its software, and there, too, we have seen a lot of important improvements.
In my book, I presented methods and tools to work with generative AI (as well as classical ML). It is a solid foundation for designing, developing, testing, and deploying AI systems. However, if we want to utilize LLMs without the hassle of setting up the entire environment, we can use frameworks like Ollama [4]. The Ollama framework seamlessly downloads and deploys LLMs on a local machine, provided we have enough resources. Once the framework is installed, we can type
ollama run phi3
to start a conversation with the model. The framework provides a set of user interfaces, web services, and other mechanisms needed to construct fully-fledged machine learning software [5]. We can use it locally for all kinds of tasks, e.g., in finance [6].
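Besides the interactive prompt, Ollama exposes a REST API on port 11434 by default, which is how other software typically talks to the model. A minimal sketch with only the Python standard library (the model name and prompt are illustrative; the model must have been pulled first, e.g. via `ollama run phi3`):

```python
# Hedged sketch: query a locally running Ollama server over its REST API.
import json
import urllib.request

def build_payload(prompt: str, model: str = "phi3") -> dict:
    """Assemble a non-streaming generation request for /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_ollama(prompt: str,
               url: str = "http://localhost:11434/api/generate") -> str:
    """POST the prompt to the local server and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    try:
        print(ask_ollama("Explain the Fibonacci sequence in one sentence."))
    except OSError:
        print("No local Ollama server found; start one with `ollama serve`.")
```

Because everything runs on localhost, the prompt and the answer never leave the machine – the same latency and security argument made for NPUs above.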
As generative AI continues to evolve, its role in software engineering is set to expand in exciting ways – from AI-driven automation and edge deployments to ethical and responsible AI practices.
To stay competitive and relevant, software engineers need to continuously adapt by learning new AI technologies, refining their workflows with AI assistance, and staying attuned to emerging ethical challenges. Whether by mastering AI automation, optimizing edge deployments, or championing ethical practices, the future belongs to those who embrace AI as both a powerful tool and a collaborative partner. For software engineers ready to dive deeper into the transformative world of machine learning and generative AI, Machine Learning Infrastructure and Best Practices for Software Engineers offers a comprehensive guide packed with practical insights, best practices, and hands-on techniques.
As generative AI technologies continue to advance, software engineers are at the forefront of a new era of intelligent and automated development. By understanding and implementing best practices, engineers can leverage these tools to streamline workflows, enhance collaborative capabilities, and push the boundaries of what is possible in software development. Emerging hardware solutions like NPUs, edge computing capabilities, and advanced frameworks are opening new pathways for deploying efficient AI solutions. To remain competitive and innovative, software engineers must adapt to these evolving technologies, integrating AI-driven automation and collaboration into their practices and embracing the future with curiosity and responsibility. This journey not only enhances technical skills but also invites engineers to become leaders in shaping the responsible and creative applications of AI in software engineering.
Miroslaw Staron is a professor of Applied IT at the University of Gothenburg in Sweden with a focus on empirical software engineering, measurement, and machine learning. He is currently editor-in-chief of Information and Software Technology and co-editor of the regular Practitioner’s Digest column of IEEE Software. He has authored books on automotive software architectures, software measurement, and action research. He also leads several projects in AI for software engineering and leads an AI and digitalization theme at Software Center. He has written over 200 journal and conference articles.