ML engineering in the real world
Most of us who work in ML, analytics, and related disciplines do so for organizations with a variety of structures and motives. These could be for-profit corporations, not-for-profits, charities, or public sector organizations such as government bodies or universities. In pretty much all of these cases, we do not work in a vacuum, and we do not have an infinite budget of time or resources. It is important, therefore, that we consider some of the key aspects of doing this type of work in the real world.
First of all, the ultimate goal of your work is to generate value. This can be calculated and defined in a variety of ways, but fundamentally your work has to improve something for the company or its customers in a way that justifies the investment put in. This is why most companies will not be happy for you to take a year to play with new tools and then generate nothing concrete to show for it, or to spend your days only reading the latest papers. Yes, these things are part of any job in technology, and they can definitely be super-fun, but you have to be strategic about how you spend your time and always be aware of your value proposition.
Secondly, to be a successful ML engineer in the real world, you cannot just understand the technology; you must understand the business. You will have to understand how the company works day to day, how the different pieces of the company fit together, and who the people of the company are and what roles they play. Most importantly, you have to understand the customer, both of the business and of your work. If you do not know the motivations, pains, and needs of the people you build for, then how can you be expected to build the right thing?
Finally, and this may be controversial, the most important skill for you to become a successful ML engineer in the real world is one that this book will not teach you, and that is the ability to communicate effectively. You will have to work in a team, with a manager, with the wider community and business, and, of course, with your customers, as mentioned above. If you can do this and you know the technology and techniques (many of which are discussed in this book), then what can stop you?
But what kinds of problems can you solve with ML when you work in the real world? Well, let’s start with another potentially controversial statement: a lot of the time, ML is not the answer. This may seem strange given the title of this book, but it is just as important to know when not to apply ML as when to apply it. This will save you tons of expensive development time and resources.
ML is ideal for cases when you want to do a semi-routine task faster, with more accuracy, or at a far larger scale than is possible with other solutions.
Some typical examples are given in the following table, along with some discussion as to whether or not ML would be an appropriate tool to solve the problem:
| Requirement | Is ML Appropriate? | Details |
| --- | --- | --- |
| Anomaly detection of energy pricing signals. | Yes | You will want to do this on large numbers of points on potentially varying time signals. |
| Improving data quality in an ERP system. | No | This sounds more like a process problem. You can try and apply ML to this but often it is better to make the data entry process more automated or the process more robust. |
| Forecasting item consumption for a warehouse. | Yes | ML will be able to do this more accurately than a human can, so this is a good area of application. |
| Summarizing data for business reviews. | Maybe | This can be required at scale but it is not an ML problem – simple queries against your data will do. |
Table 1.1: Potential use cases for ML.
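To make the last row of the table concrete, here is a minimal sketch of summarizing data with plain aggregation and no ML at all. The records, field names, and the `summarize_by_region` helper are hypothetical and exist only for illustration; in practice this would more likely be a SQL query or a dataframe operation against your own data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sales records; field names and values are made up for illustration.
SALES = [
    {"region": "North", "revenue": 1200.0},
    {"region": "North", "revenue": 800.0},
    {"region": "South", "revenue": 500.0},
    {"region": "South", "revenue": 1500.0},
    {"region": "South", "revenue": 1000.0},
]


def summarize_by_region(records):
    """Return total and mean revenue per region using plain aggregation."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["region"]].append(record["revenue"])
    return {
        region: {"total": sum(values), "mean": mean(values)}
        for region, values in grouped.items()
    }


summary = summarize_by_region(SALES)
for region, stats in sorted(summary.items()):
    print(f"{region}: total={stats['total']:.2f}, mean={stats['mean']:.2f}")
```

Nothing here is learned from data; the summary is fully determined by a grouping and two arithmetic operations, which is why a simple query is usually enough for this kind of requirement.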
As this table of simple examples hopefully starts to make clear, the cases where ML is the answer are ones that can usually be framed very well as a mathematical or statistical problem. After all, this is what ML really is: a series of algorithms rooted in mathematics that iteratively update some internal parameters based on data. Where the lines start to blur in the modern world is in areas such as deep learning and reinforcement learning, where advances mean that problems we previously thought would be very hard to phrase appropriately for standard ML algorithms can now be tackled.
The other tendency to watch out for in the real world (to go along with “let’s use ML for everything”) is the worry that people have about ML coming for their jobs, and the belief that it should not be trusted. This is understandable: a report by PwC in 2018 suggested that up to 30% of UK jobs could be impacted by automation by the 2030s (Will Robots Really Steal Our Jobs?: https://www.pwc.co.uk/economic-services/assets/international-impact-of-automation-feb-2018.pdf). What you have to try and make clear when working with your colleagues and customers is that what you are building is there to supplement and augment their capabilities, not to replace them.
Let’s conclude this section by revisiting an important point: the fact that you work for a company means, of course, that the aim of the game is to create value appropriate to the investment. In other words, you need to show a good Return on Investment (ROI). This means a couple of things for you practically:
- You have to understand how different designs require different levels of investment. If you can solve your problem by training a deep neural net on a million images with a GPU running 24/7 for a month, or you know you can solve the same problem with some basic clustering and a few statistics on some standard hardware in a few hours, which should you choose?
- You have to be clear about the value you will generate. This means you need to work with experts and try to translate the results of your algorithm into actual dollar values. This is so much more difficult than it sounds, so you should take the time you need to get it right. And never, ever over-promise. You should always under-promise and over-deliver.
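The trade-off described in the two points above can be sketched as a back-of-the-envelope calculation. This is only an illustration under stated assumptions: the `DesignOption` class is a hypothetical helper, every cost and value figure is invented, and ROI is taken here as (value - cost) / cost, which is one common definition and not necessarily the one your organization uses.

```python
from dataclasses import dataclass


@dataclass
class DesignOption:
    """A candidate solution design; all numbers are illustrative assumptions."""

    name: str
    compute_cost: float      # hardware or cloud spend
    engineering_cost: float  # cost of the people time needed to build it
    expected_value: float    # estimated value generated, in the same currency

    @property
    def total_cost(self) -> float:
        return self.compute_cost + self.engineering_cost

    @property
    def roi(self) -> float:
        # ROI as net gain divided by total investment.
        return (self.expected_value - self.total_cost) / self.total_cost


# Hypothetical figures for the two designs discussed in the text.
deep_net = DesignOption(
    name="deep neural net",
    compute_cost=20_000.0,
    engineering_cost=30_000.0,
    expected_value=100_000.0,
)
clustering = DesignOption(
    name="basic clustering",
    compute_cost=100.0,
    engineering_cost=4_900.0,
    expected_value=90_000.0,
)

options = [deep_net, clustering]
best = max(options, key=lambda option: option.roi)

for option in options:
    print(f"{option.name}: cost={option.total_cost:,.0f}, ROI={option.roi:.1f}x")
print(f"Best ROI: {best.name}")
```

With these made-up numbers, the simpler design delivers slightly less value but at a fraction of the cost, so its ROI is far higher. The hard part in practice is the `expected_value` estimate, which is exactly where working with domain experts matters.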
Adoption is not guaranteed. Even when building products for your colleagues within a company, it is important to understand that your solution will be tested every time someone uses it post-deployment. If you build shoddy solutions, then people will not use them, and the value proposition of what you have done will start to disappear.
Now that you understand some of the important points when using ML to solve business problems, let’s explore what these solutions can look like.