Introduction to Machine Learning with C++

There are different approaches to making computers solve tasks. One is to define an explicit algorithm; another is to use implicit strategies based on mathematical and statistical methods. Machine learning (ML) is one such implicit approach. It is an actively growing discipline, and many scientists and researchers consider it one of the best ways to move toward systems that act as human-level artificial intelligence (AI).

In general, ML approaches are based on searching for patterns in a given dataset. Consider a recommendation system for a news feed, which provides the user with a personalized feed based on their previous activity or preferences. The software gathers information about the types of news articles the user reads and calculates some statistics, for example, how frequently certain topics appear in a set of news articles. Then, it performs some predictive analytics, identifies general patterns, and uses them to populate the user's news feed. Such systems periodically track a user's activity, update the dataset, and calculate new trends for recommendations.
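The statistics step of such a system can be sketched in a few lines of C++. The following is a minimal illustration, not the way any particular recommendation engine works: the function name TopicFrequencies and the assumption that each read article carries exactly one topic label are ours.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: given one topic label per article the user has read,
// returns the share of each topic among those articles.
std::map<std::string, double> TopicFrequencies(
    const std::vector<std::string>& read_topics) {
  std::map<std::string, double> frequencies;
  if (read_topics.empty()) return frequencies;
  // Count how many times each topic occurs.
  for (const auto& topic : read_topics) frequencies[topic] += 1.0;
  // Turn counts into relative frequencies that sum to 1.
  const double total = static_cast<double>(read_topics.size());
  for (auto& entry : frequencies) entry.second /= total;
  return frequencies;
}
```

A feed could then rank candidate articles by the frequency of their topic, so a user who mostly reads technology news sees more of it.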

There are many areas where ML has started to play an important role. It is used for solving enterprise business tasks as well as for scientific research. In customer relationship management (CRM) systems, ML models are used to analyze sales team activity and help the team process the most important requests first. ML models are used in business intelligence (BI) and analytics to find essential data points. Human resource (HR) departments use ML models to analyze their employees' characteristics in order to identify the most effective ones and use this information when searching for applicants for open positions.

A fast-growing direction of research is self-driving cars, and deep learning neural networks are used extensively in this area. They are used in computer vision systems for object identification as well as in navigation and steering systems, which are necessary for driving a car.

Another popular use of ML systems is electronic personal assistants, such as Siri from Apple or Alexa from Amazon. Such products also use deep learning models to analyze natural speech or written text, process users' requests, and produce a natural response in a relevant context. Such requests can activate music players with preferred songs, update a user's personal schedule, or book flight tickets.

This chapter describes what ML is and which tasks can be solved with ML, and discusses different approaches used in ML. It aims to show the minimum math required to start implementing ML algorithms. It also covers how to perform basic linear algebra operations in libraries such as Eigen, xtensor, Shark-ML, Shogun, and Dlib, and explains the linear regression task as an example.

The following topics will be covered in this chapter:

Understanding the fundamentals of ML
An overview of linear algebra
An overview of a linear regression example

Understanding the fundamentals of ML

There are different approaches to creating and training ML models. In this section, we show what these approaches are and how they differ. Apart from the approach we use to create an ML model, there are also parameters that manage how this model behaves in the training and evaluation processes. Model parameters can be divided into two distinct groups, which should be configured in different ways. The last crucial part of the ML process is the technique that we use to train a model. Usually, the training technique uses some numerical optimization algorithm that finds the minimal value of a target function. In ML, the target function is usually called a loss function and is used for penalizing the training algorithm when it makes errors. We discuss these concepts more precisely in the following sections.

Venturing into the techniques of ML

We can divide ML approaches into two techniques, as follows:

Supervised learning is an approach based on the use of labeled data. Labeled data is a set of known data samples with corresponding known target outputs. This kind of data is used to build a model that can predict future outputs.
Unsupervised learning is an approach that does not require labeled data and can search for hidden patterns and structures in an arbitrary kind of data.

Let's have a look at each of the techniques in detail.

Supervised learning

Supervised ML algorithms usually take a limited set of labeled data and build models that can make reasonable predictions for new data. We can split supervised learning algorithms into two main parts, classification and regression techniques, described as follows:

Classification models predict some finite and distinct types of categories—this could be a label that identifies if an email is spam or not, or whether an image contains a human face or not. Classification models are applied in speech and text recognition, object identification on images, credit scoring, and others. Typical algorithms for creating classification models are Support Vector Machine (SVM), decision tree approaches, k-nearest neighbors (KNN), logistic regression, Naive Bayes, and neural networks. The following chapters describe the details of some of these algorithms.
Regression models predict continuous responses such as changes in temperature or values of currency exchange rates. Regression models are applied in algorithmic trading, forecasting of electricity load, revenue prediction, and others. Creating a regression model usually makes sense if the output of the given labeled data is real numbers. Typical algorithms for creating regression models are linear and multivariate regressions, polynomial regression models, and stepwise regressions. We can use decision tree techniques and neural networks to create regression models too. The following chapters describe the details of some of these algorithms.
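To make the idea of learning from labeled data concrete, here is a minimal sketch of a regression model with a single input feature, fitted by ordinary least squares. The names LinearModel and FitLeastSquares are hypothetical and chosen for illustration only; the libraries discussed later in the chapter provide their own, more general, interfaces.

```cpp
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// A line y = slope * x + intercept; both values are learned from data.
struct LinearModel {
  double slope = 0.0;
  double intercept = 0.0;
  double Predict(double x) const { return slope * x + intercept; }
};

// Fits the line to labeled pairs (x[i], y[i]) by ordinary least squares.
LinearModel FitLeastSquares(const std::vector<double>& x,
                            const std::vector<double>& y) {
  if (x.size() != y.size() || x.size() < 2)
    throw std::invalid_argument("need at least two (x, y) pairs");
  const double n = static_cast<double>(x.size());
  const double mean_x = std::accumulate(x.begin(), x.end(), 0.0) / n;
  const double mean_y = std::accumulate(y.begin(), y.end(), 0.0) / n;
  double covariance = 0.0;
  double variance = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    covariance += (x[i] - mean_x) * (y[i] - mean_y);
    variance += (x[i] - mean_x) * (x[i] - mean_x);
  }
  if (variance == 0.0)
    throw std::invalid_argument("all x values are identical");
  LinearModel model;
  model.slope = covariance / variance;
  model.intercept = mean_y - model.slope * mean_x;
  return model;
}
```

The labeled pairs play the role of the training data, and calling Predict on an unseen x is the prediction for new data.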

Unsupervised learning

Unsupervised learning algorithms do not use labeled datasets. They create models that use intrinsic relations in data to find hidden patterns that they can use for making predictions. The most well-known unsupervised learning technique is clustering. Clustering involves dividing a given set of data into a limited number of groups according to some intrinsic properties of data items. Clustering is applied in market research, different types of exploratory analysis, deoxyribonucleic acid (DNA) analysis, image segmentation, and object detection. Typical algorithms for creating models for performing clustering are k-means, k-medoids, Gaussian mixture models, hierarchical clustering, and hidden Markov models. Some of these algorithms are explained in the following chapters of this book.
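As an illustration of clustering without labels, the following sketch runs the classic k-means iterations (often called Lloyd's algorithm) on one-dimensional data. The function name KMeans1D, the restriction to a single dimension, and the requirement that the caller supplies the starting centroids are simplifying assumptions made here.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Runs k-means on 1-D points, starting from the given centroids.
// Returns the cluster index of each point and updates `centroids` in place.
std::vector<std::size_t> KMeans1D(const std::vector<double>& points,
                                  std::vector<double>& centroids,
                                  int max_iterations = 100) {
  std::vector<std::size_t> assignment(points.size(), 0);
  if (centroids.empty()) return assignment;
  for (int iter = 0; iter < max_iterations; ++iter) {
    bool changed = false;
    // Assignment step: attach each point to its nearest centroid.
    for (std::size_t i = 0; i < points.size(); ++i) {
      std::size_t best = 0;
      for (std::size_t c = 1; c < centroids.size(); ++c) {
        if (std::abs(points[i] - centroids[c]) <
            std::abs(points[i] - centroids[best])) {
          best = c;
        }
      }
      if (best != assignment[i]) {
        assignment[i] = best;
        changed = true;
      }
    }
    // Update step: move each centroid to the mean of its points.
    std::vector<double> sum(centroids.size(), 0.0);
    std::vector<std::size_t> count(centroids.size(), 0);
    for (std::size_t i = 0; i < points.size(); ++i) {
      sum[assignment[i]] += points[i];
      ++count[assignment[i]];
    }
    for (std::size_t c = 0; c < centroids.size(); ++c) {
      if (count[c] > 0) centroids[c] = sum[c] / static_cast<double>(count[c]);
    }
    // Stop once an iteration leaves every assignment unchanged.
    if (iter > 0 && !changed) break;
  }
  return assignment;
}
```

Note that no target outputs are passed in: the groups emerge only from the distances between the data items themselves.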

Dealing with ML models

We can interpret ML models as functions that take different types of parameters. Such functions provide outputs for given inputs based on the values of these parameters. Developers can configure the behavior of ML models for solving problems by adjusting model parameters. Training an ML model can usually be treated as a process of searching for the best combination of its parameters. We can split the ML model's parameters into two types. The first type consists of parameters internal to the model, and we can estimate their values from the training (input) data. The second type consists of parameters external to the model, and we cannot estimate their values from training data. Parameters that are external to the model are usually called hyperparameters.

Internal parameters have the following characteristics:

They are necessary for making predictions.
They define the quality of the model on the given problem.
We can learn them from training data.
Usually, they are a part of the model.

If the model contains a fixed number of internal parameters, it is called parametric. Otherwise, we can classify it as non-parametric.

Examples of internal parameters are as follows:

Weights of artificial neural networks (ANNs)
Support vector values for SVM models
Coefficients of linear regression or logistic regression models

On the other hand, hyperparameters have the following characteristics:

They are used to configure algorithms that estimate model parameters.
The practitioner usually specifies them.
Their estimation is often based on using heuristics.
They are specific to a concrete modeling problem.

It is hard to know the best values for a model's hyperparameters for a specific problem. Also, practitioners usually need to perform additional research on how to tune required hyperparameters so that a model or a training algorithm behaves in the best way. Practitioners use rules of thumb, copy values from similar projects, and apply special techniques such as grid search for hyperparameter estimation.

Examples of hyperparameters are as follows:

The C and sigma parameters that are used in the SVM algorithm to configure classification quality
The learning rate parameter that is used in the neural network training process to configure algorithm convergence
The k value that is used in the KNN algorithm to configure the number of neighbors
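The difference between the two kinds of parameters can be sketched with the KNN example from the list above. In the following illustration, k is passed in from outside the model, and a small grid search picks the candidate that scores best on held-out data. The names Sample, KnnPredict, and SelectK, the single feature, and the binary labels are all assumptions made to keep the sketch short.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct Sample {
  double feature;
  int label;  // 0 or 1
};

// Predicts a binary label by majority vote among the k nearest training
// samples. k is a hyperparameter: the caller chooses it, it is not learned.
int KnnPredict(const std::vector<Sample>& train, double x, std::size_t k) {
  std::vector<std::pair<double, int>> by_distance;
  for (const auto& s : train)
    by_distance.emplace_back(std::abs(s.feature - x), s.label);
  k = std::min(k, by_distance.size());
  std::partial_sort(by_distance.begin(),
                    by_distance.begin() + static_cast<std::ptrdiff_t>(k),
                    by_distance.end());
  std::size_t votes_for_one = 0;
  for (std::size_t i = 0; i < k; ++i)
    votes_for_one += static_cast<std::size_t>(by_distance[i].second);
  return 2 * votes_for_one > k ? 1 : 0;
}

// Grid search: tries each candidate k on validation data and returns the
// one with the most correct predictions (the first candidate wins ties).
std::size_t SelectK(const std::vector<Sample>& train,
                    const std::vector<Sample>& validation,
                    const std::vector<std::size_t>& candidates) {
  std::size_t best_k = 0;
  std::size_t best_correct = 0;
  bool first = true;
  for (std::size_t k : candidates) {
    std::size_t correct = 0;
    for (const auto& v : validation)
      if (KnnPredict(train, v.feature, k) == v.label) ++correct;
    if (first || correct > best_correct) {
      best_k = k;
      best_correct = correct;
      first = false;
    }
  }
  return best_k;
}
```

With a noisy training sample close to the query point, k = 1 follows the noise while a larger k outvotes it, which is the kind of trade-off that hyperparameter tuning explores.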

Model parameter estimation

Model parameter estimation usually uses some optimization algorithm. The speed and quality of the resulting model can depend significantly on the optimization algorithm chosen. Research on optimization algorithms is a popular topic in industry as well as in academia. ML often uses optimization techniques based on minimizing a loss function, which is a function that evaluates how well a model predicts on the data. If predictions are very different from the target outputs, the loss function returns a value that can be interpreted as a bad one, usually a large number. In this way, the loss function penalizes an optimization algorithm when it moves in the wrong direction. So, the general idea is to minimize the value of the loss function to reduce penalties. There is no single universal loss function for optimization algorithms. Different factors determine how to choose a loss function. Examples of such factors are as follows:

Specifics of the given problem—for example, if it is a regression or a classification model
Ease of calculating derivatives
Percentage of outliers in the dataset
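The influence of outliers on the choice can be illustrated with two widely used regression losses, mean squared error (MSE) and mean absolute error (MAE). The chapter does not name specific loss functions at this point, so these two are our choice for the sketch, and both functions assume non-empty vectors of equal length.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Mean squared error: squares each residual, so large errors dominate.
// Assumes both vectors are non-empty and have the same length.
double MeanSquaredError(const std::vector<double>& predicted,
                        const std::vector<double>& target) {
  double sum = 0.0;
  for (std::size_t i = 0; i < predicted.size(); ++i) {
    const double residual = predicted[i] - target[i];
    sum += residual * residual;
  }
  return sum / static_cast<double>(predicted.size());
}

// Mean absolute error: grows linearly with each residual, so a single
// outlier has less influence than under the squared error.
// Assumes both vectors are non-empty and have the same length.
double MeanAbsoluteError(const std::vector<double>& predicted,
                         const std::vector<double>& target) {
  double sum = 0.0;
  for (std::size_t i = 0; i < predicted.size(); ++i)
    sum += std::abs(predicted[i] - target[i]);
  return sum / static_cast<double>(predicted.size());
}
```

For targets 1, 2, 3, 4 and predictions 1, 2, 3, 14, a single error of 10 gives an MSE of 25 but an MAE of only 2.5, which is why the share of outliers in the dataset matters when picking a loss.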

In ML, the term optimizer refers to an algorithm that connects a loss function with a technique for updating model parameters in response to the values of the loss function. So, optimizers tune ML models to predict target values for new data in the most accurate way by fitting model parameters. There are many optimizers: Gradient Descent, Adagrad, RMSProp, Adam, and others. Moreover, developing new optimizers is an active area of research. For example, there is the ML and Optimization research group at Microsoft (located in Redmond) whose research areas include combinatorial optimization, convex and non-convex optimization, and their application in ML and AI. Other companies in the industry also have similar research groups; there are many publications from Facebook Research, Amazon Research, and OpenAI groups.

Customer reviews

Karl Mueller Feb 15, 2023

While not a huge problem, this book really needs the supplied Docker environment for the examples to work properly.

I initially tried to set up the environment myself in my base Linux installation and found that some of the tools used in the book are difficult to find, difficult to compile, etc.

Previously I knew nothing about Docker, but it wasn't difficult to learn and it is a useful system to know.

It does raise the question about how useful some of the tools can be if they can only ever exist properly in the Docker environment provided with the book. Apart from that, I found the book very useful for moving my ML knowledge, developed in MATLAB, across to C++, which is the main language I use for development.

Amazon Verified review

Kindle Customer Dec 24, 2020

While Python normally does the job just fine when it comes to handling ML and more general analytics tasks, I have wanted for a long time to work on these kinds of problems using C++. Unfortunately, it has been very difficult to get started because of a severe lack of educational resources out there. Luckily, this book has finally filled that gap for me.

What I really like about the book is that the author has put together a series of very complete examples for each method being discussed. Every program reads in an actual csv file with the data (as opposed to using some form of random number generation to create a toy example), puts it into the right format to be used with the given implementation of an ML method and then puts together a data set that one can use as output. As someone who has not had much experience with C++ outside a classroom setting, I found this extremely helpful, and it has made the material immediately applicable to my work in real life.

The book covers just the right amount of theory in each chapter as well before diving into the C++ implementation, making the material accessible to developers who are relatively new to data science (which, as I understand, is actually the main target audience).

Robin T. Wernick Feb 08, 2021

Python has hijacked the Machine Learning territory over the last few years since 2014. This leaves the 'C' languages without a comparable foothold in this arena until this book was published. This book covers the gaping void between the 'C' language trained programmers and the Python Machine Language world. It has the same mathematical introduction theory, but counters with a set of code libraries that work with C++.

This book will allow the C++ programmer to expand his programming scope without having to rewrite his entire code base in Python and learn a whole new programming language. Not only will it save enormous amounts of time, but it will also provide and give usage detail for a compatible PyTorch Deep Learning library for C++ code use. Now the high performance world of GPU programming is available with a tensor interface to C++ programmers.

Matthew Emerick Jun 15, 2020

Disclaimer: The publisher asked me to review this book and gave me a review copy. I promise to be 100% honest in how I feel about this book, both the good and the less so.

Personal Background: My first programming language after I started university was C++, followed by C. I'm glad to see that C++ can be used for ML problems, though I do understand that Python can be the easier choice. I try to keep in mind, however, that most if not all Python ML libraries are written in C/C++ to make it run faster.

Overview

To get the most out of this book, I would recommend that you have at least an intermediate competency of C++ and some basic knowledge of machine learning. The former is far more valuable than the latter, in this case, as the author assumes that you know C++. There is no hand holding with the code. However, the author does walk you through ML from the basics to a moderate level.

What I Like:

This book is broken into four overall sections: Overview of Machine Learning, Machine Learning Algorithms, Advanced Examples, and Production and Deployment Challenges. This is an excellent selection of sections that make the overall book better organized. The first section gives a good overview of machine learning (as the title indicates), including a basic understanding of the math involved, data preprocessing, and general rundown of the considerations for choosing which ML technique you should use.

The second section gives all the major ML algorithms that a junior ML developer will need. The book focuses on supervised and unsupervised ML, which is most of what you'll see in a business setting. This section finishes with a chapter on Ensemble Learning, where you use multiple ML algorithms to give you better results. The advanced examples mix and match some other algorithms to give you a basic understanding and a starting point for learning more. The final section looks at model deployment and mobile and cloud considerations.

If you're new to machine learning and wish to use C++, this is a good book for it. Especially valuable are the Further Reading sections at the end of every chapter.

What I Don't Like:

When looking at the code, it was very different from the C++ code I learned nearly two decades ago. With the use of C++17, I faced a steep learning curve to use the code examples. While not a concern in and of itself, the first reference to C++17 I could find is on page 41. As someone who knows and enjoys an older version, this made using the code examples more difficult for me. I understand and agree with using a more recent version of the language, but would have appreciated a warning on the back cover or at least in the preface so that I could do some review first. A book recommendation for learning this version of C++ would have been appreciated as well.

In the first chapter, the author divides machine learning up into two categories: supervised and unsupervised learning. While technically correct, there is a third category that doesn't fit well into either one: reinforcement learning. While I wouldn't expect the author to delve into that niche subfield, it still should have been mentioned.

What I Would Like to See:

I really enjoyed this book. It has much to offer anyone with C++ experience. It is well organized and has much useful information. I am very happy to have it as part of my library. I think that a book from this author about C++ ML from Scratch would be interesting.

Overall, I give this book a 4.9 out of 5. It's an excellent resource.

George Ford Feb 12, 2023

I originally bought this book with the hopes of being able to get a better grasp on machine learning with C++, since the back cover states: "This book makes machine learning with C++ for beginners easy with its example based approach". It starts off reviewing some of the basics of linear algebra... OK. But then in the next chapter, in an attempt to get you familiar with all of the different libraries, you begin loading data using APIs without any background to what those APIs do and then how you would use that data.

The author tries to familiarize you with a bunch of different libraries, without truly ever really describing the details of any of them. The author will write code to accomplish a task with a given library, and then repeat the same with another library. But, in my opinion, this is done without much insight as to why you are doing what you are doing. Just copying code.

The background info on neural networks, although helpful, does not really explain fully how they work, outside of providing the differential equations that are implemented.

It is a really tough read to go from cover to cover, and I don't feel like you really grasp much, since too much is trying to be explained with a bunch of tools, but no focus on any given tool.

I think it would be much better if someone were to focus on one or two tools (xtensor, libtorch, dlib, etc) and approach the subject in that manner. This way, you are familiarizing yourself with the subject as well as the library you are using.

Hands-On Machine Learning with C++: Build, train, and deploy end-to-end machine learning and deep learning pipelines
