In this chapter, we discussed how an SVM works in both linear and non-linear scenarios, starting with the basic mathematical formulation. The main concept is to find the hyperplane that maximizes the margin, that is, the distance between the separating boundary and the closest samples of each class; only this limited subset of samples (called support vectors) lies on the margin and determines the solution.
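As a compact reminder of that formulation (this is the standard hard-margin form; the symbols w, b, x_i, and y_i denote the usual weight vector, bias, samples, and labels, not notation copied verbatim from the chapter):

```latex
\begin{aligned}
\min_{\mathbf{w},\,b} \quad & \frac{1}{2}\,\lVert \mathbf{w} \rVert^{2} \\
\text{subject to} \quad & y_i \left( \mathbf{w}^{T} \mathbf{x}_i + b \right) \ge 1,
\qquad i = 1, \dots, n
\end{aligned}
```

Only the samples for which the constraint holds with equality (the support vectors) constrain the optimum; every other point could be removed without changing the resulting hyperplane.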
We saw how kernel functions make it possible to transform a non-linear problem by implicitly remapping the original space into a higher-dimensional one where the problem becomes linearly separable, without ever computing the projection explicitly (the so-called kernel trick). We also saw how to control the number of support vectors and how to use SVMs for regression problems.
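A minimal sketch of these three ideas in practice, assuming scikit-learn (the dataset and hyperparameter values below are illustrative, not taken from the chapter): SVC with an RBF kernel handles a non-linearly separable problem, NuSVC exposes the nu parameter that bounds the fraction of support vectors, and SVR applies the same machinery to regression.

```python
# A minimal, illustrative sketch (assumes scikit-learn and NumPy;
# dataset and hyperparameter values are examples, not the book's).
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC, NuSVC, SVR

# A non-linearly separable dataset: two concentric circles.
X, y = make_circles(n_samples=300, noise=0.1, factor=0.3, random_state=1)

# An RBF kernel implicitly remaps the samples into a higher-dimensional
# space where the two circles become linearly separable.
svc = SVC(kernel='rbf', gamma='scale').fit(X, y)
print('SVC accuracy: %.3f, support vectors: %d'
      % (svc.score(X, y), len(svc.support_)))

# NuSVC: nu is a lower bound on the fraction of support vectors
# (and an upper bound on the fraction of margin errors).
nu_svc = NuSVC(kernel='rbf', nu=0.2, gamma='scale').fit(X, y)
print('NuSVC support vectors: %d' % len(nu_svc.support_))

# SVR reuses the same kernel machinery for regression, fitting an
# epsilon-insensitive tube around the predictions.
rs = np.random.RandomState(1)
X_reg = np.linspace(0.0, 10.0, 200).reshape(-1, 1)
y_reg = np.sin(X_reg).ravel() + 0.1 * rs.randn(200)
svr = SVR(kernel='rbf', epsilon=0.1).fit(X_reg, y_reg)
print('SVR R^2: %.3f, support vectors: %d'
      % (svr.score(X_reg, y_reg), len(svr.support_)))
```

Note how a larger nu typically yields more support vectors: the parameter trades a simpler (sparser) model for a wider tolerance of margin errors.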
In the next chapter, Chapter 8, Decision Trees and Ensemble Learning, we're going to introduce decision trees, the last classification method covered in this book.