Summary
In this chapter, we discussed how a support vector machine works in both linear and non-linear scenarios, starting from the basic mathematical formulation. The main concept is to find the hyperplane that maximizes the margin between the classes, relying on a limited number of samples (called support vectors) that lie closest to the separating hyperplane.
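As a quick refresher, the following is a minimal sketch of a linear SVM using scikit-learn; the toy dataset and the value of C are illustrative choices, not taken from the chapter:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated Gaussian blobs: a linearly separable toy problem
X, y = make_blobs(n_samples=200, centers=2, cluster_std=1.0, random_state=1)

# C controls the trade-off between margin width and training errors
svc = SVC(kernel='linear', C=1.0)
svc.fit(X, y)

# Only the samples closest to the separating hyperplane become support vectors
print('Support vectors per class:', svc.n_support_)
```

Note how only a small subset of the training samples ends up defining the decision boundary; all the other points could be removed without changing the solution.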
We saw how to transform a non-linear problem using kernel functions, which allow remapping of the original space to another, higher-dimensional one where the problem becomes linearly separable. We also saw how to control the number of support vectors and how to use SVMs for regression problems.
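The next sketch recaps the non-linear, nu-controlled, and regression variants in one place; the concentric-circles dataset and all hyperparameter values are assumptions made for illustration:

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import NuSVC, SVR

# Concentric circles: not linearly separable in the original 2D space
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=1)

# An RBF kernel implicitly remaps the points to a higher-dimensional space;
# nu upper-bounds the fraction of margin errors and lower-bounds the
# fraction of support vectors
nu_svc = NuSVC(kernel='rbf', gamma='scale', nu=0.1)
nu_svc.fit(X, y)
print('Support vectors used:', nu_svc.support_vectors_.shape[0], 'of', len(X))

# Support vector regression: epsilon defines a tube around the prediction
# inside which errors are not penalized
X_reg = np.linspace(0, 6, 100).reshape(-1, 1)
y_reg = np.sin(X_reg).ravel()
svr = SVR(kernel='rbf', C=1.0, epsilon=0.1)
svr.fit(X_reg, y_reg)
print('SVR R^2 score:', svr.score(X_reg, y_reg))
```

Lowering nu shrinks the support-vector set (at the cost of a wider soft margin), while widening epsilon in SVR makes the model tolerate larger deviations before they count as errors.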
In the next chapter, we're going to introduce another classification method called decision trees, the last classification method covered in this book.