An SVM (support vector machine) is a learner for binary classification (and regression) that separates examples of the two class labels with the decision boundary that maximizes the margin between the two classes.
Let's return to our example of positive and negative data samples, each of which has exactly two features (x and y), and consider two possible decision boundaries, as follows:
Both of these decision boundaries get the job done: each separates the positive samples from the negative ones with zero misclassifications. However, one of them seems intuitively better. How can we quantify "better" and thus learn the best parameter settings?
This is where SVMs come into the picture. SVMs are also called maximal margin classifiers because they do exactly that: they place the decision boundary so that the two clouds of + and - samples are as far apart as possible.
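The intuition above can be made concrete with a small sketch. The snippet below uses hypothetical 2D samples and two hypothetical candidate boundaries (the specific coordinates and line parameters are illustrative, not from the text): both boundaries classify every sample correctly, but we can quantify "better" as the margin, i.e. the distance from the boundary to the closest sample, and compare.

```python
# Hypothetical 2D samples: (x, y, label), with labels +1 (positive) and -1 (negative).
samples = [
    (4.0, 4.0, +1), (5.0, 3.5, +1), (4.5, 5.0, +1),   # positive cloud
    (1.0, 1.0, -1), (2.0, 1.5, -1), (1.5, 2.0, -1),   # negative cloud
]

def separates(w1, w2, b):
    """True if every sample lies on the side of the line w1*x + w2*y + b = 0
    that matches its label (i.e. zero misclassifications)."""
    return all(label * (w1 * x + w2 * y + b) > 0 for x, y, label in samples)

def margin(w1, w2, b):
    """Distance from the line w1*x + w2*y + b = 0 to the closest sample."""
    norm = (w1**2 + w2**2) ** 0.5
    return min(abs(w1 * x + w2 * y + b) / norm for x, y, _ in samples)

# Two candidate boundaries (hypothetical parameters).
boundary_a = (1.0, 1.0, -5.0)   # x + y = 5: hugs the negative cloud
boundary_b = (1.0, 1.0, -6.0)   # x + y = 6: sits roughly midway between the clouds

for name, line in [("A", boundary_a), ("B", boundary_b)]:
    print(f"boundary {name}: separates={separates(*line)}, margin={margin(*line):.3f}")
```

Both boundaries report `separates=True`, yet boundary B has the larger margin, which is precisely the sense in which the SVM would prefer it: among all separating boundaries, it picks the one maximizing this minimum distance.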