Multiclass classification with support vector machines
As with logistic regression, the basic support vector machine is designed to handle two classes. Of course, we often have situations where we would like to handle a greater number of classes, such as when classifying different plant species based on a variety of physical characteristics. One way to do this is the one versus all approach. Here, if we have K classes, we train K SVM classifiers, and each classifier attempts to distinguish one particular class from all the rest.
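The relabeling behind the one versus all approach can be sketched in a few lines of Python. This is only an illustration: the function name one_vs_all_labels and the species labels are made up for the example. Each of the K binary problems marks one class as +1 and every other class as -1.

```python
def one_vs_all_labels(labels):
    """Build one binary (+1 / -1) target vector per class."""
    classes = sorted(set(labels))
    # For class c, observations of class c become +1 and all others -1.
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


# Hypothetical species labels for four observations (K = 3 classes).
labels = ["fern", "moss", "fern", "ivy"]
binary_targets = one_vs_all_labels(labels)

for species, target in binary_targets.items():
    print(species, target)
```

Each resulting target vector would then be used to train its own two-class SVM on the same features.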
To classify a new observation, we assign it to the class whose classifier places it farthest from the separating hyperplane on the positive side, that is, farthest from the side containing all the other classes. More formally, we evaluate the linear feature combination of each of the K classifiers and pick the class for which it takes the maximum value.
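A minimal sketch of this decision rule in Python follows. It assumes that each of the K classifiers is linear and has already been trained. The weights, intercepts and class names below are invented for illustration and are not fitted to any data.

```python
def decision_value(weights, intercept, x):
    """Linear feature combination w . x + b for one classifier."""
    return sum(w * xi for w, xi in zip(weights, x)) + intercept


def predict_one_vs_all(classifiers, x):
    """Return the class whose classifier gives the largest decision value."""
    return max(classifiers, key=lambda c: decision_value(*classifiers[c], x))


# Hypothetical (weights, intercept) pairs for K = 3 classifiers, two features.
classifiers = {
    "species_a": ([1.0, 0.0], 0.0),
    "species_b": ([-1.0, 0.0], 0.0),
    "species_c": ([0.0, 1.0], -1.0),
}

prediction = predict_one_vs_all(classifiers, [2.0, 0.5])
print(prediction)
```

For the observation [2.0, 0.5] the three decision values are 2.0, -2.0 and -0.5, so the rule selects species_a. Note that a raw decision value equals the distance from the hyperplane only after dividing by the norm of the weight vector. The sketch compares raw values, as the formal rule above describes.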
An alternative approach is known as the (balanced) one versus one...