Choosing the cost of a support vector machine
A support vector machine finds the hyperplane that separates the training data with the maximum margin. However, we sometimes want to allow some misclassifications when separating the categories. The SVM model has a cost parameter that controls the trade-off between training errors and margin width. A small cost produces a wide margin (a soft margin) and tolerates more misclassifications. A large cost produces a narrow margin (approaching a hard margin) and permits fewer misclassifications. In this recipe, we illustrate how large and small costs affect the SVM classifier.
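The trade-off between margin width and training errors is commonly written as the soft-margin objective. The symbols used here (w, b, the slack variables \xi_i, the cost C, and the number of training points n) are standard textbook notation, not names taken from this recipe; C plays the role of the cost:

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i\,(w^{\top} x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \quad i = 1, \dots, n
```

Since the margin width is 2 / \lVert w \rVert, a small C makes slack cheap, so the optimizer favors a wide margin and accepts more margin violations. A large C penalizes slack heavily, which narrows the margin and leaves fewer training points misclassified.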
Getting ready
In this recipe, we will use the iris dataset as our example data source.
How to do it...
Perform the following steps to generate two different classification examples with different costs:
- Subset the iris dataset to the columns Sepal.Length, Sepal.Width, and Species, keeping only the setosa and virginica species:
> iris.subset = subset(iris, select...
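As a minimal sketch of this kind of subsetting, assuming base R only and a hypothetical variable name `iris.two` (this is not necessarily the exact call used in this recipe), the rows and columns could be selected as follows:

```r
# Hypothetical sketch: keep two measurement columns plus the label,
# and only the rows for the setosa and virginica species.
# `iris.two` is an illustrative name, not one used by the recipe.
iris.two <- subset(
  iris,
  Species %in% c("setosa", "virginica"),
  select = c(Sepal.Length, Sepal.Width, Species)
)

# Drop the now-empty versicolor factor level so that later models
# see exactly two classes.
iris.two$Species <- droplevels(iris.two$Species)

# Quick checks on the result: 100 rows (50 per species), 3 columns.
dim(iris.two)
table(iris.two$Species)
```

Dropping the unused factor level is a design choice rather than a requirement; without it, `Species` would still list versicolor as a level with zero observations.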