Linear Support Vector Machines (SVM)
A Support Vector Machine (SVM) is a supervised machine learning algorithm that can be used for both classification and regression. However, it is more commonly applied to classification problems, and since Spark offers it as an SVM classifier, we will limit our discussion to the classification setting. Unlike logistic regression, an SVM classifier is non-probabilistic: it assigns each observation to a class without estimating the probability of class membership.
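To make the contrast with logistic regression concrete, here is a minimal sketch in plain Python rather than Spark code. The weights, bias, and data point are hypothetical values chosen by hand, not fitted to any dataset, and the function names are illustrative only. Both models compute the same linear score; the SVM-style prediction keeps only its sign, while the logistic-style prediction maps it to a probability.

```python
import math

def linear_score(weights, bias, features):
    """Signed score w.x + b; its sign says which side of the hyperplane the point is on."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def svm_predict(weights, bias, features):
    """Linear SVM style prediction: a hard class label only, no probability."""
    return 1 if linear_score(weights, bias, features) >= 0 else 0

def logistic_predict_proba(weights, bias, features):
    """Logistic regression style prediction: estimated probability of class 1."""
    return 1.0 / (1.0 + math.exp(-linear_score(weights, bias, features)))

# Hypothetical, hand-picked parameters (assumed for illustration, not learned).
weights, bias = [2.0, -1.0], 0.5
point = [1.0, 0.5]

score = linear_score(weights, bias, point)            # 2.0*1.0 - 1.0*0.5 + 0.5
label = svm_predict(weights, bias, point)             # class label from the sign
proba = logistic_predict_proba(weights, bias, point)  # sigmoid of the same score

print(score, label, round(proba, 4))
```

The score can still be read as a measure of how far the point lies from the boundary, but it is not a calibrated probability.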
The SVM evolved from a simple classifier called the maximal margin classifier. Because the maximal margin classifier requires that the classes be separable by a linear boundary, it cannot be applied to many datasets. It was therefore extended to an improved version, the support vector classifier, which can handle cases where the classes overlap and there is no clear separation between them. The support vector classifier was in turn extended to what we call an SVM, which accommodates non-linear class boundaries. Let us discuss...