After introducing a powerful alternative for text feature exaction, we will continue with another great classifier alternative for text data classification, the support vector machine.
In machine learning classification, SVM finds an optimal hyperplane that best segregates observations from different classes. A hyperplane is a plane of n-1 dimension that separates the n dimensional feature space of the observations into two spaces. For example, the hyperplane in a two-dimensional feature space is a line, and a surface in a three-dimensional feature space. The optimal hyperplane is picked so that the distance from its nearest points in each space to itself is maximized. And these nearest points are the so-called support vectors.