In this chapter, we first expanded our knowledge of text feature extraction by introducing an advanced technique called term frequency-inverse document frequency (tf-idf). We then continued classifying news data with the support vector machine (SVM) classifier, covering the mechanics of SVM, kernel techniques, and SVM implementations, along with other important concepts in machine learning classification, including multiclass classification strategies and grid search, as well as practical tips for using SVM (for example, choosing between kernels and tuning parameters). Finally, we applied what we learned to two practical cases: news topic classification and fetal state classification.
We have learned and applied two classification algorithms so far: naive Bayes and SVM. Naive Bayes is a simple algorithm. For a dataset with independent features, naive Bayes will usually perform well. SVM is versatile to adapt...