So far, we have looked at two classical classifiers, namely the decision tree and the nearest neighbor classifier. Scikit-learn supports many more, but it does not support everything that has ever been proposed in academic literature. Thus, one may be left wondering: which one should I use? Is it even important to learn about all of them?
In many cases, knowledge of your dataset may help you decide which classifier has a structure that best matches your problem. However, there is a very good study by Manuel Fernández-Delgado and his colleagues titled, Do we Need Hundreds of Classifiers to Solve Real World Classification Problems? This is a very readable, very practically-oriented study, where the authors conclude that there is actually one classifier which is very likely to be the best (or close to the best) for a majority of problems, namely random...