Let's start with the simple and beautiful nearest-neighbor method from Chapter 2, Classifying with Real-world Examples. Although it is not as advanced as other methods, it is very powerful: because it is not model-based, it can LEARN nearly any data. This beauty comes with a clear disadvantage, however, which we will find out very soon (and which is why we had to capitalize LEARN in the previous sentence).
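To make "not model-based" concrete, here is a minimal pure-Python sketch of nearest-neighbor classification. It is an illustration only, not the code used in this chapter: the function name knn_predict, the toy data, and the choice of Euclidean distance with k=3 are all assumptions made for the example, and in practice a library implementation would normally be used instead.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # There is no fitting step: the stored training data is the whole "model".
    distances = [
        (math.dist(x, query), label)  # Euclidean distance (Python 3.8+)
        for x, label in zip(train_X, train_y)
    ]
    # Sort by distance only, so labels never need to be comparable.
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two numeric features per sample, two classes.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = [0, 0, 0, 1, 1, 1]

prediction_near_origin = knn_predict(train_X, train_y, (1.1, 1.0))
prediction_far = knn_predict(train_X, train_y, (5.1, 5.0))
print(prediction_near_origin, prediction_far)
```

Note that every prediction scans the full training set. The function also expects numeric feature tuples, so raw text cannot be passed to it as is.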
Creating our first classifier
Engineering the features
As mentioned earlier, we will use the Text and Score features to train our classifier. The problem with Text is that the classifier cannot work with raw strings directly, so we will have to convert it into one or more numbers. So, what statistics could be useful to extract from a...