Learning to classify classy answers
In classification, we want to find the corresponding classes, sometimes also called labels, for given data instances. To achieve this, we need to answer the following two questions:
How should we represent the data instances?
Which model or structure should our classifier possess?
Tuning the instance
In our case, the simplest data instance is the text of the answer, and the label is a binary value indicating whether the asker accepted it as the answer. Raw text, however, is an inconvenient representation for most machine learning algorithms: they want numbers. Our task is therefore to extract useful features from the raw text, which the machine learning algorithm can then use to learn the right label.
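To make the idea of turning raw text into numbers concrete, here is a minimal sketch using only the Python standard library. The function name `extract_features`, the three features it computes (link count, word count, character count), and the two sample answers are illustrative assumptions, not the features the text settles on.

```python
import re


def extract_features(answer_text):
    """Turn raw answer text into a small numeric feature vector.

    The features chosen here are hypothetical examples, picked only
    to show the mapping from text to numbers.
    """
    # Count hyperlinks with a deliberately simple pattern.
    num_links = len(re.findall(r"https?://\S+", answer_text))
    # Whitespace-separated tokens serve as a rough word count.
    num_words = len(answer_text.split())
    # Total length of the answer in characters.
    num_chars = len(answer_text)
    return [num_links, num_words, num_chars]


# Hypothetical (text, label) pairs; label 1 means the asker accepted the answer.
samples = [
    ("See http://example.com/docs for details.", 1),
    ("No idea.", 0),
]

# X holds one numeric vector per answer, y the matching binary labels.
X = [extract_features(text) for text, _ in samples]
y = [label for _, label in samples]
```

Lists such as `X` and `y` are the kind of numeric input a learning algorithm can consume. Whether these particular features carry any signal about acceptance is an open question that only experiments on real data can answer.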
Tuning the classifier
Once we have found or collected enough (text, label) pairs, we can train a classifier. For the underlying structure of the classifier, we have a wide range of possibilities, each of...