The main aim of text classification is to sort text documents into classes, and it is a core analysis technique in NLP. We will use a technique based on a statistic called tf-idf, which stands for term frequency-inverse document frequency. It measures how important a word is to a document within a set of documents: the weight rises with how often the word occurs in that document and falls with how many documents in the set contain it. The tf-idf weights of a document's words form the feature vector that is used to categorize the document.
Building a text classifier
Getting ready
In this recipe, we will use the term frequency-inverse document frequency (tf-idf) method to measure how important a word is to a document in a collection, or corpus, and then use the resulting weights to build a text classifier.
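To make the statistic concrete before building the classifier, here is a minimal sketch of tf-idf in plain Python. It assumes the simplest textbook form (relative term frequency multiplied by the natural log of the number of documents over the document frequency) and whitespace tokenization. The function name `tf_idf` and the three sample sentences are made up for illustration; library implementations typically add smoothing and normalization, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (unsmoothed tf-idf)."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # tf = share of the document taken up by the term
            # idf = log(number of documents / documents containing the term)
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, not taken from the recipe's dataset.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks fell on the news",
]
weights = tf_idf(docs)

# Print the terms of the first document, highest weight first.
for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:>6}  {weight:.3f}")
```

A word such as "the", which appears in every document, gets a weight of zero, while a word that appears in only one document, such as "mat", gets the highest weight for that document. This is the property that makes the weights useful as classification features.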