When we deal with a text document, we encounter different forms of the same word. Consider the word play: it can appear as play, plays, player, playing, and so on. These forms make up a family of words with similar meanings. During text analysis, it's useful to reduce them to a common base form, so that statistics such as word counts treat the whole family as a single term. This is the goal of stemming. A stemmer uses a heuristic process that cuts off the ends of words in order to extract the base form.
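To make the idea of cutting off the ends of words concrete, here is a minimal sketch of a suffix-stripping heuristic in plain Python. It is a toy written for illustration only, not the algorithm used by any real stemmer: the function name toy_stem, the suffix list, and the min_stem threshold are all assumptions chosen to fit the play example.

```python
# A toy suffix-stripping stemmer, for illustration only.
# The suffix list and the minimum stem length are assumptions made for
# this sketch; real stemmers apply far more elaborate rule sets.

def toy_stem(word, suffixes=("ing", "ers", "er", "ed", "s"), min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem characters."""
    word = word.lower()
    for suffix in suffixes:
        # Only strip when enough of the word is left to act as a stem
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[:-len(suffix)]
    return word

words = ["play", "plays", "player", "playing", "players"]
stems = [toy_stem(w) for w in words]

# Every form in the family is reduced to the same base form
for w, s in zip(words, stems):
    print(w, "=>", s)
```

Because the rules only look at word endings, a heuristic like this can also cut too much or too little, and the stem it returns is not guaranteed to be a dictionary word.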
Stemming text data
Getting ready
In this recipe, we will use the nltk.stem package, which offers a processing interface for removing morphological affixes from...