Lemmatization is a more methodical process of converting words to their base form. Where stemming generally just chops off the ends of words, lemmatization performs a morphological analysis of each word, taking its context and part of speech into account to identify the inflected form, and then chooses among different rules to determine the root (the lemma).
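To make the contrast concrete, here is a toy sketch that is not based on any real library. The suffix list, the lookup table, and the names toy_stem and toy_lemmatize are all invented for illustration. It only shows the difference in approach: a stemmer strips endings blindly, while a lemmatizer consults what it knows about the word and its part of speech.

```python
# Toy illustration only: SUFFIXES and LEMMA_TABLE are made up for this sketch
# and are far smaller than anything a real stemmer or lemmatizer would use.
SUFFIXES = ("ing", "ed", "es", "s")

# Maps (word, part of speech) to a base form; 'n' = noun, 'v' = verb, 'a' = adjective.
LEMMA_TABLE = {
    ("meeting", "n"): "meeting",
    ("meeting", "v"): "meet",
    ("better", "a"): "good",
    ("mice", "n"): "mouse",
}


def toy_stem(word):
    """Chop the first matching suffix, with no knowledge of the word itself."""
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word, pos):
    """Look up the (word, part of speech) pair; fall back to the word unchanged."""
    return LEMMA_TABLE.get((word, pos), word)


for word, pos in [("meeting", "n"), ("meeting", "v"), ("better", "a"), ("mice", "n")]:
    print(word, pos, toy_stem(word), toy_lemmatize(word, pos))
```

Note how the toy stemmer gives the same result for "meeting" regardless of how the word is used, whereas the lookup distinguishes the noun from the verb and can also handle irregular forms such as "mice".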
Performing lemmatization
How to do it
Lemmatization is available in NLTK through the WordNetLemmatizer class. This class uses WordNet, a lexical database of English, to make its decisions. The code in the 07/04_lemmatization.py file extends the previous stemming example to also calculate the lemmatization of each word. The code of importance is the following:
from nltk.stem...
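As a rough sketch of the kind of call involved (this is not the content of 07/04_lemmatization.py), the following shows one way WordNetLemmatizer might be used. The helper names load_wordnet_lemmatizer and lemmatize_tagged are hypothetical, and the example assumes that nltk is installed and that the WordNet corpus has already been fetched with nltk.download('wordnet').

```python
def load_wordnet_lemmatizer():
    """Return NLTK's WordNetLemmatizer (needs nltk plus the 'wordnet' corpus)."""
    # Imported lazily so lemmatize_tagged can be used with any compatible object.
    from nltk.stem import WordNetLemmatizer
    return WordNetLemmatizer()


def lemmatize_tagged(tagged_words, lemmatizer):
    """Lemmatize (word, pos) pairs.

    pos is a WordNet part-of-speech tag such as 'n' (noun), 'v' (verb),
    'a' (adjective) or 'r' (adverb). The lemmatizer argument is any object
    with a lemmatize(word, pos=...) method, for example WordNetLemmatizer.
    """
    return [lemmatizer.lemmatize(word, pos=pos) for word, pos in tagged_words]


# Example usage, assuming the corpus is available locally:
#   lemmatizer = load_wordnet_lemmatizer()
#   print(lemmatize_tagged([("cats", "n"), ("running", "v")], lemmatizer))
```

Passing the part of speech explicitly matters because lemmatize treats every word as a noun by default, so verbs and adjectives are often returned unchanged unless they are tagged.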