Stemming and lemmatization
In language, inflection is the way grammatical categories such as tense, mood, or gender are expressed by modifying a common root word. This often involves changing the prefix or suffix of a word, but it can also involve changing the entire word. For example, we can modify a verb to change its tense:
Run -> Runs (Add "s" suffix to form the third-person singular present tense)
Run -> Ran (Change the middle vowel to "a" to make it past tense)
But in some cases, the whole word changes:
To be -> Is (Present tense)
To be -> Was (Past tense)
To be -> Will be (Future tense – addition of modal)
Nouns can be inflected too:
Cat -> Cats (Plural)
Cat -> Cat's (Possessive)
Cat -> Cats' (Plural possessive)
All of these words relate back to the root word "cat". We can compute the root of every word in a sentence to reduce the whole sentence to its lexical roots:
...
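To make the idea of reducing inflected forms to a shared root concrete, here is a minimal sketch in plain Python. It is a hand-written toy and not the algorithm used by any real stemming or lemmatization library: the names IRREGULAR, toy_lemma, and reduce_sentence are hypothetical, and the rules cover only the examples discussed above. Irregular forms such as "ran" and "was" are resolved by dictionary lookup, while regular forms such as "runs" and "cats'" are handled by stripping suffixes.

```python
# Toy illustration only: hypothetical names, hand-picked rules.
# Real stemmers and lemmatizers use far larger rule sets and vocabularies.

# Irregular forms cannot be recovered by suffix rules, so we look them up.
IRREGULAR = {
    "ran": "run",
    "is": "be",
    "was": "be",
}


def toy_lemma(word: str) -> str:
    """Return an approximate root for a single word."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    # Strip possessive markers: cat's -> cat, cats' -> cats
    if w.endswith("'s"):
        w = w[:-2]
    elif w.endswith("s'"):
        w = w[:-1]
    # Strip a trailing "s" (plural or third-person singular): cats -> cat.
    # The length check is a crude guard that leaves short words untouched.
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        w = w[:-1]
    return w


def reduce_sentence(sentence: str) -> list:
    """Split on whitespace and reduce each token to its approximate root."""
    return [toy_lemma(token) for token in sentence.split()]


print(reduce_sentence("The cats ran"))
```

Rules this crude break quickly on real text (for example, they would truncate "this" to "thi"), which is why practical systems rely on established stemming algorithms or dictionary-backed lemmatizers rather than a handful of suffix checks.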