Word stemming
In some NLP tasks, we need to stem words, that is, remove suffixes and endings such as -ing and -ed. This recipe shows how to do that.
Getting ready
For this recipe, we will use NLTK and its Snowball Stemmer.
How to do it…
We will load the NLTK Snowball Stemmer and use it to stem words:
- Import the NLTK Snowball Stemmer:
from nltk.stem.snowball import SnowballStemmer
- Initialize stemmer with English:
stemmer = SnowballStemmer('english')
- Initialize a list with words to stem:
words = ['leaf', 'leaves', 'booking', 'writing', 'completed', 'stemming', 'skies']
- Stem the words:
stemmed_words = [stemmer.stem(word) for word in words]
The result will be as follows:
['leaf', 'leav', 'book', 'write', 'complet', 'stem', 'sky']
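Putting the steps together, a minimal sketch of the whole recipe as a single script might look like the following. The only addition to the steps above is the final loop, which prints each word next to its stem and is included purely for illustration:

```python
from nltk.stem.snowball import SnowballStemmer

# Create a Snowball stemmer for English
stemmer = SnowballStemmer('english')

# The same words used in the steps above
words = ['leaf', 'leaves', 'booking', 'writing', 'completed', 'stemming', 'skies']

# Stem every word in the list
stemmed_words = [stemmer.stem(word) for word in words]

# Illustration only: show each original word next to its stem
for word, stem in zip(words, stemmed_words):
    print(f'{word} -> {stem}')
```

Note that a stem is not always a dictionary word: leaves becomes leav and completed becomes complet. This is expected, because the stemmer strips endings by rule rather than looking words up.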