Creating word clouds
A word cloud is a visual representation of the most common words in a text. Each word is drawn at a size that reflects how often it occurs: frequent words appear larger, and rare words appear smaller. This gives a quick summary of the word distribution in a text and makes the prominent words in text data easy to spot.
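The frequency-to-size idea can be sketched with the standard library alone. The snippet below is only an illustration: the helper name font_sizes, the sample sentence, and the linear scaling between min_size and max_size are assumptions made for this sketch, and dedicated libraries may scale word sizes differently.

```python
from collections import Counter


def font_sizes(text, min_size=10, max_size=60):
    """Map each word to a font size proportional to its frequency.

    Hypothetical helper for illustration; linear scaling is an assumption.
    """
    counts = Counter(text.lower().split())
    if not counts:
        return {}
    top = max(counts.values())  # count of the most frequent word
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


# Made-up sample text: "good" occurs 3 times, "service" twice, the rest once
sizes = font_sizes("good food good service slow service good")
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word}: {size:.1f}")
```

The most frequent word receives max_size, and every other word is scaled down in proportion to its count.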
We will explore how to create word clouds in Python using the wordcloud library and the FreqDist class in nltk.
Getting ready
We will work with the Sentiment Analysis of Restaurant Review dataset for this recipe, using the preprocessed version that was exported from the Performing stemming and lemmatization recipe. This version has been cleaned of stop words, punctuation, contractions, and uppercase letters, and the data has been lemmatized. You can retrieve all the files from the GitHub repository.
Along...