Technical requirements
We will leverage the pandas, matplotlib, seaborn, nltk, scikit-learn, gensim, and pyLDAvis libraries in Python for this chapter. The code and notebooks for this chapter are available on GitHub at https://github.com/PacktPublishing/Exploratory-Data-Analysis-with-Python-Cookbook.
Across many of the recipes, we will use nltk, a widely used library for text analysis tasks such as text cleaning, stemming, lemmatization, and sentiment analysis. It provides a suite of easy-to-use text-processing modules for these tasks. To install nltk, run the following command:
pip install nltk
To perform various tasks, nltk also requires some resources to be downloaded. The following code achieves this:
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('averaged_perceptron_tagger')
nltk.download('vader_lexicon')
nltk.download('...