LDA topic modeling with gensim
In the previous section, we saw how to create an LDA model with the sklearn
package. In this recipe, we will create an LDA model using the gensim
package.
Getting ready
We will be using the gensim package, which can be installed using the following command:
pip install gensim
How to do it…
We will load the data, clean it, preprocess it in a similar fashion to the previous recipe, and then create the LDA model. The steps for this recipe are as follows:
- Perform the necessary imports:
import re
import pandas as pd
from gensim.models.ldamodel import LdaModel
import gensim.corpora as corpora
from gensim.utils import simple_preprocess
import matplotlib.pyplot as plt
from pprint import pprint
from Chapter06.lda_topic import stopwords, bbc_dataset, clean_data
- Define the function that will preprocess the data. It uses the clean_data function from the previous recipe:
def preprocess(df):
    df = clean_data(df...