Gensim [2] is arguably the most popular freely available topic modeling toolkit, and because it is written in Python, it fits right into our ecosystem. Its popularity comes from its wide variety of topic modeling algorithms, straightforward API, and active community. We have already introduced Gensim in Chapter 4, Gensim - Vectorizing Text and Transformations and n-grams, where we covered vector spaces. We need to know how to set up our corpus for the topic modeling algorithms we will be using, so now is a good time to brush up on the Vector transformation in Gensim section of that chapter.
All done? Now we can start using the powerful tools that Gensim has to offer. The Jupyter notebook [7] runs us through the same corpus-generating techniques we previously...