Using contextualized topic models
In this recipe, we will look at another topic modeling algorithm: contextualized topic models. To produce a more effective topic model, this approach combines contextual embeddings with a bag-of-words document representation.
We will show you how to use the trained topic model with input in other languages. This feature is especially useful because we can train a topic model in one language, for example, one that has many resources available, and then apply it to another language that has fewer resources. To achieve this, we will use a multilingual embedding model to encode the data.
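To make the two document representations concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the corpus is made up, and toy_embedding is a hypothetical stand-in for a real multilingual sentence encoder. It only illustrates why a document in an unseen language has an empty bag of words over the training vocabulary but still receives a dense vector that a model can consume.

```python
from collections import Counter

# Toy corpus; hypothetical data used only for illustration.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
]

# Build a fixed vocabulary from the training documents.
vocab = sorted({word for doc in docs for word in doc.split()})


def bag_of_words(text, vocab):
    """Return word counts aligned with the vocabulary order."""
    counts = Counter(text.split())
    return [counts[word] for word in vocab]


def toy_embedding(text, dim=4):
    """Stand-in for a sentence embedding model (NOT a real encoder).

    It only guarantees a fixed-size dense vector for any input string.
    """
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


# Each training document has both representations.
bow_vectors = [bag_of_words(d, vocab) for d in docs]
embeddings = [toy_embedding(d) for d in docs]

# A document in another language shares no words with the vocabulary,
# so its bag of words is all zeros, but it still gets an embedding.
foreign_doc = "el gato"
foreign_bow = bag_of_words(foreign_doc, vocab)
foreign_embedding = toy_embedding(foreign_doc)

print(vocab)
print(bow_vectors)
print(foreign_bow, len(foreign_embedding))
```

With a real multilingual encoder, a sentence and its translation would be expected to map to nearby vectors, which is what allows a model trained on one language to be applied to another. The toy function above does not have that property.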
Getting ready
We will need the contextualized-topic-models package for this recipe. It is part of the poetry environment and the requirements.txt file.
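If you are unsure whether the package is present in the active environment, a quick check such as the following can help. This is a small sketch that assumes the import name is contextualized_topic_models (the package name with underscores); the helper is_installed is a hypothetical name introduced here for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("contextualized_topic_models"):
    print("contextualized-topic-models is available")
else:
    print("Package not found; try: pip install contextualized-topic-models")
```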
The notebook is located at https://github.com/PacktPublishing/Python-Natural-Language-Processing-Cookbook-Second-Edition/blob/main/Chapter06/6.5-contextualized-tm.ipynb.
How to do it...
In this recipe...