The Neural Topic Model (NTM), as we described previously, is a generative document model that produces multiple representations of a document. It generates two outputs:
- The topic mixture for a document
- A list of keywords that explain each topic, for all the topics across an entire corpus (a small sketch of both outputs follows this list)
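To make these two outputs more concrete, here is a minimal, purely illustrative sketch of what they might look like for a tiny three-topic corpus; the variable names and values are hypothetical and are not taken from an actual NTM run:

```python
# Hypothetical example of NTM's two outputs (illustrative values only).

# 1) Topic mixture for a single document: one weight per topic, summing to 1.
topic_mixture = [0.72, 0.05, 0.23]   # this document leans heavily toward topic 0

# 2) Corpus-level topic keywords: for each topic, the words that best explain it.
topic_keywords = {
    0: ["match", "goal", "team", "season", "coach"],
    1: ["election", "vote", "policy", "senate", "campaign"],
    2: ["stock", "market", "earnings", "investor", "trade"],
}
```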
NTM is based on the Variational Autoencoder (VAE) architecture. The following illustration shows how NTM works:
Let's explain this diagram, bit by bit:
- There are two components: an encoder and a decoder. In the encoder, a Multilayer Perceptron (MLP) network takes a bag-of-words representation of each document and creates two vectors: a vector of means and a vector of standard deviations (a minimal sketch of this encoder step follows the list). Intuitively, the mean vector controls where the encoding of the input should be centered, while the standard deviation controls how far the encoding can spread around that center...
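To make the encoder step concrete, the following is a minimal sketch of a VAE-style encoder written with PyTorch, assuming a toy `BowEncoder` class with made-up layer sizes; it is not the actual NTM implementation, only an illustration of how an MLP can map a bag-of-words vector to a mean vector and a standard deviation vector, and then sample an encoding around that center:

```python
import torch
import torch.nn as nn


class BowEncoder(nn.Module):
    """Toy VAE-style encoder: bag-of-words -> (sample, mean, std) of a latent topic vector."""

    def __init__(self, vocab_size: int, hidden_size: int, num_topics: int):
        super().__init__()
        # MLP that maps the bag-of-words counts to a hidden representation
        self.mlp = nn.Sequential(
            nn.Linear(vocab_size, hidden_size),
            nn.Softplus(),
        )
        # Two linear heads: one for the mean vector, one for the log-variance
        self.mean_head = nn.Linear(hidden_size, num_topics)
        self.logvar_head = nn.Linear(hidden_size, num_topics)

    def forward(self, bow: torch.Tensor):
        hidden = self.mlp(bow)
        mean = self.mean_head(hidden)                      # where the encoding is centered
        std = torch.exp(0.5 * self.logvar_head(hidden))    # how far it spreads around the center
        # Reparameterization trick: sample an encoding around the mean,
        # with the spread controlled by the standard deviation.
        z = mean + std * torch.randn_like(std)
        return z, mean, std


# Example: encode a batch of 2 documents over a 5,000-word vocabulary into 20 topics.
encoder = BowEncoder(vocab_size=5000, hidden_size=256, num_topics=20)
bow_batch = torch.rand(2, 5000)   # stand-in for bag-of-words counts
z, mean, std = encoder(bow_batch)
print(z.shape)                    # torch.Size([2, 20])
```

The sampled vector `z` is what the decoder would then use to reconstruct the document's word distribution.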