R packages for LDA
Two R packages are mainly used to perform LDA on documents: topicmodels, developed by Bettina Grün and Kurt Hornik, and lda, developed by Jonathan Chang. We describe both packages here.
The topicmodels package
The topicmodels package is an interface to the C and C++ code developed by the authors of the papers on LDA and Correlated Topic Models (CTM) (references 7, 8, and 9 in the References section of this chapter). The main function in this package, LDA, is used to fit LDA models. It can be called as follows:
> LDA(X, K, method = "Gibbs", control = NULL, model = NULL, ...)
Here, X is a document-term matrix, which can be generated using the tm package, and K is the number of topics. The method argument selects the algorithm used for fitting. Two methods are supported: Gibbs (Gibbs sampling) and VEM (variational expectation-maximization), which is the package default.
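As a minimal sketch of this call, the following builds a document-term matrix with tm and fits a model by Gibbs sampling. The three toy documents, the choice of two topics, and the seed value are illustrative assumptions and do not come from the text.

```r
library(tm)
library(topicmodels)

# Toy corpus (hypothetical documents, for illustration only)
docs <- c("stocks fell as markets reacted to interest rates",
          "the team won the match after a late goal",
          "central bank raises interest rates as markets watch")
corpus <- VCorpus(VectorSource(docs))

# Document-term matrix of raw term counts, with English stopwords removed
dtm <- DocumentTermMatrix(corpus, control = list(stopwords = TRUE))

# Fit a 2-topic model by Gibbs sampling; the seed makes the run repeatable
fit <- LDA(dtm, k = 2, method = "Gibbs", control = list(seed = 1234))

terms(fit, 3)   # three most probable terms in each topic
topics(fit)     # most likely topic for each document
```

With a corpus this small the topics are not meaningful; the sketch only shows the shape of the inputs and the accessor functions terms and topics.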
Let's do a small example of building LDA models using this package. The dataset used is the Reuter_50_50 dataset from the UCI Machine Learning...