Throughout this end-to-end project, we have used LDA, one of the most popular TM algorithms for text mining. We could also have used other TM algorithms, such as Probabilistic Latent Semantic Analysis (pLSA), the Pachinko Allocation Model (PAM), and the Hierarchical Dirichlet Process (HDP).
However, pLSA is prone to overfitting. HDP and PAM, on the other hand, are more complex TM algorithms intended for harder text mining tasks, such as mining topics from high-dimensional text data or from collections of unstructured documents. Finally, non-negative matrix factorization (NMF) is another way to find topics in a collection of documents. Irrespective of the approach, the output of every TM algorithm is a list of topics, each with an associated cluster of words.
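To make the NMF idea and the "topics with associated clusters of words" output concrete, here is a minimal sketch in plain Python. It is not the method used in this project. It assumes a tiny hypothetical four-document corpus, raw term counts as the document-term matrix, and the standard multiplicative update rules for the squared-error objective. The helper names (`matmul`, `nmf`, `topics`) are illustrative only.

```python
import random


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Sum of squared differences between V and its approximation W * H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into non-negative W (docs x k) and H (k x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    # Strictly positive random start; multiplicative updates keep entries non-negative.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(k)]
    errors = [squared_error(v, w, h)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_terms)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n_docs)]
        errors.append(squared_error(v, w, h))
    return w, h, errors


# Hypothetical toy corpus, used only for illustration.
docs = [
    "cat dog pet animal",
    "dog pet cat vet",
    "stock market trade price",
    "market price stock invest",
]
vocab = sorted({word for d in docs for word in d.split()})
# Document-term matrix of raw counts (one row per document).
v = [[float(d.split().count(word)) for word in vocab] for d in docs]

w, h, errors = nmf(v, k=2)

# Each row of H is one topic; its largest entries are the topic's top words.
topics = [
    [vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:3]]
    for row in h
]
for i, words in enumerate(topics):
    print(f"Topic {i}: {', '.join(words)}")
```

Each row of `H` plays the role of a topic (a weighting over the vocabulary), and each row of `W` says how strongly a document draws on each topic, which mirrors the topic and word-cluster output described above. On a real corpus one would normally use a library implementation and TF-IDF weighting rather than this hand-rolled loop.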
The previous example shows how to perform TM using the LDA algorithm as a standalone...