Latent Dirichlet allocation (LDA)
LDA and LDA: unfortunately, two methods in machine learning share the initials LDA: latent Dirichlet allocation, which is a topic modeling method, and linear discriminant analysis, which is a classification method. They are completely unrelated apart from the shared initials, which can be confusing. Scikit-learn has a submodule, sklearn.lda, which implements linear discriminant analysis. At the moment, scikit-learn does not implement latent Dirichlet allocation.
The simplest topic model (on which all others are based) is latent Dirichlet allocation (LDA). The mathematical ideas behind LDA are fairly complex, and we will not go into the details here.
For those who are interested and adventurous enough, the Wikipedia article on the method gives the equations behind these algorithms at the following link:
http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation
However, we can understand the ideas behind it at a high level...