Topic Modeling with Latent Dirichlet Allocation (LDA)
Amazon Comprehend can determine the subjects, or common themes, of a set of documents. For example, suppose you run a movie review website with two message boards, and you want to know which message board is discussing each of two newly released movies (one about sport and the other about a political topic). You can provide the text from each message board to Amazon Comprehend to discover the most prominent topics discussed on it.
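As a rough illustration of submitting message board text, the sketch below builds a request for the boto3 Comprehend client's start_topics_detection_job call. The bucket paths, role ARN, job name, region, and the two helper functions are hypothetical placeholders, and the sketch assumes the posts are stored in Amazon S3 with one post per line.

```python
def build_topics_job_request(input_s3_uri, output_s3_uri, role_arn, num_topics=10):
    """Assemble the keyword arguments for a topic detection job."""
    return {
        "JobName": "message-board-topics",  # hypothetical job name
        "NumberOfTopics": num_topics,
        "InputDataConfig": {
            "S3Uri": input_s3_uri,
            # Assumption: each line of the input file is one post.
            "InputFormat": "ONE_DOC_PER_LINE",
        },
        "OutputDataConfig": {"S3Uri": output_s3_uri},
        # IAM role that lets Comprehend read the input and write the output.
        "DataAccessRoleArn": role_arn,
    }


def start_topics_job(request, region="us-east-1"):
    """Submit the job; requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the request builder works without the SDK

    client = boto3.client("comprehend", region_name=region)
    response = client.start_topics_detection_job(**request)
    return response["JobId"]


# Placeholder values for illustration only.
request = build_topics_job_request(
    "s3://example-bucket/board-a/posts.txt",
    "s3://example-bucket/board-a/topics/",
    "arn:aws:iam::123456789012:role/ExampleComprehendRole",
)
```

Topic detection runs as an asynchronous job, so calling start_topics_job(request) would return a job ID rather than the topics themselves. The results are written to the output location once the job completes. Running one job per message board keeps the two boards' topics separate.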
The machine learning algorithm that Amazon Comprehend uses for topic modeling is called Latent Dirichlet Allocation (LDA). LDA is an unsupervised learning model: it needs no labeled examples to determine the most important topics in a collection of documents.
LDA treats every document as a mixture of topics and associates each word in the document with one of these topics.
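The mixture idea can be made concrete with a small sketch of the generative story that LDA assumes: draw a topic mixture for the document, then for each word pick a topic from that mixture and a word from that topic. The two topics, their word probabilities, and the alpha values below are made up for illustration. This is not how Amazon Comprehend implements the algorithm, and a real LDA fit runs the process in reverse, inferring the topics from observed words.

```python
import random

# Hypothetical topics: each one is a probability distribution over words.
TOPICS = {
    "food": {"eat": 0.4, "chicken": 0.3, "restaurant": 0.3},
    "sport": {"match": 0.4, "team": 0.3, "goal": 0.3},
}


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(n_words, alpha, rng):
    """Generate (word, topic) pairs following the LDA generative story."""
    names = list(TOPICS)
    # Step 1: the document's own mixture over topics.
    mixture = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(n_words):
        # Step 2: pick one topic for this word position.
        topic = rng.choices(names, weights=mixture)[0]
        # Step 3: pick a word from that topic's distribution.
        vocab = TOPICS[topic]
        word = rng.choices(list(vocab), weights=list(vocab.values()))[0]
        words.append((word, topic))
    return dict(zip(names, mixture)), words


rng = random.Random(0)  # fixed seed so repeated runs give the same document
mixture, words = generate_document(8, [0.5, 0.5], rng)
print(mixture)
print(words)
```

Each generated word carries exactly one topic label, while the document as a whole is described by the mixture weights, which sum to 1.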
For example, if the first paragraph of a document consists of words such as eat, chicken, restaurant...