As discussed earlier, the goal of topic modeling is to identify patterns in a text corpus that correspond to document topics. In this example, we will use a dataset originating from BBC News. This dataset is one of the standard benchmarks in machine-learning research, and is available for non-commercial and research purposes.
The goal is to build a classifier that is able to assign a topic to an uncategorized document.