Creating a news topic classifier
The model we are going to create will classify news items from the Reuters newswire classification dataset. It will read the raw text of each news item and assign a label corresponding to the section it belongs to (Sports, Weather, Travel, and so on).
The Reuters newswire dataset contains 11,228 newswires from Reuters, each labeled with one of 46 topics.
The text of each news item is encoded as a list of word indexes: integers that rank each word by its frequency in the dataset. So, here, integer 1 encodes the most frequent word in the data, 2 encodes the second most frequent, and so on.
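To make the frequency-based encoding concrete, here is a minimal sketch on a toy corpus. The helper names (build_word_index, encode) and the three sample sentences are hypothetical and only illustrate the idea; this is not the preprocessing that produced the actual dataset, and real loaders may also reserve some low indexes for special tokens.

```python
from collections import Counter


def build_word_index(texts):
    """Map each word to its frequency rank (1 = most frequent word)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending count; break ties alphabetically so the
    # resulting ranks are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}


def encode(text, word_index):
    """Turn a text into a list of word indexes, skipping unknown words."""
    return [word_index[w] for w in text.lower().split() if w in word_index]


# Toy corpus (an assumption for illustration, not real Reuters data).
toy_corpus = [
    "oil prices rise",
    "oil exports fall",
    "oil prices fall again",
]

word_index = build_word_index(toy_corpus)
encoded = encode("oil prices fall", word_index)
print(word_index)
print(encoded)
```

In this toy example, "oil" appears most often and therefore receives index 1, so a classifier working on the encoded lists sees small integers for common words and larger ones for rare words.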
The notebook that contains the complete source code can be found at https://github.com/PacktPublishing/Automated-Machine-Learning-with-AutoKeras/blob/main/Chapter08/Chapter8_Reuters.ipynb.
Now, let's have a look at the relevant cells of the notebook in detail:
- Installing AutoKeras: As we've mentioned in...