Text generation is one of the major applications of RNN models in NLP. An RNN model is trained on the sequences of text and then used to generate the sequences of text by providing a seed text as input. Let's try that on the text8 dataset.
Let's load the text8 dataset and print the first 100 words:
from datasetslib.text8 import Text8
text8 = Text8()
# downloads data, converts words to ids, converts files to a list of ids
text8.load_data()
print(' '.join([text8.id2word[x_i] for x_i in text8.part['train'][0:100]]))
We get the following output:
anarchism originated as a term of abuse first used against early working class radicals including the diggers of the english revolution and the sans culottes of the french revolution whilst the term is still used in a pejorative way to describe any act that...