Summary
Generating text is a complex task. It has practical uses, such as making typing text messages or composing emails easier, as well as creative uses, like generating stories. In this chapter, we covered a character-based RNN model that generates headlines one character at a time and noted that it picked up structure, capitalization, and other conventions of the text quite well. Even though the model was trained on a particular dataset, it showed promise in completing short sentences and partially typed words based on context. We then covered the state-of-the-art GPT-2 model, which is based on the Transformer decoder architecture; the previous chapter covered the Transformer encoder architecture, which is used by BERT.
Text generation has many knobs to tune, such as temperature for reshaping the sampling distribution, and decoding strategies like greedy search, beam search, and Top-K sampling, which trade off the creativity and predictability of the generated text. We saw the impact of these settings...
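To make the effect of these knobs concrete, the following is a minimal sketch of temperature and Top-K sampling applied to a vector of logits. The function name `sample_next_token`, the NumPy-based implementation, and the tiny five-token vocabulary are illustrative assumptions, not the chapter's actual model code.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=0):
    """Pick the next token id from raw logits.

    temperature < 1.0 sharpens the distribution (more predictable output),
    temperature > 1.0 flattens it (more creative output). top_k > 0 restricts
    sampling to the k most likely tokens; top_k == 1 behaves like greedy search.
    """
    logits = np.asarray(logits, dtype=np.float64) / temperature
    if top_k > 0:
        # Mask out everything below the k-th largest logit.
        cutoff = np.sort(logits)[-top_k]
        logits = np.where(logits < cutoff, -np.inf, logits)
    # Softmax over the (possibly truncated) logits.
    probs = np.exp(logits - np.max(logits))
    probs /= probs.sum()
    return np.random.choice(len(probs), p=probs)

# Example with a hypothetical 5-token vocabulary: lower temperature and a
# small top_k concentrate sampling on the most likely tokens.
fake_logits = [2.0, 1.0, 0.5, 0.1, -1.0]
print(sample_next_token(fake_logits, temperature=0.7, top_k=3))
```

Calling the function repeatedly, feeding each sampled token back into the model to produce the next set of logits, yields the generation loop whose behavior these settings control.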