Text Generation and Chatbots
Text generation experienced its most significant breakthrough in 2019, when OpenAI announced GPT-2. This Transformer-based model was able to generate long, coherent passages of text at scale.
GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text. The diversity of the dataset causes this simple goal to contain naturally occurring demonstrations of many tasks across diverse domains. GPT-2 is a direct scale-up of GPT, with more than 10X the parameters, trained on more than 10X the amount of data.
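To make the next-word-prediction objective concrete, here is a minimal sketch of sampling text from the released GPT-2 weights. It assumes the Hugging Face `transformers` library and its hosted `gpt2` checkpoint, neither of which is specified in the original text:

```python
from transformers import pipeline, set_seed

# Fix the random seed so sampled output is reproducible.
set_seed(42)

# Load the publicly released GPT-2 weights via Hugging Face
# `transformers` (an assumed toolkit; the text names no library).
generator = pipeline("text-generation", model="gpt2")

# GPT-2 was trained only to predict the next word given all previous
# words, so generation works by repeatedly sampling the next token.
result = generator(
    "In 2019, text generation models",
    max_length=50,
    num_return_sequences=1,
)
print(result[0]["generated_text"])
```

The same simple objective, applied at scale, is what produces the apparent multi-task behavior described above: completing a prompt coherently often requires implicitly performing tasks such as summarization or question answering.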
The whole of 2019 was full of surprises when it came to text generation models, with NVIDIA's Megatron-LM (8.3 billion parameters) being more than 5 times larger than GPT-2, and Microsoft's Turing-NLG (17 billion parameters, released in February 2020) more than 10 times larger.
We are just beginning to experience...