ELMo, BERT, and GPT
In this section, we will explain several classic NLP models that are still widely used today, namely ELMo, BERT, and GPT.
Before we dive into these complicated model structures, we will first illustrate the basic concept of a Recurrent Neural Network (RNN) and how it works. Then, we will move on to transformers. This section will cover the following topics:
- Basic concepts
- RNN
- ELMo
- BERT
- GPT
We will start by introducing RNNs.
Basic concepts
Here, we will dive into the world of RNNs. At a high level, unlike a CNN, an RNN usually needs to maintain state from previous inputs. This state acts much like memory does for human beings.
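To make this concrete, here is a minimal sketch of a single RNN cell in NumPy. All names and sizes (`W_xh`, `W_hh`, `hidden_size`, and so on) are illustrative assumptions, not part of any specific library; the point is only that the hidden state `h` carries information forward from one timestep to the next:

```python
import numpy as np

# Hypothetical sizes for illustration only.
rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4

# Randomly initialized weights: input-to-hidden, hidden-to-hidden, and a bias.
W_xh = rng.normal(size=(hidden_size, input_size)) * 0.1
W_hh = rng.normal(size=(hidden_size, hidden_size)) * 0.1
b_h = np.zeros(hidden_size)

def rnn_step(x, h):
    # The new state depends on the current input x AND the previous state h;
    # this recurrence is what gives the RNN its "memory".
    return np.tanh(W_xh @ x + W_hh @ h + b_h)

h = np.zeros(hidden_size)  # initial memory is empty
sequence = [rng.normal(size=input_size) for _ in range(5)]
for x in sequence:
    h = rnn_step(x, h)     # the same cell is applied at every timestep

print(h.shape)             # the final state summarizes the whole sequence
```

A CNN applied to a single image has no such recurrence: each input is processed independently, which is exactly the contrast drawn above.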
We will illustrate what we mean with the following examples:
As shown in the preceding figure, one-to-one is a typical problem format in the computer vision domain. For example, given a CNN model, we input an image as Input 1, as shown in Figure...
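The one-to-one format can be summarized with array shapes alone. The shapes below are hypothetical examples, not tied to any particular dataset or model: a single fixed-size input maps to a single fixed-size output, with no sequence dimension involved.

```python
import numpy as np

# one-to-one: one fixed-size input, one fixed-size output,
# e.g. an image classifier built with a CNN.
image = np.zeros((224, 224, 3))  # a single RGB image (hypothetical size)
class_scores = np.zeros(1000)    # a single vector of class scores

# There is no time/sequence axis here: each image is handled
# independently, so no state needs to be carried between inputs.
print(image.ndim, class_scores.ndim)
```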