BART is another interesting model introduced by Facebook AI. Based on the transformer architecture, BART is essentially a denoising autoencoder: it is trained by corrupting text and learning to reconstruct the original text from the corrupted version.
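To make the denoising objective concrete, here is a minimal sketch of span-style text corruption in plain Python. It assumes simple whitespace tokenization and a `<mask>` placeholder token; the function name `corrupt` and the span lengths are illustrative choices, not BART's exact text-infilling procedure (which samples span lengths from a Poisson distribution):

```python
import random

def corrupt(tokens, mask_token="<mask>", mask_prob=0.3, seed=0):
    """Replace random short spans of tokens with a single mask token,
    mimicking (in simplified form) BART's text-infilling corruption."""
    rng = random.Random(seed)
    out = []
    i = 0
    while i < len(tokens):
        if rng.random() < mask_prob:
            span = rng.randint(1, 3)   # drop a span of 1-3 tokens...
            out.append(mask_token)     # ...and emit a single <mask>
            i += span
        else:
            out.append(tokens[i])
            i += 1
    return out

original = "the quick brown fox jumps over the lazy dog".split()
corrupted = corrupt(original)
# A training pair is then (corrupted text as encoder input,
# original text as the decoder's reconstruction target).
```

During pre-training, the model sees the corrupted sequence as input and is optimized to regenerate the original sequence, which is what makes BART a denoising autoencoder.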
Just like BERT, we can take the pre-trained BART model and fine-tune it for several downstream tasks. BART is particularly well suited to text generation, and it is also used for tasks such as language translation and comprehension. The researchers have shown that BART's performance is comparable to that of RoBERTa. But how exactly does BART work? What's special about BART? How does it differ from BERT? Let's find out the answers to all these questions in the next section.
Architecture of BART
BART is essentially a transformer model with both an encoder and a decoder. We feed corrupted text to the encoder, which learns a representation of the given text and passes that representation to the decoder. The decoder takes...