Transformers and transfer learning
A milestone in NLP happened in 2017 with the release of the research paper Attention Is All You Need, by Vaswani et al. (https://arxiv.org/abs/1706.03762), which introduced a brand-new machine learning idea and architecture: transformers. Transformers in NLP are a fresh approach to sequential modeling tasks that targets some problems introduced by the long short-term memory (LSTM) architecture (recall the LSTM architecture from Chapter 8, Text Classification with spaCy). Here's how the paper explains how transformers work:

"The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely."
Transduction in this context means transforming input sequences into output sequences by encoding input words and sentences as vectors. Typically, a transformer is trained on a huge corpus, such as Wikipedia or a news corpus. Then,...
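To make the idea of a pretrained transformer concrete, here is a minimal sketch of loading one and turning a sentence into vectors. It assumes the Hugging Face transformers library and the publicly available bert-base-uncased checkpoint, which are illustrative choices rather than anything prescribed by the paper:

```python
# A minimal sketch: load a transformer pretrained on a huge corpus
# and use it to transform a sentence into contextual vectors.
# Assumes: pip install transformers torch
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentence = "Attention is all you need."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state holds one contextual vector per input token
token_vectors = outputs.last_hidden_state
print(token_vectors.shape)  # e.g., torch.Size([1, 8, 768])
```

Because the heavy lifting of learning these vectors is done once on the huge corpus, downstream tasks can reuse the pretrained weights and fine-tune them on much smaller task-specific datasets, which is the essence of transfer learning.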