The background of the Transformer
In this section, we will go through the background of NLP that led to the Transformer. The Transformer model, invented by Google Research, has toppled decades of Natural Language Processing research, development, and implementations.
Let us first see how NLP reached a critical limit that required a new approach.
Over the past 100+ years, many great minds have worked on sequence transduction and language modeling. Machines progressively learned how to predict probable sequences of words. It would take a whole book to cite all the giants who made this happen.
In this section, I will share some of my favorite researchers with you to lay the groundwork for the arrival of the Transformer.
In the early 20th century, Andrey Markov introduced the concept of random values and created a theory of stochastic processes. We know them in artificial intelligence (AI) as Markov Decision Processes (MDPs), Markov Chains, and Markov Processes....