Getting Started with the Architecture of the Transformer Model
Language is the essence of human communication. Civilizations would never have been born without the word sequences that form language. We now mostly live in a world of digital representations of language. Our daily lives rely on digitized language functions powered by NLP: web search engines, email, social networks, posts, tweets, smartphone texting, translation, web pages, speech-to-text for transcripts on streaming sites, text-to-speech on hotline services, and many more.
Chapter 1, What are Transformers?, explained the limits of RNNs and the rise of cloud AI transformers, which are taking over a fair share of design and development work. The role of the Industry 4.0 developer is to understand the architecture of the original Transformer and the multiple transformer ecosystems that followed.
In December 2017, Google Brain and Google Research published the seminal paper by Vaswani et al., Attention Is All You Need...