References
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017, Attention Is All You Need: https://arxiv.org/abs/1706.03762
Hugging Face Transformer Usage: https://huggingface.co/transformers/usage.html
Tensor2Tensor (T2T) Introduction: https://colab.research.google.com/github/tensorflow/tensor2tensor/blob/master/tensor2tensor/notebooks/hello_t2t.ipynb?hl=en
Manuel Romero Notebook with link to explanations by Raimi Karim: https://colab.research.google.com/drive/1rPk3ohrmVclqhH7uQ7qys4oznDdAhpzF
Google language research: https://research.google/teams/language/
Hugging Face research: https://huggingface.co/transformers/index.html
The Annotated Transformer: http://nlp.seas.harvard.edu/2018/04/03/attention.html
Jay Alammar, The Illustrated Transformer: http://jalammar.github.io/illustrated-transformer/