References
[1] N. F. Liu et al., “Lost in the Middle: How Language Models Use Long Contexts.” arXiv, Nov. 20, 2023. doi: 10.48550/arXiv.2307.03172.
[2] L. Berglund et al., “The Reversal Curse: LLMs trained on ‘A is B’ fail to learn ‘B is A.’” arXiv, Sep. 22, 2023. doi: 10.48550/arXiv.2309.12288.
[3] A. Vaswani et al., “Attention Is All You Need.” arXiv, Jun. 2017. doi: 10.48550/arXiv.1706.03762.