An expanding universe of models
New transformer models, like new smartphones, emerge nearly every week. Some of these models are both mind-blowing and challenging for a project manager:
- ERNIE is a continual pretraining framework that produces impressive results for language understanding.
Paper: https://arxiv.org/abs/1907.12412
Challenges: Hugging Face provides a model, but is it the full-blown one? Is it the model Baidu trained to exceed human baselines on the SuperGLUE leaderboard (December 2021, https://super.gluebenchmark.com/leaderboard)? Do we have access to the best version or just a toy model? What is the purpose of running AutoML on such small versions of models? Will we gain access to the full model on the Baidu platform or a similar one? How much will it cost?
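- SWITCH is a trillion-parameter model optimized with sparse modeling: each token is routed to a single expert, so only a fraction of the parameters is active for any given input.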
Paper: https://arxiv.org/abs/2101.03961
Challenges: The paper is fantastic. Where is the model? Will we ever have access...