A prefix instead of task-specific formats
Raffel et al. (2019) still had one problem to solve: unifying task-specific formats. The idea was to find one input format for every task submitted to the transformer. That way, the model's parameters would be trained on all types of tasks through a single text-to-text format.
The Google T5 team devised a simple solution: adding a prefix to an input sequence. The idea echoes linguistic prefixes. Without the prefix, invented by some long-forgotten genius, languages would need thousands of additional words. For example, without "pre" as a prefix, we would need a separate, unrelated word for each of prepayment, prehistoric, Precambrian, and thousands of other concepts.
Raffel et al. (2019) proposed adding a prefix to an input sequence. A T5 prefix is not just a tag or indicator like [CLS] for classification in some transformer models. Instead, a T5 prefix contains the essence of the task a transformer needs to solve. A prefix conveys meaning, as in the following examples: translate English to German: for translation, summarize: for summarization, and cola sentence: for judging the grammatical acceptability of a sentence (CoLA).
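
To make this concrete, here is a minimal sketch of prefix-driven text-to-text inference. It assumes the Hugging Face transformers library and the publicly released t5-small checkpoint (both are illustrative choices, not prescribed by the T5 paper): the model and the generation call stay identical across tasks, and only the prefix prepended to the input changes.

```python
# A minimal sketch: one text-to-text model, many tasks, switched by prefix.
# Assumes the Hugging Face transformers library and the t5-small checkpoint.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# The prefix carries the essence of the task; the model and the call
# below are the same for every task.
tasks = [
    "translate English to German: The house is wonderful.",
    "summarize: T5 casts every NLP problem as text-to-text, so translation, "
    "summarization, and classification all share one input and output format.",
    "cola sentence: The books is on the table.",  # grammatical acceptability
]

for text in tasks:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Nothing else tells the model which task to perform: the translation, the summary, and the acceptable/unacceptable judgment are all produced as plain output text, which is precisely the unification the prefix was designed to achieve.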