Exploring neural network transformers
Figure 6.1 provides an overview of the impact transformers have had, as reflected in the plethora of transformer model variants that have emerged.
Figure 6.1 – Transformers’ different modality and model branches
The transformer does not have inherent inductive bias structurally designed into its architecture. Inductive bias refers to the prior assumptions a learning algorithm makes about the data. This bias can be built into the model architecture or the learning process, and it helps guide the model toward learning specific patterns or structures in the data. Traditional models incorporate inductive bias through their design: RNNs, for instance, assume that the data has a sequential structure and that the order of elements matters, while CNNs are specifically designed for processing grid-like data, such as images, by incorporating inductive bias in the form of local connectivity and weight sharing, which give them translation equivariance.
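To make this contrast concrete, the following is a minimal PyTorch sketch (illustrative only, not tied to any specific model in this chapter). It shows how a convolutional layer hard-codes local connectivity and weight sharing, while a self-attention layer lets every position attend to every other position and, without positional information, treats its input as an unordered set:

```python
import torch
import torch.nn as nn

# A 3x3 convolution: the inductive bias is built in structurally.
# Each output position only "sees" a local 3x3 neighborhood (local
# connectivity), and the same kernel weights are reused at every
# position (weight sharing, which yields translation equivariance).
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
image = torch.randn(1, 3, 32, 32)        # (batch, channels, height, width)
conv_out = conv(image)                   # shape: (1, 16, 32, 32)

# Multi-head self-attention: no structural bias toward locality or order.
# Every token can attend to every other token; the layer treats its input
# as a set unless positional information is explicitly added.
attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
tokens = torch.randn(1, 10, 64)          # (batch, sequence, embedding)
attn_out, _ = attn(tokens, tokens, tokens)

# Shuffling the token order changes what a convolutional or recurrent
# layer computes, but self-attention simply produces the same outputs in
# the shuffled order -- evidence that no ordering is assumed.
perm = torch.randperm(10)
shuffled_out, _ = attn(tokens[:, perm], tokens[:, perm], tokens[:, perm])
print(torch.allclose(attn_out[:, perm], shuffled_out, atol=1e-5))  # True (up to float noise)
```

This is why transformers rely on externally supplied signals, such as positional encodings, to recover the ordering information that RNNs and CNNs get for free from their structure.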