TSMixer
While Transformer-based models were forging ahead at full steam, a parallel track of research explored using Multi-Layer Perceptrons (MLPs) instead of Transformers as the key learning unit. The trend kicked off in 2021 when MLP-Mixer showed that one could attain state-of-the-art performance on vision problems using just MLPs in place of Convolutional Neural Networks. Soon, similar mixer architectures with MLPs as the key learning component started popping up across domains. In 2023, Si-An Chen et al. from Google brought mixing MLPs to time series forecasting.
Reference check:
The research paper on TSMixer by Chen et al. is cited in the References section as reference 19.
The architecture of the TSMixer model
TSMixer takes inspiration from the Transformer model but tries to replicate similar processes with MLPs. Let’s use Figure 16.12 to understand the similarities and differences.
Figure 16.12: Transformer vs TSMixer
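To make the mixing idea concrete, here is a minimal PyTorch sketch of one mixer block: a time-mixing MLP applied across the time dimension, followed by a feature-mixing MLP applied across the feature dimension, each wrapped in a residual connection. The class name, layer sizes, and the use of LayerNorm are illustrative assumptions for this sketch, not the exact configuration from the TSMixer paper.

```python
import torch
import torch.nn as nn


class MixerBlock(nn.Module):
    """Illustrative sketch of one mixer block: time mixing, then feature mixing,
    each with a residual connection. Sizes and normalization are assumptions."""

    def __init__(self, seq_len: int, n_features: int,
                 hidden_dim: int = 64, dropout: float = 0.1):
        super().__init__()
        self.norm_time = nn.LayerNorm(n_features)
        self.norm_feat = nn.LayerNorm(n_features)
        # Time mixing: a linear layer that mixes information across time steps
        self.time_mlp = nn.Sequential(
            nn.Linear(seq_len, seq_len), nn.ReLU(), nn.Dropout(dropout)
        )
        # Feature mixing: a two-layer MLP that mixes information across channels
        self.feat_mlp = nn.Sequential(
            nn.Linear(n_features, hidden_dim), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(hidden_dim, n_features), nn.Dropout(dropout),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, seq_len, n_features)
        y = self.norm_time(x)
        # Transpose so the linear layer acts along the time axis, then transpose back
        y = self.time_mlp(y.transpose(1, 2)).transpose(1, 2)
        x = x + y                                   # residual over time mixing
        x = x + self.feat_mlp(self.norm_feat(x))    # residual over feature mixing
        return x


# Quick usage check with hypothetical dimensions
block = MixerBlock(seq_len=96, n_features=7)
out = block(torch.randn(32, 96, 7))  # -> shape (32, 96, 7)
```

Stacking several such blocks and adding a final projection along the time axis gives the overall flavor of the architecture, which the figure contrasts with the Transformer's attention-based mixing.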
...