Optimization solution 6 – Token Merging (ToMe)
Token Merging (ToMe) was first proposed by Bolya et al. [3]. It is a technique for speeding up inference in Stable Diffusion models. ToMe merges redundant tokens, so the model has fewer tokens to process than it would without merging. This can yield a noticeable speedup with little loss of image quality.
ToMe first identifies redundant tokens by measuring the similarity between tokens. If two tokens are very similar, they probably carry redundant information. Once such a pair has been identified, it is merged into a single token by averaging the values of the two tokens.
For example, if a model processes 100 tokens and 50 of them are redundant, merging each redundant token into its most similar counterpart leaves 50 tokens, which reduces the number of tokens the model has to process by 50%.
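The merging step described above can be illustrated with a small, self-contained sketch. It is not the reference ToMe implementation: the function names `cosine` and `merge_tokens`, the use of plain Python lists as tokens, and the choice of cosine similarity as the similarity measure are all assumptions made for illustration. The sketch splits the tokens into two alternating groups, matches each token in the first group to its most similar token in the second, and averages the `r` best-matching pairs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_tokens(tokens, r):
    """Illustrative merge: fold up to r tokens into their most similar partners.

    Hypothetical helper, not a library API. Token order is not preserved.
    """
    src = tokens[0::2]  # tokens that may be merged away
    dst = tokens[1::2]  # tokens that absorb the merged ones
    if not dst:
        return [list(t) for t in tokens]
    r = min(r, len(src))

    # For every source token, find its most similar destination token.
    best = []
    for i, s in enumerate(src):
        j = max(range(len(dst)), key=lambda k: cosine(s, dst[k]))
        best.append((cosine(s, dst[j]), i, j))

    # Keep only the r most similar (source, destination) pairs.
    best.sort(reverse=True)
    merged_away = {i: j for _, i, j in best[:r]}

    # Average each destination token with the source tokens merged into it.
    sums = [list(d) for d in dst]
    counts = [1] * len(dst)
    for i, j in merged_away.items():
        sums[j] = [a + b for a, b in zip(sums[j], src[i])]
        counts[j] += 1

    kept_src = [list(s) for i, s in enumerate(src) if i not in merged_away]
    merged_dst = [[x / c for x in s] for s, c in zip(sums, counts)]
    return kept_src + merged_dst


# Four tokens forming two near-duplicate pairs; merging r=2 leaves two tokens.
tokens = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
print(len(merge_tokens(tokens, r=2)))
```

Because each source token can be merged at most once, this sketch can remove at most half of the tokens in one pass, which corresponds to the 100-to-50 example above when `r = 50`. A practical implementation would operate on batched tensors and would also need a way to restore the original token count after the merged tokens have been processed, which this sketch omits.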
ToMe can be used with any Stable Diffusion model. It does not require any additional...