Weight sharing – reducing the number of distinct weight values
Weight sharing, also known as weight clustering, is another technique that can significantly reduce the size of a model. The idea behind this technique is rather simple: cluster the weights into groups (or clusters) and replace each weight with the centroid of the cluster it belongs to. Instead of storing a distinct value for every weight, we only need to store the small set of centroid values plus, for each weight, an index into that set; an index requires far fewer bits than a full floating-point value. Therefore, we can compress the model size significantly and possibly speed up the inference process. The key idea behind weight sharing is graphically presented in Figure 10.2 (adapted from the official TF blog post on the weight clustering API: https://blog.tensorflow.org/2020/08/tensorflow-model-optimization-toolkit-weight-clustering-api.html):
Figure 10.2 – An illustration of weight sharing
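Before turning to the framework APIs, the following minimal sketch may help make the mechanics concrete. It is an illustration only, not the TF workflow covered next: it clusters the weights of a single matrix into 16 groups with k-means (using NumPy and scikit-learn, which are assumptions of this sketch), replaces each weight with its cluster centroid, and estimates the resulting compression:

```python
# A minimal, illustrative sketch of weight clustering with k-means;
# the TF and PyTorch APIs shown later handle this for real models.
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for a dense layer's weight matrix
weights = np.random.randn(256, 256).astype(np.float32)

n_clusters = 16  # 16 centroids -> each index fits in 4 bits
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
kmeans.fit(weights.reshape(-1, 1))

centroids = kmeans.cluster_centers_.astype(np.float32).ravel()  # shared lookup table
indices = kmeans.labels_.reshape(weights.shape)                 # per-weight index

# Reconstruct the (approximate) weights from centroids + indices
shared_weights = centroids[indices]

original_bits = weights.size * 32
clustered_bits = weights.size * 4 + centroids.size * 32
print(f"compression ratio: {original_bits / clustered_bits:.1f}x")
print(f"mean absolute error: {np.abs(weights - shared_weights).mean():.4f}")
```

With 16 clusters, each weight is represented by a 4-bit index plus a shared table of 16 float32 centroids, which is where the roughly 8x size reduction in this sketch comes from; the mean absolute error shows the approximation cost of sharing values.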
Let’s learn how to perform weight sharing in TF before looking at how to do the same in PyTorch.