TPU performance
Discussing performance is always tricky: we first need to define the metrics we are going to measure and the set of workloads we are going to use as benchmarks. For instance, Google reported impressive linear scaling for ResNet-50 training on TPU v2 [4] (see Figure 7).
Figure 7: Training throughput (images per second) scales linearly with the number of TPU v2 devices
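To give a rough sense of how this kind of scaling is exposed to the programmer, the following sketch distributes ResNet-50 training across all available TPU cores with TensorFlow's TPUStrategy. This is a minimal sketch, not the setup used in Google's benchmark: it assumes a TensorFlow 2.x environment with access to a Cloud TPU, the batch size is illustrative, and in early TF 2 releases the strategy lives under tf.distribute.experimental.TPUStrategy instead.

```python
import tensorflow as tf

# Connect to the TPU cluster. An empty "tpu" argument works in Colab-style
# environments; on Google Cloud you would pass the TPU name or address.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="")
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)

# TPUStrategy replicates the model on every TPU core; each training step
# splits the global batch among cores and aggregates gradients, which is
# the mechanism behind the near-linear throughput scaling.
strategy = tf.distribute.TPUStrategy(resolver)
print("Replicas (TPU cores):", strategy.num_replicas_in_sync)

# The model and optimizer must be built inside the strategy scope.
with strategy.scope():
    model = tf.keras.applications.ResNet50(weights=None, classes=1000)
    model.compile(optimizer="sgd",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# Keep the per-core batch fixed and grow the global batch with the number
# of cores, so each core stays equally busy as the pod gets larger.
PER_CORE_BATCH = 128  # illustrative value, not from the benchmark
global_batch = PER_CORE_BATCH * strategy.num_replicas_in_sync
# model.fit(dataset.batch(global_batch), epochs=...)  # dataset not shown
```

Note that the only scaling-specific decision in this sketch is multiplying the per-core batch size by num_replicas_in_sync; the replication and gradient aggregation are handled entirely by the strategy.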
In addition, you can find online a comparison [4] where a full Cloud TPU v2 Pod is >200x faster than a single NVIDIA Tesla V100 GPU for ResNet-50 training:
Figure 8: A full Cloud TPU v2 Pod is >200x faster than a single NVIDIA Tesla V100 GPU for training a ResNet-50 model
In December 2018, the MLPerf initiative was announced. MLPerf [5] is a broad ML benchmark suite created by a large group of companies, with the goal of measuring the performance of ML frameworks, ML accelerators, and ML cloud platforms.