Evaluating and improving throughput
As we’ve previously covered in the book, total job throughput is an important metric to track. On the one hand, you want to keep the batch size small enough to ensure your model trains well. On the other hand, you want to max out your overall job performance to get the most accurate model possible. We learned in Chapter 7 how to use hyperparameter tuning to balance both of those concerns. We also covered other tips and tricks for reducing your graphics processing unit (GPU) memory footprint in Chapter 5 and Chapter 8. Now, let’s close a few more gaps in this area.
First, it’s important to consider how you measure throughput in general terms. You have probably used logging packages in PyTorch that handily report iterations per second during the training loop. Obviously, this is extremely useful for clocking your training speed, but how would you account for the size of the model? What if you wanted to...
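To make the idea concrete, here is a minimal sketch of measuring throughput yourself. It uses a dummy `train_step` function standing in for a real PyTorch forward/backward pass (the function name and parameters here are illustrative, not from any particular library), and shows how raw iterations per second can be converted into a size-aware number such as samples per second by folding in the batch size:

```python
import time

def measure_throughput(train_step, num_iters, batch_size):
    """Time a training loop and report raw and size-aware throughput."""
    start = time.perf_counter()
    for _ in range(num_iters):
        train_step()
    elapsed = time.perf_counter() - start
    iters_per_sec = num_iters / elapsed
    # Iterations/sec alone ignores how much work each step does;
    # multiplying by batch size gives samples/sec, a fairer comparison
    # across different batch-size configurations.
    samples_per_sec = iters_per_sec * batch_size
    return iters_per_sec, samples_per_sec

# Dummy step standing in for a real forward/backward pass.
def dummy_step():
    sum(i * i for i in range(10_000))

ips, sps = measure_throughput(dummy_step, num_iters=50, batch_size=32)
print(f"{ips:.1f} it/s, {sps:.1f} samples/s")
```

The same pattern extends to model size: if you know the approximate floating-point operations per step for your model, dividing those by the elapsed time per step yields a FLOPs-based throughput that is comparable across models of different sizes.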