Profiling MNIST model inference using PyTorch Profiler
Profiling code means analyzing its performance in terms of space (memory) and time complexity: it gives us a breakdown of the time and memory consumed by the various sub-modules or functions called within the code. When we run inference with a PyTorch deep learning model, a series of such function calls is made to produce the output (y) from the input (X). In this section, we will learn how to profile PyTorch model inference using the PyTorch Profiler.
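Before walking through the full exercise, here is a minimal sketch of what such a profiling run can look like with torch.profiler. The small Sequential model is a hypothetical stand-in for the trained MNIST model used later in this section; only the profiler calls themselves are the point here.

```python
import torch
import torch.nn as nn
from torch.profiler import profile, record_function, ProfilerActivity

# Hypothetical stand-in for the trained MNIST model (substitute your own).
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 28 * 28, 10),
)
model.eval()

# A dummy MNIST-shaped input batch: 1 image, 1 channel, 28x28 pixels.
X = torch.randn(1, 1, 28, 28)

# Profile CPU time and memory consumption for a single forward pass.
with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
    with record_function("model_inference"):
        with torch.no_grad():
            y = model(X)

# Print the internal operations sorted by total CPU time.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```

The record_function context labels the whole forward pass so it shows up as a single named entry in the report, alongside the individual operator-level rows (convolution, linear, ReLU, and so on).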
We will run inference with the MNIST model that was trained in Chapter 1, Overview of Deep Learning Using PyTorch [13], and deployed in Chapter 13, Operationalizing PyTorch Models into Production [14]. First, we will run model inference on a CPU and profile it to examine the CPU time and memory consumed by its various internal operations. Next, we will run model inference on a GPU and repeat the profiling exercise, as sketched below. Finally, we...
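The GPU variant of the sketch above only needs the CUDA activity added to the profiler so that kernel times appear in the report; model and input names carry over from the earlier snippet and remain illustrative stand-ins.

```python
# GPU variant: include CUDA activity so kernel-level timings are recorded.
if torch.cuda.is_available():
    model_gpu = model.to("cuda")
    X_gpu = X.to("cuda")
    with profile(
        activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
        profile_memory=True,
    ) as prof:
        with record_function("model_inference_gpu"):
            with torch.no_grad():
                y = model_gpu(X_gpu)
    # Sort by total CUDA time to surface the most expensive GPU kernels.
    print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```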