Modern convolutional neural networks can be huge. For example, pre-trained networks from the ResNet family range from about 100 to 1,000 layers deep and take from 138 MB to 0.5 GB in Torch data format. Deploying them to mobile or embedded devices can be problematic, especially if your app requires several models for different tasks. CNNs are also computationally heavy, and in some settings (real-time video analysis, for example) can drain a device's battery in no time — actually, much faster than it took to write this chapter's intro. But why are these networks so big, and why do they consume so much energy? And how do we fix this without sacrificing accuracy?
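A back-of-the-envelope calculation shows where the bulk of the size comes from: the learned weights. The sketch below is illustrative only; it assumes every parameter is stored as a 32-bit float and uses the commonly cited figure of roughly 25.6 million parameters for ResNet-50 (the helper name `weight_size_mb` is our own, not from any library):

```python
# Rough estimate of the on-disk size of a CNN's weights,
# assuming each parameter is stored as a 32-bit (4-byte) float.
def weight_size_mb(num_params: int, bytes_per_param: int = 4) -> float:
    return num_params * bytes_per_param / 1e6

# ResNet-50 has roughly 25.6 million parameters.
print(f"{weight_size_mb(25_600_000):.1f} MB")  # about 100 MB for the weights alone
```

Batch-norm statistics, optimizer state, and serialization overhead push the real file size higher still, which is why reducing parameter count or bits per parameter is where most of the savings in this chapter will come from.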
As we've already discussed speed optimization in the previous chapter, in this chapter we concentrate on memory consumption. We specifically focus on deep neural networks...