As briefly presented in Chapter 4, Influential Classification Tools, fully convolutional networks (FCNs) are based on the VGG-16 architecture, with the final dense layers replaced by 1 × 1 convolutions. What we did not mention was that these networks are commonly extended with upsampling blocks and used as encoder-decoders. Proposed by Jonathan Long, Evan Shelhamer, and Trevor Darrell from the University of California, Berkeley, the FCN architecture perfectly illustrates the notions developed in the previous subsection:
- How CNNs for feature extraction can be used as efficient encoders
- How their feature maps can then be effectively upsampled and decoded by the operations we just introduced
Indeed, Jonathan Long et al. suggested reusing a pretrained VGG-16 as a feature extractor (refer to Chapter 4, Influential Classification Tools). With its five convolutional blocks, VGG-16 efficiently transforms images into feature maps...
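This encoder-decoder structure can be sketched in a few lines of Keras code. The following is a minimal FCN-32s-style illustration, not the authors' exact implementation: VGG-16 without its dense layers serves as the encoder, a 1 × 1 convolution produces coarse class scores, and a single transposed convolution upsamples them back to the input resolution (the `num_classes` value and input size are assumptions for the example).

```python
import tensorflow as tf

num_classes = 21  # e.g., PASCAL VOC (20 classes + background); an assumption

# VGG-16 without its dense layers, acting as the feature-extracting encoder.
# Its five convolutional blocks downsample the input by a factor of 2^5 = 32.
encoder = tf.keras.applications.VGG16(
    include_top=False,
    weights=None,  # set to 'imagenet' to reuse the pretrained weights
    input_shape=(224, 224, 3))

# 1 x 1 convolution producing per-class scores on the coarse 7 x 7 feature map.
scores = tf.keras.layers.Conv2D(num_classes, kernel_size=1)(encoder.output)

# Transposed convolution upsampling the score map by x32, back to 224 x 224.
upsampled = tf.keras.layers.Conv2DTranspose(
    num_classes, kernel_size=64, strides=32, padding='same')(scores)

fcn_32s = tf.keras.Model(encoder.input, upsampled)
```

The resulting model maps a 224 × 224 image to a 224 × 224 map of per-pixel class scores; the finer FCN-16s and FCN-8s variants additionally fuse the upsampled scores with feature maps from earlier VGG blocks before the final upsampling.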