Densely connected convolutional networks (DenseNet)
DenseNet attacks the vanishing gradient problem with a different approach. Instead of using shortcut connections, the feature maps of all preceding layers become the input of the current layer. The preceding figure shows an example of dense interconnection in one Dense block.
For simplicity, this figure shows only four layers. Notice that the input to layer l is the concatenation of all previous feature maps. If we designate the BN-ReLU-Conv2D sequence as the operation H(x), then the output of layer l is:

$$x_l = H(x_0, x_1, x_2, \dots, x_{l-1})$$ (Equation 2.4.1)
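To make the dense connection concrete, here is a minimal Keras sketch of Equation 2.4.1; the function names H and dense_block are illustrative, not a listing from the book:

```python
from tensorflow.keras.layers import (Activation, BatchNormalization,
                                     Concatenate, Conv2D)

def H(x, k):
    """BN-ReLU-Conv2D: generates k feature maps with a 3x3 kernel."""
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    # 'same' padding keeps spatial dimensions equal so the
    # feature maps of all layers can be concatenated
    return Conv2D(k, kernel_size=3, padding='same')(x)

def dense_block(x, num_layers=4, k=12):
    """The input to layer l is the concatenation of all previous
    feature maps: x_l = H(x_0, x_1, ..., x_{l-1})."""
    features = [x]
    for _ in range(num_layers):
        inputs = features[0] if len(features) == 1 else Concatenate()(features)
        features.append(H(inputs, k))
    return Concatenate()(features)
```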
Conv2D uses a kernel of size 3. The number of feature maps generated per layer is called the growth rate, k. Normally, k = 12, but k = 24 is also used in the paper Densely Connected Convolutional Networks by Huang et al., 2017 [5]. Therefore, if the number of input feature maps is k0, then the total number of feature maps at the input of layer l is k0 + k × (l - 1).
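As a quick sanity check of this count, the following sketch assumes k0 = 24 input feature maps and a growth rate of k = 12 over a four-layer Dense block; the values are chosen only for illustration:

```python
# Illustrative count only: with k0 input maps and growth rate k,
# layer l of a dense block receives k0 + k * (l - 1) feature maps.
k0, k, num_layers = 24, 12, 4  # assumed values for illustration
for l in range(1, num_layers + 1):
    print(f"input feature maps at layer {l}: {k0 + k * (l - 1)}")
# After concatenating the final layer's output as well:
print(f"feature maps at the end of the block: {k0 + k * num_layers}")
```

With these values, the layers see 24, 36, 48, and 60 input feature maps respectively, and the block's concatenated output has 24 + 12 × 4 = 72 feature maps.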