DenseNets (Densely Connected Convolutional Networks, https://arxiv.org/abs/1608.06993) try to alleviate the vanishing gradient problem and improve feature propagation, while reducing the number of network parameters. We've already seen how ResNets introduce residual blocks with skip connections to solve this. DenseNets take inspiration from this idea and push it even further with the introduction of dense blocks. A dense block consists of sequential convolutional layers, where every layer has a direct connection to all subsequent layers. In other words, a network layer, $l$, receives the feature maps of all preceding network layers as input and produces the output $x_l$:

$$x_l = H_l([x_0, x_1, \ldots, x_{l-1}])$$
Here, $[x_0, x_1, \ldots, x_{l-1}]$ are the concatenated output feature maps of the preceding network layers. This is unlike ResNets, where we combine different layers with an element-wise sum. $H_l$ is a composite function of three consecutive operations: batch normalization, followed by a ReLU activation and a 3×3 convolution.
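
To make the dense connectivity pattern concrete, here is a minimal sketch of a dense block in PyTorch (an assumed framework choice; the class names `DenseLayer` and `DenseBlock` and the `growth_rate` parameter are illustrative, not taken from the original text). Each layer implements $H_l$ as BN-ReLU-Conv and receives the channel-wise concatenation of all preceding feature maps:

```python
import torch
import torch.nn as nn


class DenseLayer(nn.Module):
    """One layer H_l: batch normalization -> ReLU -> 3x3 convolution."""

    def __init__(self, in_channels: int, growth_rate: int):
        super().__init__()
        self.norm = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(in_channels, growth_rate,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(self.relu(self.norm(x)))


class DenseBlock(nn.Module):
    """Stack of layers where layer l receives the concatenation [x_0, ..., x_{l-1}]."""

    def __init__(self, num_layers: int, in_channels: int, growth_rate: int):
        super().__init__()
        # Each layer's input grows by growth_rate channels per preceding layer
        self.layers = nn.ModuleList(
            DenseLayer(in_channels + i * growth_rate, growth_rate)
            for i in range(num_layers)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]
        for layer in self.layers:
            # Concatenate all preceding feature maps along the channel dimension
            out = layer(torch.cat(features, dim=1))
            features.append(out)
        return torch.cat(features, dim=1)


# Usage: a 4-layer dense block over a batch of 3-channel 32x32 images
block = DenseBlock(num_layers=4, in_channels=3, growth_rate=12)
y = block(torch.randn(1, 3, 32, 32))
print(y.shape)  # torch.Size([1, 51, 32, 32]): 3 + 4 * 12 output channels
```

Note how the number of output channels grows linearly with the number of layers (the growth rate), which is why the full DenseNet architecture inserts transition layers between dense blocks to keep the feature map count manageable.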