It is also worth noting that residual blocks contain no more parameters than traditional ones, since the skip connection and element-wise addition require none of their own. They can therefore be used efficiently as building blocks for ultra-deep networks.
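To make this concrete, the following is a minimal sketch of a residual block using the Keras functional API (assuming TensorFlow 2.x; the filter counts and layer choices are illustrative, not taken from the original architecture). Inspecting the model summary shows that every trainable parameter comes from the convolutions, not from the skip connection or the addition.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(inputs, filters=64, kernel_size=3):
    """Two convolutions plus an identity skip; the addition itself has no weights."""
    x = layers.Conv2D(filters, kernel_size, padding='same', activation='relu')(inputs)
    x = layers.Conv2D(filters, kernel_size, padding='same')(x)
    # The skip connection and element-wise addition add no trainable parameters;
    # only the two convolutions above contribute weights.
    x = layers.Add()([x, inputs])
    return layers.Activation('relu')(x)

inputs = tf.keras.Input(shape=(32, 32, 64))
outputs = residual_block(inputs)
model = tf.keras.Model(inputs, outputs)
model.summary()  # the parameter count comes entirely from the Conv2D layers
```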
Besides the 152-layer network applied to the ImageNet challenge, the authors illustrated their contribution by training an impressive 1,202-layer one. They reported no difficulty in training such a massive CNN (although its validation accuracy was slightly lower than that of the 152-layer network, likely because of overfitting).
Other works have also explored residual-style computations to build deeper and more efficient networks, such as Highway networks (which use a trainable gating value to decide, for each block, how much of the signal should flow through the transformed path versus the skip path) or DenseNet models (which add further skip connections between blocks), as sketched below.
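As an illustration of the gating idea behind Highway networks, the following is a minimal, hedged sketch of a highway-style block in Keras (the layer sizes, the use of Dense layers, and the negative gate bias are illustrative assumptions, not the exact published architecture). The learned gate T(x) blends the transformed path H(x) with the identity path x, so the block computes y = T(x) * H(x) + (1 - T(x)) * x.

```python
import tensorflow as tf
from tensorflow.keras import layers, initializers

def highway_block(inputs, units=64):
    # Transformed path H(x):
    transform = layers.Dense(units, activation='relu')(inputs)
    # Gate T(x) in [0, 1]; a negative initial bias favors the identity path
    # early in training (an assumption following the Highway networks idea).
    gate = layers.Dense(units, activation='sigmoid',
                        bias_initializer=initializers.Constant(-1.0))(inputs)
    carry = layers.Lambda(lambda t: 1.0 - t)(gate)
    # Blend the two paths: y = T(x) * H(x) + (1 - T(x)) * x
    return layers.Add()([layers.Multiply()([gate, transform]),
                         layers.Multiply()([carry, inputs])])

x = tf.keras.Input(shape=(64,))
y = highway_block(x)
model = tf.keras.Model(x, y)
```

When the gate saturates toward 0, the block behaves like an identity mapping; when it saturates toward 1, it behaves like a plain fully connected layer, which is what lets very deep stacks of such blocks remain trainable.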