Before the introduction of the Inception module, most CNN architectures followed a standard configuration: convolution, normalization, max-pooling, and activation layers stacked in series, followed by a fully connected layer and a softmax classifier (a sketch of this pattern follows the list below). Improving accuracy with this recipe meant making the network ever deeper, and the growing depth brought two major drawbacks:
- Overfitting
- Increased computation time
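To make the pattern concrete, here is a minimal PyTorch sketch of that pre-Inception stacking style. The layer sizes and the 224x224 input assumption are illustrative choices of ours, not taken from any particular paper:

```python
import torch
import torch.nn as nn

# A minimal sketch of the classic "stacked in series" CNN pattern:
# conv -> norm -> activation -> pool, repeated, then FC + softmax.
# Layer sizes are illustrative; assumes 3x224x224 input images.
classic_cnn = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),    # convolution
    nn.BatchNorm2d(64),                            # normalization
    nn.ReLU(),                                     # activation
    nn.MaxPool2d(2),                               # max pooling (224 -> 112)
    nn.Conv2d(64, 128, kernel_size=3, padding=1),  # going deeper = more of the same
    nn.BatchNorm2d(128),
    nn.ReLU(),
    nn.MaxPool2d(2),                               # 112 -> 56
    nn.Flatten(),
    nn.Linear(128 * 56 * 56, 1000),                # fully connected
    nn.Softmax(dim=1),                             # softmax classifier
)

logits = classic_cnn(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 1000])
```

Every extra block in this chain adds parameters (more overfitting risk) and multiply-accumulate operations (more computation time), which is exactly the scaling problem described above.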
The Inception model addressed both issues by moving from a uniformly dense network toward a sparse connection structure, then clustering the correlated units so they can be covered by dense submatrices, which modern hardware computes efficiently. In practice, this clustering takes the form of an Inception module: parallel dense convolutions of different sizes whose outputs are concatenated, as the sketch below shows.
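Below is a minimal PyTorch sketch of this idea, following the dimension-reduced Inception module from Going Deeper with Convolutions. The class name `InceptionModule` is our own, ReLU activations are omitted for brevity, and the example channel counts are those commonly quoted for GoogLeNet's first Inception block:

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Sketch of a dimension-reduced Inception module.

    Four parallel branches act as the dense "submatrices": 1x1, 3x3, and
    5x5 convolutions plus a pooled branch. The 1x1 convolutions placed
    before the 3x3/5x5 filters reduce channel depth, which is what keeps
    the computation cheap despite the parallel filter banks.
    """

    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        self.b3 = nn.Sequential(
            nn.Conv2d(in_ch, c3_red, kernel_size=1),          # reduce depth
            nn.Conv2d(c3_red, c3, kernel_size=3, padding=1),
        )
        self.b5 = nn.Sequential(
            nn.Conv2d(in_ch, c5_red, kernel_size=1),          # reduce depth
            nn.Conv2d(c5_red, c5, kernel_size=5, padding=2),
        )
        self.bp = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, kernel_size=1),
        )

    def forward(self, x):
        # Every branch preserves spatial size, so the outputs can be
        # concatenated along the channel dimension.
        return torch.cat(
            [self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1
        )

# Example with channel counts as in GoogLeNet's first Inception block.
block = InceptionModule(192, c1=64, c3_red=96, c3=128, c5_red=16, c5=32, pool_proj=32)
out = block(torch.randn(1, 192, 28, 28))
print(out.shape)  # torch.Size([1, 256, 28, 28]); 64 + 128 + 32 + 32 channels
```

Instead of committing to one filter size per layer, the module lets the network learn which of the concatenated branches to rely on, while the 1x1 reductions keep the parameter count and computation in check.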
The Inception model is also known as GoogLeNet. It was introduced by Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich in the paper Going Deeper with Convolutions. The name Inception comes from the paper Network in Network by Min Lin, Qiang Chen, and Shuicheng Yan.