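PSPNet: Full-Resolution Residual Networks (FRRNs) are computationally intensive, and running them on full-scale images is slow. The Pyramid Scene Parsing Network (PSPNet) was introduced to deal with this problem. Its pyramid pooling module applies four pooling operations to the feature map, each with a different window size and stride. In the paper these reduce the map to 1×1, 2×2, 3×3, and 6×6 grids, and the authors report that average pooling works better than max pooling. Pooling at several scales lets the network gather feature information ranging from whole-image context down to local regions efficiently.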
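To make the multi-scale pooling idea concrete, here is a minimal pure-Python sketch on a single-channel feature map. It is an illustration under stated assumptions, not the authors' implementation: the function names (adaptive_pool, upsample_nearest, pyramid_pool) are hypothetical, the bin sizes (1, 2, 3, 6) and the default average pooling follow the PSPNet paper, nearest-neighbor upsampling stands in for the bilinear interpolation used in the paper, and the 1×1 convolutions that follow each pooling branch are omitted.

```python
from statistics import mean


def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def adaptive_pool(feature_map, bins, reduce=mean):
    """Pool a 2D feature map (list of rows) down to a bins x bins grid.

    Window i covers rows floor(i*H/bins) .. ceil((i+1)*H/bins) - 1,
    so window size and stride adapt to the requested grid.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(bins):
        r0, r1 = (i * height) // bins, ceil_div((i + 1) * height, bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * width) // bins, ceil_div((j + 1) * width, bins)
            window = [feature_map[r][c]
                      for r in range(r0, r1) for c in range(c0, c1)]
            row.append(reduce(window))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled, height, width):
    """Stretch a bins x bins grid back to height x width (nearest neighbor)."""
    bins = len(pooled)
    return [[pooled[r * bins // height][c * bins // width]
             for c in range(width)]
            for r in range(height)]


def pyramid_pool(feature_map, bin_sizes=(1, 2, 3, 6), reduce=mean):
    """Return the input map plus one upsampled map per pyramid level.

    In a real network these maps would be concatenated along the
    channel axis and passed to the final prediction layers.
    """
    height, width = len(feature_map), len(feature_map[0])
    branches = [feature_map]
    for bins in bin_sizes:
        pooled = adaptive_pool(feature_map, bins, reduce)
        branches.append(upsample_nearest(pooled, height, width))
    return branches
```

Calling pyramid_pool(feature_map) on a 6×6 map returns five 6×6 maps: the original plus one per pyramid level. Passing reduce=max switches every level to max pooling, which makes it easy to compare the two pooling choices.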
PSPNet achieved state-of-the-art performance on several datasets. It became popular after taking first place in the ImageNet Scene Parsing Challenge 2016. It also set records on two benchmarks, with a mean intersection over union (mIoU) of 85.4% on PASCAL VOC 2012 and 80.2% on Cityscapes. The following is a link to the relevant paper: https://arxiv.org/pdf/1612.01105.
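For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how it can be computed. For each class, IoU = TP / (TP + FP + FN), and mIoU is the average over classes. The function name mean_iou and the flat label lists are assumptions made for illustration; benchmark servers accumulate these counts over the whole test set and may handle ignored labels differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for two flat sequences of class labels."""
    ious = []
    for cls in range(num_classes):
        # Pixels where both prediction and ground truth are this class (TP).
        intersection = sum(1 for p, t in zip(pred, target)
                           if p == cls and t == cls)
        # Pixels where either one is this class (TP + FP + FN).
        union = sum(1 for p, t in zip(pred, target)
                    if p == cls or t == cls)
        if union:  # skip classes that appear in neither sequence
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else float("nan")
```

With pred = [0, 0, 1, 1] and target = [0, 1, 1, 1], class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mIoU is 7/12, or about 58.3%.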
The following diagram shows the architecture of PSPNet:
Fig 8.5: PSPNet architecture
Check out https://hszhao.github.io/projects/pspnet/ to find out more about the PSPNet architecture...