While 3D models of target objects are often available in industrial contexts, it is rare to have a 3D representation of the environments they will be found in (for instance, a 3D model of the industrial plant). Rendered objects or scenes therefore appear isolated, with no proper background. However, as with any other visual content, models that are not trained to deal with background clutter will not perform properly once confronted with real images. It is therefore common for researchers to post-process synthetic images, for instance by merging them with relevant background pictures (replacing the blank background with pixel values from images of related environments).
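As an illustration of this background-merging step, here is a minimal NumPy sketch. It assumes that the renderer also provides a foreground mask (1 where the object is drawn, 0 where the background is blank) and that images are float arrays in [0, 1]. The function names `random_crop` and `composite_on_background` are hypothetical helpers written for this example, not part of any specific rendering library.

```python
import numpy as np


def random_crop(image, height, width, rng):
    """Return a random (height, width) crop of an (H, W, C) image."""
    max_y = image.shape[0] - height
    max_x = image.shape[1] - width
    if max_y < 0 or max_x < 0:
        raise ValueError("Background is smaller than the requested crop.")
    y = rng.integers(0, max_y + 1)
    x = rng.integers(0, max_x + 1)
    return image[y:y + height, x:x + width]


def composite_on_background(render, mask, background):
    """Replace the blank pixels of a rendered image with background pixels.

    render:     (H, W, 3) float array in [0, 1], the synthetic image.
    mask:       (H, W) array, 1 for object pixels and 0 for blank pixels
                (assumed to come from the renderer; soft values also work).
    background: (H, W, 3) float array in [0, 1], e.g. a photo crop.
    """
    if render.shape != background.shape:
        raise ValueError("Render and background must have the same shape.")
    # Add a channel axis so the mask broadcasts over the 3 color channels.
    alpha = mask.astype(render.dtype)[..., np.newaxis]
    return alpha * render + (1.0 - alpha) * background


# Toy usage: a white 2x2 "object" rendered in the center of a 4x4 image,
# pasted onto a crop of a larger random "photo".
rng = np.random.default_rng(0)
render = np.zeros((4, 4, 3), dtype=np.float32)
mask = np.zeros((4, 4), dtype=np.float32)
render[1:3, 1:3] = 1.0
mask[1:3, 1:3] = 1.0
photo = rng.random((8, 8, 3), dtype=np.float32)

background = random_crop(photo, 4, 4, rng)
augmented = composite_on_background(render, mask, background)
```

Picking a different background crop each time the image is sampled exposes the model to varied clutter behind the same rendered object. A soft mask (values between 0 and 1 along the object's edges) would blend the two sources rather than produce a hard cut-out.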
While some augmentation operations could be taken care of by the rendering pipeline (such as brightness changes or motion blur), other 2D transformations are still commonly applied to synthetic data during training. This additional post-processing is once again done to reduce the risk of...