Paper Title
Improving Performance of Semantic Segmentation CycleGANs by Noise Injection into the Latent Segmentation Space
Paper Authors
Paper Abstract
In recent years, semantic segmentation has benefited from various works in computer vision. Inspired by the very versatile CycleGAN architecture, we combine semantic segmentation with the concept of cycle consistency to enable a multitask training protocol. However, learning is largely prevented by the so-called steganography effect, which manifests itself as watermarks in the latent segmentation domain and makes image reconstruction too easy a task. To combat this, we propose noise injection, based either on quantization noise or on additive Gaussian noise, to avoid this disadvantageous information flow in the cycle architecture. We find that noise injection significantly reduces the generation of watermarks and thus allows the recognition of highly relevant classes such as "traffic signs", which are hardly detected by the ERFNet baseline. We report mIoU and PSNR results on the Cityscapes dataset for semantic segmentation and image reconstruction, respectively. The proposed method achieves an mIoU improvement on the Cityscapes validation set of 5.7% absolute over the same CycleGAN without noise injection, and still 4.9% absolute over the non-cyclic ERFNet baseline.
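The two noise variants named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the uniform quantizer, the `levels` and `sigma` parameters, and the choice to apply the noise to per-pixel class probabilities are all assumptions made for illustration, since the abstract does not specify them. The sketch covers the forward pass only; rounding has zero gradient almost everywhere, so a trainable version would need some gradient approximation that is not shown here.

```python
import numpy as np


def quantize_segmentation(probs, levels=2):
    """Uniformly quantize class probabilities in [0, 1] to `levels` steps.

    Assumed form of "quantization noise": small-amplitude variations that
    could carry a hidden watermark are rounded away.
    """
    probs = np.asarray(probs, dtype=np.float64)
    steps = levels - 1
    return np.round(probs * steps) / steps


def add_gaussian_noise(probs, sigma=0.1, rng=None):
    """Add zero-mean Gaussian noise with standard deviation `sigma`.

    Assumed form of "Gaussian noise addition": the noise masks
    low-amplitude signals in the latent segmentation.
    """
    probs = np.asarray(probs, dtype=np.float64)
    rng = np.random.default_rng() if rng is None else rng
    return probs + rng.normal(0.0, sigma, size=probs.shape)


# Hypothetical latent segmentation: 2 classes on a 1x3 image, with a
# small perturbation standing in for a steganographic watermark.
clean = np.array([[[0.0, 1.0, 1.0]],
                  [[1.0, 0.0, 0.0]]])
watermark = np.array([[[0.02, -0.03, -0.01]],
                      [[-0.02, 0.03, 0.01]]])
latent = clean + watermark

quantized = quantize_segmentation(latent, levels=2)
noisy = add_gaussian_noise(latent, sigma=0.1, rng=np.random.default_rng(0))
```

In this toy case the two-level quantizer maps `latent` back to `clean`, so the perturbation no longer reaches the reconstruction path. The Gaussian variant does not remove the perturbation but buries it under noise of larger amplitude.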