Paper Title
Feature-domain Adaptive Contrastive Distillation for Efficient Single Image Super-Resolution
Paper Authors
Paper Abstract
Recently, CNN-based SISR models have used large numbers of parameters and high computational cost to achieve better performance, which limits their applicability to resource-constrained devices such as mobile phones. As one way to make such networks efficient, Knowledge Distillation (KD), which transfers a teacher's useful knowledge to a student, is being actively studied. More recently, KD for SISR has utilized Feature Distillation (FD), which minimizes a Euclidean distance loss between the feature maps of the teacher and student networks, but this does not sufficiently consider how to effectively and meaningfully deliver knowledge from the teacher so as to improve student performance under a given network capacity constraint. In this paper, we propose a Feature-domain Adaptive Contrastive Distillation (FACD) method for efficiently training lightweight student SISR networks. We show the limitations of existing FD methods that use a Euclidean distance loss, and propose a feature-domain contrastive loss that lets the student network learn richer information from the teacher's representation in the feature domain. In addition, we propose an adaptive distillation that selectively applies distillation depending on the conditions of the training patches. Experimental results show that student EDSR and RCAN networks trained with the proposed FACD scheme improve not only PSNR performance across all benchmark datasets and scales, but also subjective image quality, compared to conventional FD approaches.
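The abstract does not spell out the loss formulation, but a feature-domain contrastive distillation objective is commonly implemented as an InfoNCE-style loss in which the student's feature embedding is attracted to the matching teacher embedding (positive) and repelled from teacher embeddings of other patches in the batch (negatives). The PyTorch sketch below illustrates that idea under assumptions of our own: the 1x1-conv projection heads (`proj_s`, `proj_t`), the global-average pooling, the temperature value, and the per-patch teacher-vs-student error rule used for the adaptive selection are all hypothetical, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureContrastiveLoss(nn.Module):
    """InfoNCE-style feature-domain contrastive loss (illustrative sketch).

    The student feature of each patch is pulled toward the teacher feature
    of the same patch (positive pair) and pushed away from the teacher
    features of the other patches in the batch (negatives).
    """

    def __init__(self, feat_dim=64, proj_dim=128, temperature=0.1):
        super().__init__()
        # 1x1-conv projection heads (an assumption) map student/teacher
        # features into a shared embedding space before comparison.
        self.proj_s = nn.Conv2d(feat_dim, proj_dim, kernel_size=1)
        self.proj_t = nn.Conv2d(feat_dim, proj_dim, kernel_size=1)
        self.temperature = temperature

    def forward(self, feat_s, feat_t):
        # Global-average-pool each feature map to one embedding per patch,
        # then L2-normalize so dot products become cosine similarities.
        # The teacher branch is detached: it provides targets, not gradients.
        z_s = F.normalize(self.proj_s(feat_s).mean(dim=(2, 3)), dim=1)            # (B, D)
        z_t = F.normalize(self.proj_t(feat_t.detach()).mean(dim=(2, 3)), dim=1)   # (B, D)
        logits = z_s @ z_t.t() / self.temperature  # (B, B) similarity matrix
        # Diagonal entries correspond to matching student/teacher pairs.
        labels = torch.arange(z_s.size(0), device=z_s.device)
        return F.cross_entropy(logits, labels)

def adaptive_distillation_mask(sr_s, sr_t, hr):
    """Hypothetical per-patch selection rule for adaptive distillation:
    distill only on patches where the teacher reconstructs the HR target
    better than the student does. The actual condition used by FACD may
    differ; this is just one plausible 'training-patch condition'."""
    err_s = (sr_s - hr).pow(2).mean(dim=(1, 2, 3))  # per-patch student MSE
    err_t = (sr_t - hr).pow(2).mean(dim=(1, 2, 3))  # per-patch teacher MSE
    return (err_t < err_s).float()  # 1.0 = apply distillation to this patch
```

In training, such a mask would typically gate the per-patch distillation term, so that patches where the teacher offers no advantage fall back to the plain reconstruction objective.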