Paper Title

Enhancing Mixup-based Semi-Supervised Learning with Explicit Lipschitz Regularization

Paper Authors

Prashnna Kumar Gyawali, Sandesh Ghimire, Linwei Wang

Paper Abstract

The success of deep learning relies on the availability of large-scale annotated data sets, the acquisition of which can be costly, requiring expert domain knowledge. Semi-supervised learning (SSL) mitigates this challenge by exploiting the behavior of the neural function on large unlabeled data. The smoothness of the neural function is a commonly used assumption exploited in SSL. A successful example is the adoption of mixup strategy in SSL that enforces the global smoothness of the neural function by encouraging it to behave linearly when interpolating between training examples. Despite its empirical success, however, the theoretical underpinning of how mixup regularizes the neural function has not been fully understood. In this paper, we offer a theoretically substantiated proposition that mixup improves the smoothness of the neural function by bounding the Lipschitz constant of the gradient function of the neural networks. We then propose that this can be strengthened by simultaneously constraining the Lipschitz constant of the neural function itself through adversarial Lipschitz regularization, encouraging the neural function to behave linearly while also constraining the slope of this linear function. On three benchmark data sets and one real-world biomedical data set, we demonstrate that this combined regularization results in improved generalization performance of SSL when learning from a small amount of labeled data. We further demonstrate the robustness of the presented method against single-step adversarial attacks. Our code is available at https://github.com/Prasanna1991/Mixup-LR.
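
To make the abstract's two ingredients concrete, here is a minimal PyTorch-style sketch (not the authors' released code) of (i) a mixup consistency term that encourages the network to behave linearly when interpolating between training examples, and (ii) a simple local Lipschitz penalty on the network itself. The function names (`mixup_consistency_loss`, `soft_lipschitz_penalty`) are illustrative assumptions, and the single random-perturbation estimate is a simplification: the paper's adversarial Lipschitz regularization searches for the perturbation adversarially.

```python
import torch
import torch.nn.functional as F


def mixup_consistency_loss(model, x1, x2, alpha=0.75):
    """Encourage linear behavior between examples: the prediction at an
    interpolated input should match the interpolation of the predictions."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_mix = lam * x1 + (1.0 - lam) * x2
    with torch.no_grad():  # interpolated "target" built from current predictions
        p_target = (lam * F.softmax(model(x1), dim=1)
                    + (1.0 - lam) * F.softmax(model(x2), dim=1))
    p_mix = F.softmax(model(x_mix), dim=1)
    return F.mse_loss(p_mix, p_target)


def soft_lipschitz_penalty(model, x, target_k=1.0, eps=1e-2):
    """Penalize an estimate of the local Lipschitz constant of the network
    output w.r.t. its input whenever it exceeds target_k. A single random
    perturbation of norm eps stands in for the adversarial search used in
    adversarial Lipschitz regularization."""
    r = torch.randn_like(x)
    flat_dims = [1] * (x.dim() - 1)
    r = eps * r / (r.flatten(1).norm(dim=1).view(-1, *flat_dims) + 1e-12)
    ratio = (model(x + r) - model(x)).flatten(1).norm(dim=1) / eps
    return F.relu(ratio - target_k).pow(2).mean()
```

In a semi-supervised training loop, these two terms would typically be added, with weighting coefficients, to the supervised loss computed on the small labeled batch, while the unlabeled data supplies the example pairs for mixup and the inputs for the Lipschitz penalty.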
