Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization

Paper Title

Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization

Authors

Kaidi Cao, Yining Chen, Junwei Lu, Nikos Arechiga, Adrien Gaidon, Tengyu Ma

Abstract

Real-world large-scale datasets are heteroskedastic and imbalanced -- labels have varying levels of uncertainty and label distributions are long-tailed. Heteroskedasticity and imbalance challenge deep learning algorithms due to the difficulty of distinguishing among mislabeled, ambiguous, and rare examples. Addressing heteroskedasticity and imbalance simultaneously is under-explored. We propose a data-dependent regularization technique for heteroskedastic datasets that regularizes different regions of the input space differently. Inspired by the theoretical derivation of the optimal regularization strength in a one-dimensional nonparametric classification setting, our approach adaptively regularizes the data points in higher-uncertainty, lower-density regions more heavily. We test our method on several benchmark tasks, including a real-world heteroskedastic and imbalanced dataset, WebVision. Our experiments corroborate our theory and demonstrate a significant improvement over other methods in noise-robust deep learning.
