Paper Title

A Framework of Learning Through Empirical Gain Maximization

Paper Authors

Yunlong Feng, Qiang Wu

Paper Abstract

We develop in this paper a framework of empirical gain maximization (EGM) to address the robust regression problem where heavy-tailed noise or outliers may be present in the response variable. The idea of EGM is to approximate the density function of the noise distribution instead of approximating the truth function directly, as is usual. Unlike classical maximum likelihood estimation, which assigns equal importance to all observations and can be problematic in the presence of abnormal observations, EGM schemes can be interpreted from a minimum distance estimation viewpoint and allow such observations to be ignored. Furthermore, it is shown that several well-known robust nonconvex regression paradigms, such as Tukey regression and truncated least squares regression, can be reformulated within this new framework. We then develop a learning theory for EGM, by means of which a unified analysis can be conducted for these well-established but not fully understood regression approaches. The new framework also yields a novel interpretation of existing bounded nonconvex loss functions. Within this framework, two seemingly unrelated notions, the well-known Tukey's biweight loss for robust regression and the triweight kernel for nonparametric smoothing, turn out to be closely related. More precisely, it is shown that Tukey's biweight loss can be derived from the triweight kernel. Similarly, other bounded nonconvex loss functions frequently employed in machine learning, such as the truncated square loss, the Geman-McClure loss, and the exponential squared loss, can also be derived from certain smoothing kernels in statistics. In addition, the new framework enables us to devise new bounded nonconvex loss functions for robust learning.
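To make the kernel-to-loss connection concrete, here is a short worked reconstruction (ours, built only from the standard definitions of the two objects, not quoted from the paper): taking the triweight kernel as the gain function and measuring the loss as the gap between the kernel's peak and its value at the scaled residual recovers Tukey's biweight loss, up to the conventional c²/6 normalization.

```latex
% Our reconstruction: Tukey's biweight loss as a "gain gap"
% of the triweight kernel.
\[
  K(u) \;=\; \tfrac{35}{32}\,\bigl(1 - u^{2}\bigr)^{3}\,
  \mathbb{1}_{\{|u| \le 1\}}
  \qquad \text{(triweight kernel)}
\]
\[
  \rho_c(t)
  \;=\; \frac{16c^{2}}{105}\,\bigl(K(0) - K(t/c)\bigr)
  \;=\;
  \begin{cases}
    \dfrac{c^{2}}{6}\Bigl[1 - \bigl(1 - (t/c)^{2}\bigr)^{3}\Bigr],
      & |t| \le c,\\[1.5ex]
    \dfrac{c^{2}}{6}, & |t| > c,
  \end{cases}
\]
```

The same recipe applied to other kernels produces the other losses the abstract mentions: the Epanechnikov kernel gives the truncated square loss, and the Gaussian kernel gives the exponential squared loss 1 − e^{−t²} in exactly the same way.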
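And here is a minimal, hypothetical sketch of an EGM-style robust fit, assuming a linear model, the triweight kernel as the gain function, and plain gradient ascent; the paper does not prescribe this optimizer, and all names below (e.g. egm_linear_regression) are illustrative.

```python
import numpy as np

def triweight(u):
    """Triweight kernel: K(u) = (35/32) * (1 - u^2)^3 for |u| <= 1, else 0.
    The EGM objective being maximized is triweight(u).mean()."""
    return np.where(np.abs(u) <= 1.0, (35.0 / 32.0) * (1.0 - u**2) ** 3, 0.0)

def triweight_grad(u):
    """Derivative K'(u) = -(105/16) * u * (1 - u^2)^2 for |u| <= 1, else 0."""
    return np.where(np.abs(u) <= 1.0,
                    -(105.0 / 16.0) * u * (1.0 - u**2) ** 2, 0.0)

def egm_linear_regression(X, y, c=1.0, lr=0.1, n_iter=500):
    """Fit a linear model by empirical gain maximization:
    maximize (1/n) * sum_i K((y_i - x_i @ theta) / c) by gradient ascent.
    Observations whose scaled residual leaves the kernel's support
    contribute zero gradient, so gross outliers are simply ignored."""
    n, _ = X.shape
    theta = np.linalg.lstsq(X, y, rcond=None)[0]  # least-squares warm start
    for _ in range(n_iter):
        u = (y - X @ theta) / c  # scaled residuals
        # d(gain)/d(theta) = average of K'(u_i) * (-x_i / c)
        grad = -(X.T @ triweight_grad(u)) / (c * n)
        theta += lr * grad
    return theta

# Toy usage: a clean linear trend plus a few gross outliers.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.uniform(-1, 1, 100)])
y = X @ np.array([1.0, 2.0]) + 0.05 * rng.standard_normal(100)
y[:5] += 10.0  # contaminate five responses
print(egm_linear_regression(X, y))  # should be close to [1., 2.]
```

Because the triweight kernel has compact support, any observation with |residual| > c drops out of the gradient entirely, which is exactly the ignoring of abnormal observations that the abstract describes.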
