Paper Title

Generalization error in high-dimensional perceptrons: Approaching Bayes error with convex optimization

Paper Authors

Benjamin Aubin, Florent Krzakala, Yue M. Lu, Lenka Zdeborová

Abstract

We consider a commonly studied supervised classification of a synthetic dataset whose labels are generated by feeding a one-layer neural network with random iid inputs. We study the generalization performance of standard classifiers in the high-dimensional regime where $\alpha = n/d$ is kept finite in the limit of a large dimension $d$ and number of samples $n$. Our contribution is three-fold: First, we prove a formula for the generalization error achieved by $\ell_2$ regularized classifiers that minimize a convex loss. This formula was first obtained by the heuristic replica method of statistical physics. Second, focusing on commonly used loss functions and optimizing the $\ell_2$ regularization strength, we observe that while ridge regression performance is poor, logistic and hinge regression are surprisingly able to approach the Bayes-optimal generalization error extremely closely. As $\alpha \to \infty$ they lead to Bayes-optimal rates, a fact that does not follow from predictions of margin-based generalization error bounds. Third, we design an optimal loss and regularizer that provably leads to Bayes-optimal generalization error.
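Below is a minimal numerical sketch of the setup described in the abstract, assuming a sign teacher $y = \mathrm{sign}(w^\star \cdot x)$ with iid Gaussian teacher weights and inputs. The scikit-learn estimators, the fixed ratio alpha, and the regularization strengths are illustrative assumptions, not the paper's exact protocol or its asymptotic replica formula.

```python
# Sketch: teacher-student perceptron data at a fixed ratio alpha = n/d,
# comparing an l2-regularized logistic classifier against ridge regression.
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

rng = np.random.default_rng(0)
d, alpha = 200, 3.0                      # dimension and samples-per-dimension ratio
n, n_test = int(alpha * d), 20000

w_star = rng.standard_normal(d)          # teacher weights (assumed iid Gaussian)
X, X_test = rng.standard_normal((n, d)), rng.standard_normal((n_test, d))
y, y_test = np.sign(X @ w_star), np.sign(X_test @ w_star)

# Logistic regression with an l2 penalty; C is the inverse regularization
# strength, and the value here is illustrative, not the paper's optimized lambda.
logit = LogisticRegression(penalty="l2", C=1.0, max_iter=5000).fit(X, y)
err_logit = np.mean(logit.predict(X_test) != y_test)

# Ridge regression on the +/-1 labels, classified by the sign of the output.
ridge = Ridge(alpha=1.0).fit(X, y)
err_ridge = np.mean(np.sign(ridge.predict(X_test)) != y_test)

print(f"generalization error  logistic: {err_logit:.3f}   ridge: {err_ridge:.3f}")
```

At moderate values of alpha the logistic error should fall noticeably below the ridge error, in line with the abstract's observation; sweeping alpha and optimizing the regularization strength for each loss would mirror the comparison with the Bayes-optimal error discussed in the paper.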
