Paper Title

Generalization error in high-dimensional perceptrons: Approaching Bayes error with convex optimization

Paper Authors

Benjamin Aubin, Florent Krzakala, Yue M. Lu, Lenka Zdeborová

Abstract

We consider a commonly studied supervised classification of a synthetic dataset whose labels are generated by feeding a one-layer neural network with random iid inputs. We study the generalization performance of standard classifiers in the high-dimensional regime where $\alpha = n/d$ is kept finite in the limit of a large dimension $d$ and number of samples $n$. Our contribution is three-fold: First, we prove a formula for the generalization error achieved by $\ell_2$ regularized classifiers that minimize a convex loss. This formula was first obtained by the heuristic replica method of statistical physics. Second, focusing on commonly used loss functions and optimizing the $\ell_2$ regularization strength, we observe that while ridge regression performance is poor, logistic and hinge regression are surprisingly able to approach the Bayes-optimal generalization error extremely closely. As $\alpha \to \infty$ they lead to Bayes-optimal rates, a fact that does not follow from predictions of margin-based generalization error bounds. Third, we design an optimal loss and regularizer that provably leads to Bayes-optimal generalization error.
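Below is a minimal numerical sketch of the setup described in the abstract, assuming a sign teacher $y = \mathrm{sign}(w^\star \cdot x)$ with iid Gaussian teacher weights and inputs. The scikit-learn estimators, the fixed ratio alpha, and the regularization strengths are illustrative assumptions, not the paper's exact protocol or its asymptotic replica formula.

```python
# Sketch: teacher-student perceptron data at a fixed ratio alpha = n/d,
# comparing an l2-regularized logistic classifier against ridge regression.
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

rng = np.random.default_rng(0)
d, alpha = 200, 3.0                      # dimension and samples-per-dimension ratio
n, n_test = int(alpha * d), 20000

w_star = rng.standard_normal(d)          # teacher weights (assumed iid Gaussian)
X, X_test = rng.standard_normal((n, d)), rng.standard_normal((n_test, d))
y, y_test = np.sign(X @ w_star), np.sign(X_test @ w_star)

# Logistic regression with an l2 penalty; C is the inverse regularization
# strength, and the value here is illustrative, not the paper's optimized lambda.
logit = LogisticRegression(penalty="l2", C=1.0, max_iter=5000).fit(X, y)
err_logit = np.mean(logit.predict(X_test) != y_test)

# Ridge regression on the +/-1 labels, classified by the sign of the output.
ridge = Ridge(alpha=1.0).fit(X, y)
err_ridge = np.mean(np.sign(ridge.predict(X_test)) != y_test)

print(f"generalization error  logistic: {err_logit:.3f}   ridge: {err_ridge:.3f}")
```

At moderate values of alpha the logistic error should fall noticeably below the ridge error, in line with the abstract's observation; sweeping alpha and optimizing the regularization strength for each loss would mirror the comparison with the Bayes-optimal error discussed in the paper.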
