Paper Title
Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension
Paper Authors
Paper Abstract
From the sampling of data to the initialisation of parameters, randomness is ubiquitous in modern Machine Learning practice. Understanding the statistical fluctuations engendered by the different sources of randomness in prediction is therefore key to understanding robust generalisation. In this manuscript we develop a quantitative and rigorous theory for the study of fluctuations in an ensemble of generalised linear models trained on different, but correlated, features in high dimensions. In particular, we provide a complete description of the asymptotic joint distribution of the empirical risk minimiser for generic convex losses and regularisation in the high-dimensional limit. Our result encompasses a rich set of classification and regression tasks, such as the lazy regime of overparametrised neural networks, or equivalently the random features approximation of kernels. While allowing us to study directly the mitigating effect of ensembling (or bagging) on the bias-variance decomposition of the test error, our analysis also helps disentangle the contribution of statistical fluctuations and the singular role played by the interpolation threshold, which are at the root of the "double-descent" phenomenon.
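To make the setting concrete, here is a minimal numerical sketch of the scenario the abstract describes, not the paper's exact analysis: K ridge-regularised ERM learners, each trained on a different random-feature map of the same data (so their features are different but correlated through the shared inputs), with their predictions averaged as in ensembling/bagging. All names and values below (n, d, p, K, lam, the tanh feature map, the noisy linear teacher) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy teacher-student setup: y = x . w* / sqrt(d) + noise.
n, d, p, K = 200, 100, 150, 10   # samples, input dim, random features, ensemble size
lam = 1e-3                       # ridge regularisation strength

w_star = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ w_star / np.sqrt(d) + 0.1 * rng.standard_normal(n)

X_test = rng.standard_normal((1000, d))
y_test = X_test @ w_star / np.sqrt(d)

preds = []
for k in range(K):
    # Each learner gets its own random projection; features are
    # correlated across learners because they share the inputs X.
    F = rng.standard_normal((d, p))
    Z = np.tanh(X @ F / np.sqrt(d))
    Z_test = np.tanh(X_test @ F / np.sqrt(d))
    # ERM with square loss + ridge penalty (closed form).
    w_hat = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)
    preds.append(Z_test @ w_hat)

preds = np.stack(preds)  # shape (K, n_test)
single_err = np.mean((preds[0] - y_test) ** 2)
ensemble_err = np.mean((preds.mean(axis=0) - y_test) ** 2)
print(f"single learner test MSE : {single_err:.4f}")
print(f"K={K} ensemble test MSE : {ensemble_err:.4f}")
```

Averaging over the K learners suppresses the variance contribution coming from the random feature maps; sweeping p around n (the interpolation threshold) at small lam would reproduce the double-descent peak that ensembling mitigates.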