Paper Title

Bandwidth Selection for Gaussian Kernel Ridge Regression via Jacobian Control

Paper Authors

Oskar Allerbo, Rebecka Jörnsten

Paper Abstract

Most machine learning methods require tuning of hyper-parameters. For kernel ridge regression with the Gaussian kernel, the hyper-parameter is the bandwidth. The bandwidth specifies the length scale of the kernel and has to be carefully selected to obtain a model with good generalization. The default methods for bandwidth selection, cross-validation and marginal likelihood maximization, often yield good results, albeit at high computational costs. Inspired by Jacobian regularization, we formulate an approximate expression for how the derivatives of the functions inferred by kernel ridge regression with the Gaussian kernel depend on the kernel bandwidth. We use this expression to propose a closed-form, computationally feather-light, bandwidth selection heuristic, based on controlling the Jacobian. In addition, the Jacobian expression illuminates how the bandwidth selection is a trade-off between the smoothness of the inferred function and the conditioning of the training data kernel matrix. We show on real and synthetic data that, compared to cross-validation and marginal likelihood maximization, our method is on par in terms of model performance, but up to six orders of magnitude faster.
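To make the quantities in the abstract concrete, below is a minimal NumPy sketch of standard Gaussian-kernel ridge regression, together with the exact gradient of the inferred function. The toy data, the bandwidth values, and the ridge penalty `lam` are illustrative assumptions; this sketch does not implement the paper's Jacobian-control heuristic, only the standard KRR estimator whose bandwidth the paper selects.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma):
    """Gaussian kernel matrix: k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    sq = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-sq / (2 * sigma**2))

def krr_fit(X, y, sigma, lam=1e-3):
    """Standard kernel ridge regression: alpha = (K + lam*I)^{-1} y."""
    K = gaussian_kernel(X, X, sigma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return alpha, K

def krr_predict(X_train, alpha, X_new, sigma):
    """Inferred function: f(x) = sum_i alpha_i k(x, x_i)."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

def krr_gradient(X_train, alpha, x, sigma):
    """Exact gradient of the inferred function at a point x:
    grad f(x) = sum_i alpha_i (x_i - x) / sigma^2 * k(x, x_i)."""
    k = gaussian_kernel(x[None, :], X_train, sigma)[0]  # (n,)
    return (X_train - x).T @ (alpha * k) / sigma**2     # (d,)

# Toy 1-D regression problem (illustrative, not from the paper).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)

for sigma in (0.05, 0.5, 5.0):
    alpha, K = krr_fit(X, y, sigma)
    g = krr_gradient(X, alpha, np.array([0.0]), sigma)
    # As sigma grows, K approaches the rank-one all-ones matrix and
    # becomes ill-conditioned, while the fit becomes smoother.
    print(f"sigma={sigma:5.2f}  |grad f(0)|={abs(g[0]):8.3f}  "
          f"cond(K)={np.linalg.cond(K):.1e}")
```

Sweeping sigma in this sketch exhibits the trade-off the abstract describes: larger bandwidths give smoother inferred functions with smaller derivatives, at the cost of an increasingly ill-conditioned training kernel matrix K.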
