Paper Title

Lassoed Tree Boosting

Paper Authors

Alejandro Schuler, Yi Li, Mark van der Laan

Paper Abstract

Gradient boosting performs exceptionally in most prediction problems and scales well to large datasets. In this paper we prove that a ``lassoed'' gradient boosted tree algorithm with early stopping achieves faster than $n^{-1/4}$ L2 convergence in the large nonparametric space of cadlag functions of bounded sectional variation. This rate is remarkable because it does not depend on the dimension, sparsity, or smoothness. We use simulation and real data to confirm our theory and demonstrate empirical performance and scalability on par with standard boosting. Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes.
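To make the idea concrete, here is a minimal sketch of a "lassoed" boosting pipeline, assuming a simple two-stage recipe: grow a boosted tree ensemble, treat each tree's prediction as a basis function, re-weight the trees with an L1 (lasso) penalty, and pick the stopping point (number of trees) on held-out data. This is an illustration of the general recipe under those assumptions, not the authors' implementation; all names and parameters below are illustrative.

```python
# Sketch: lasso over a gradient-boosted tree basis, with validation-based
# early stopping over the number of trees. Illustrative only.
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split

X, y = make_friedman1(n_samples=2000, noise=1.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

# Stage 1: grow a boosted ensemble; each tree becomes one basis function.
gbm = GradientBoostingRegressor(n_estimators=500, learning_rate=0.1,
                                max_depth=2, random_state=0)
gbm.fit(X_tr, y_tr)

def tree_basis(model, X):
    # Stack per-tree predictions column-wise: one column per tree.
    return np.column_stack([t[0].predict(X) for t in model.estimators_])

B_tr, B_val = tree_basis(gbm, X_tr), tree_basis(gbm, X_val)

# Stage 2 + early stopping: for each candidate stopping point m, fit a lasso
# over the first m trees and keep the m with the best held-out error.
best_m, best_err = None, np.inf
for m in range(50, 501, 50):
    lasso = LassoCV(cv=5, random_state=0).fit(B_tr[:, :m], y_tr)
    err = np.mean((lasso.predict(B_val[:, :m]) - y_val) ** 2)
    if err < best_err:
        best_m, best_err = m, err

print(f"early stop at m={best_m} trees, validation MSE={best_err:.3f}")
```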
