Paper Title

Fast ABC-Boost: A Unified Framework for Selecting the Base Class in Multi-Class Classification

Paper Authors

Ping Li, Weijie Zhao

Paper Abstract

The work in ICML'09 showed that the derivatives of the classical multi-class logistic regression loss function can be re-written in terms of a pre-chosen "base class", and applied the new derivatives in the popular boosting framework. In order to make use of the new derivatives, one must have a strategy to identify/choose the base class at each boosting iteration. The "adaptive base class boost" (ABC-Boost) idea in ICML'09 adopted a computationally expensive "exhaustive search" strategy for the base class at each iteration. It has been well demonstrated that ABC-Boost, when integrated with trees, achieves substantial improvements on many multi-class classification tasks. Furthermore, the work in UAI'10 derived an explicit second-order tree-split gain formula, which typically improves classification accuracy considerably compared with using only the first-order information for tree splitting, for both multi-class and binary classification tasks. In this paper, we develop a unified framework for effectively selecting the base class by introducing a series of ideas that improve the computational efficiency of ABC-Boost. Our framework has parameters $(s,g,w)$. At each boosting iteration, we search only among the "$s$ worst classes" (instead of all classes) to determine the base class. We also allow a "gap" $g$ when conducting the search; that is, we search for the base class only at every $g+1$ iterations. We furthermore allow a "warm-up" stage, starting the search only after $w$ boosting iterations. The parameters $s$, $g$, $w$ can be viewed as tunable parameters, and certain combinations of $(s,g,w)$ may even lead to better test accuracy than the "exhaustive search" strategy. Overall, our proposed framework provides a robust and reliable scheme for implementing ABC-Boost in practice.
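
For context, the base-class rewriting of the derivatives referenced above takes the following form in the ICML'09 formulation (restated here for convenience; readers should verify against the original paper). Writing $p_{i,k}$ for the predicted class probabilities, $r_{i,k} \in \{0,1\}$ for the label indicators, $b$ for the chosen base class, and imposing the sum-to-zero constraint $\sum_k F_{i,k} = 0$ on the scores, the derivatives of the per-example loss $L_i = -\sum_k r_{i,k} \log p_{i,k}$ with respect to $F_{i,k}$ (for $k \neq b$) are:

$$
\frac{\partial L_i}{\partial F_{i,k}} = \left(p_{i,k} - r_{i,k}\right) - \left(p_{i,b} - r_{i,b}\right),
\qquad
\frac{\partial^2 L_i}{\partial F_{i,k}^2} = p_{i,k}\left(1-p_{i,k}\right) + p_{i,b}\left(1-p_{i,b}\right) + 2\,p_{i,k}\,p_{i,b}.
$$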
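
To make the $(s,g,w)$ schedule concrete, here is a minimal Python sketch of how it might gate the base-class search. The names (`choose_base_class`, `loss_after_step`) and the use of per-class training loss to rank the "worst" classes are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def choose_base_class(t, s, g, w, class_losses, prev_base, loss_after_step):
    """Sketch of the (s, g, w) schedule for selecting the base class.

    t            : current boosting iteration (0-indexed)
    s            : number of worst classes to search over
    g            : gap -- search only once every g + 1 iterations
    w            : warm-up -- no search before iteration w
    class_losses : hypothetical per-class training loss, shape (K,)
    prev_base    : base class chosen at the most recent search
    loss_after_step : assumed callable; trains one trial boosting step
                      with the given class as base and returns the
                      resulting training loss
    """
    # Warm-up stage: before iteration w, keep the previous base class.
    if t < w:
        return prev_base
    # Gap: between scheduled searches, reuse the previous base class.
    if (t - w) % (g + 1) != 0:
        return prev_base
    # Restrict the search to the s classes with the largest loss
    # (the "s worst classes"), instead of all K classes.
    candidates = np.argsort(class_losses)[-s:]
    # Exhaustive search within the candidate set: keep the candidate
    # whose trial step yields the smallest training loss.
    return min(candidates, key=loss_after_step)
```

Note that, as described in the abstract, setting $s = K$ (all classes), $g = 0$, and $w = 0$ recovers the original exhaustive-search strategy of ABC-Boost.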
