Paper Title

Fast ABC-Boost: A Unified Framework for Selecting the Base Class in Multi-Class Classification

Paper Authors

Ping Li, Weijie Zhao

Paper Abstract

The work in ICML'09 showed that the derivatives of the classical multi-class logistic regression loss function can be re-written in terms of a pre-chosen "base class", and applied the new derivatives in the popular boosting framework. In order to make use of the new derivatives, one must have a strategy to identify/choose the base class at each boosting iteration. The "adaptive base class boost" (ABC-Boost) idea in ICML'09 adopted a computationally expensive "exhaustive search" strategy for the base class at each iteration. It has been well demonstrated that ABC-Boost, when integrated with trees, achieves substantial improvements on many multi-class classification tasks. Furthermore, the work in UAI'10 derived an explicit second-order tree-split gain formula, which typically improves classification accuracy considerably compared with using only the first-order information for tree splitting, for both multi-class and binary classification tasks. In this paper, we develop a unified framework for effectively selecting the base class by introducing a series of ideas that improve the computational efficiency of ABC-Boost. Our framework has parameters $(s,g,w)$. At each boosting iteration, we search only among the "$s$ worst classes" (instead of all classes) to determine the base class. We also allow a "gap" $g$ when conducting the search; that is, we search for the base class only at every $g+1$ iterations. We furthermore allow a "warm-up" stage, starting the search only after $w$ boosting iterations. The parameters $s$, $g$, $w$ can be viewed as tunable parameters, and certain combinations of $(s,g,w)$ may even lead to better test accuracy than the "exhaustive search" strategy. Overall, our proposed framework provides a robust and reliable scheme for implementing ABC-Boost in practice.
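
For context, the base-class rewriting of the derivatives referenced above takes the following form in the ICML'09 formulation (restated here for convenience; readers should verify against the original paper). Writing $p_{i,k}$ for the predicted class probabilities, $r_{i,k} \in \{0,1\}$ for the label indicators, $b$ for the chosen base class, and imposing the sum-to-zero constraint $\sum_k F_{i,k} = 0$ on the scores, the derivatives of the per-example loss $L_i = -\sum_k r_{i,k} \log p_{i,k}$ with respect to $F_{i,k}$ (for $k \neq b$) are:

$$
\frac{\partial L_i}{\partial F_{i,k}} = \left(p_{i,k} - r_{i,k}\right) - \left(p_{i,b} - r_{i,b}\right),
\qquad
\frac{\partial^2 L_i}{\partial F_{i,k}^2} = p_{i,k}\left(1-p_{i,k}\right) + p_{i,b}\left(1-p_{i,b}\right) + 2\,p_{i,k}\,p_{i,b}.
$$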
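
To make the $(s,g,w)$ schedule concrete, here is a minimal Python sketch of how it might gate the base-class search. The names (`choose_base_class`, `loss_after_step`) and the use of per-class training loss to rank the "worst" classes are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def choose_base_class(t, s, g, w, class_losses, prev_base, loss_after_step):
    """Sketch of the (s, g, w) schedule for selecting the base class.

    t            : current boosting iteration (0-indexed)
    s            : number of worst classes to search over
    g            : gap -- search only once every g + 1 iterations
    w            : warm-up -- no search before iteration w
    class_losses : hypothetical per-class training loss, shape (K,)
    prev_base    : base class chosen at the most recent search
    loss_after_step : assumed callable; trains one trial boosting step
                      with the given class as base and returns the
                      resulting training loss
    """
    # Warm-up stage: before iteration w, keep the previous base class.
    if t < w:
        return prev_base
    # Gap: between scheduled searches, reuse the previous base class.
    if (t - w) % (g + 1) != 0:
        return prev_base
    # Restrict the search to the s classes with the largest loss
    # (the "s worst classes"), instead of all K classes.
    candidates = np.argsort(class_losses)[-s:]
    # Exhaustive search within the candidate set: keep the candidate
    # whose trial step yields the smallest training loss.
    return min(candidates, key=loss_after_step)
```

Note that, as described in the abstract, setting $s = K$ (all classes), $g = 0$, and $w = 0$ recovers the original exhaustive-search strategy of ABC-Boost.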
