Paper Title

Learning Gradient Boosted Multi-label Classification Rules

Paper Authors

Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz, Vu-Linh Nguyen, Eyke Hüllermeier

Paper Abstract

In multi-label classification, where the evaluation of predictions is less straightforward than in single-label classification, various meaningful, though different, loss functions have been proposed. Ideally, the learning algorithm should be customizable towards a specific choice of the performance measure. Modern implementations of boosting, most prominently gradient boosted decision trees, appear to be appealing from this point of view. However, they are mostly limited to single-label classification, and hence not amenable to multi-label losses unless these are label-wise decomposable. In this work, we develop a generalization of the gradient boosting framework to multi-output problems and propose an algorithm for learning multi-label classification rules that is able to minimize decomposable as well as non-decomposable loss functions. Using the well-known Hamming loss and subset 0/1 loss as representatives, we analyze the abilities and limitations of our approach on synthetic data and evaluate its predictive performance on multi-label benchmarks.
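To illustrate the distinction the abstract draws between decomposable and non-decomposable losses, the following minimal Python sketch (an illustration, not the paper's implementation) computes the two losses it names for a single example's label vector: the Hamming loss, which decomposes over labels, and the subset 0/1 loss, which does not.

import numpy as np

def hamming_loss(y_true, y_pred):
    # Label-wise decomposable: the mean of per-label 0/1 errors.
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(y_true != y_pred)

def subset_zero_one_loss(y_true, y_pred):
    # Non-decomposable: a single wrong label already incurs the full loss.
    return float(not np.array_equal(y_true, y_pred))

y_true = [1, 0, 1, 0]
y_pred = [1, 0, 0, 0]  # one of four labels is wrong
print(hamming_loss(y_true, y_pred))          # 0.25
print(subset_zero_one_loss(y_true, y_pred))  # 1.0

A learner customized to the first loss can treat each label independently, whereas minimizing the second requires reasoning about the joint label vector, which is the non-decomposable setting the paper's generalized gradient boosting framework targets.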
