Paper Title
On-the-Fly Joint Feature Selection and Classification
Paper Authors
Abstract
Joint feature selection and classification in an online setting is essential for time-sensitive decision making. However, most existing methods treat the two coupled tasks independently. Specifically, online feature selection methods can handle either streaming features or data instances offline to produce a fixed set of features for classification, while online classification methods classify incoming instances using full knowledge of the feature space. In both cases, existing methods use a single set of features, common to all data instances, for classification. Instead, we propose a framework that performs joint feature selection and classification on the fly, so as to minimize the number of features evaluated for each data instance while maximizing classification accuracy. We derive the optimum solution of the associated optimization problem and analyze its structure. We propose two algorithms, ETANA and F-ETANA, which are based on the optimum solution and its properties. We evaluate the proposed algorithms on several public datasets, demonstrating (i) their dominance over the state of the art, and (ii) their applicability to a broad range of application domains, including clinical research and natural language processing.
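
To make the idea of per-instance feature selection concrete, here is a minimal Python sketch. It is not ETANA or F-ETANA, whose details the abstract does not give. It assumes discrete features, a naive Bayes model with known likelihoods, a fixed feature order, and a simple posterior-confidence stopping rule. The function name `classify_on_the_fly`, the `threshold` parameter, and the toy probabilities are all hypothetical and chosen only for illustration.

```python
def classify_on_the_fly(x, priors, likelihoods, threshold=0.95):
    """Evaluate features one at a time and stop once one class is confident enough.

    x           : list of discrete feature values for a single instance
    priors      : priors[c] = P(class c)
    likelihoods : likelihoods[k][c][v] = P(feature k = v | class c)
    threshold   : hypothetical stopping confidence (an assumption, not from the paper)
    Returns (predicted class index, number of features evaluated).
    """
    posterior = list(priors)
    used = 0
    for k, value in enumerate(x):
        # Stop early: this instance needs no further features.
        if max(posterior) >= threshold:
            break
        # Bayes update, assuming features are conditionally independent given the class.
        unnorm = [p * likelihoods[k][c][value] for c, p in enumerate(posterior)]
        total = sum(unnorm)
        if total > 0:  # leave the posterior unchanged if the value is impossible under every class
            posterior = [u / total for u in unnorm]
        used += 1
    predicted = max(range(len(posterior)), key=posterior.__getitem__)
    return predicted, used


# Toy model: feature 0 is highly informative, features 1 and 2 are weak.
priors = [0.5, 0.5]
likelihoods = [
    [[0.99, 0.01], [0.01, 0.99]],
    [[0.6, 0.4], [0.4, 0.6]],
    [[0.6, 0.4], [0.4, 0.6]],
]

easy = classify_on_the_fly([0, 1, 1], priors, likelihoods, threshold=0.95)
strict = classify_on_the_fly([0, 1, 1], priors, likelihoods, threshold=0.999)
```

With the looser threshold the instance is classified after a single feature, while the stricter threshold forces all three features to be evaluated. This illustrates how the number of features used can vary from one instance (or one confidence requirement) to the next, in contrast to methods that fix one feature set for all instances.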