Paper Title

Composite Feature Selection using Deep Ensembles

Authors

Fergus Imrie, Alexander Norcliffe, Pietro Lio, Mihaela van der Schaar

Abstract

In many real world problems, features do not act alone but in combination with each other. For example, in genomics, diseases might not be caused by any single mutation but require the presence of multiple mutations. Prior work on feature selection either seeks to identify individual features or can only determine relevant groups from a predefined set. We investigate the problem of discovering groups of predictive features without predefined grouping. To do so, we define predictive groups in terms of linear and non-linear interactions between features. We introduce a novel deep learning architecture that uses an ensemble of feature selection models to find predictive groups, without requiring candidate groups to be provided. The selected groups are sparse and exhibit minimum overlap. Furthermore, we propose a new metric to measure similarity between discovered groups and the ground truth. We demonstrate the utility of our model on multiple synthetic tasks and semi-synthetic chemistry datasets, where the ground truth structure is known, as well as an image dataset and a real-world cancer dataset.
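The abstract mentions a metric for comparing discovered feature groups with the ground truth but does not define it here. As a rough illustration of what such a comparison could look like, the sketch below scores two collections of groups by best-match Jaccard overlap. The function names `jaccard` and `group_similarity`, the normalization by the larger number of groups, and the choice of Jaccard overlap are all assumptions made for this example; they are not taken from the abstract and should not be read as the paper's definition.

```python
from typing import Iterable, Set


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard index |a & b| / |a | b|; taken to be 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def group_similarity(
    discovered: Iterable[Iterable[int]],
    truth: Iterable[Iterable[int]],
) -> float:
    """Hypothetical score in [0, 1] comparing discovered groups to true groups.

    Each true group is matched to its most similar discovered group.
    Dividing by the larger of the two group counts penalizes both
    missing groups and spurious extra groups (an assumption of this sketch).
    """
    found = [set(g) for g in discovered]
    true = [set(g) for g in truth]
    if not found and not true:
        return 1.0
    if not found or not true:
        return 0.0
    total = sum(max(jaccard(t, f) for f in found) for t in true)
    return total / max(len(found), len(true))
```

Under these assumptions, identical collections of groups score 1.0, and recovering only one of two disjoint true groups exactly scores 0.5, because the unmatched true group contributes zero overlap.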
