Paper Title
TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network
Paper Authors
Paper Abstract
Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications. For example, online retailers (e.g., Amazon and eBay) use taxonomies for product recommendation, and web search engines (e.g., Google and Bing) leverage taxonomies to enhance query understanding. Enormous efforts have been made on constructing taxonomies either manually or semi-automatically. However, with the fast-growing volume of web content, existing taxonomies will become outdated and fail to capture emerging knowledge. Therefore, in many applications, dynamic expansions of an existing taxonomy are in great demand. In this paper, we study how to expand an existing taxonomy by adding a set of new concepts. We propose a novel self-supervised framework, named TaxoExpan, which automatically generates a set of <query concept, anchor concept> pairs from the existing taxonomy as training data. Using such self-supervision data, TaxoExpan learns a model to predict whether a query concept is the direct hyponym of an anchor concept. We develop two innovative techniques in TaxoExpan: (1) a position-enhanced graph neural network that encodes the local structure of an anchor concept in the existing taxonomy, and (2) a noise-robust training objective that enables the learned model to be insensitive to the label noise in the self-supervision data. Extensive experiments on three large-scale datasets from different domains demonstrate both the effectiveness and the efficiency of TaxoExpan for taxonomy expansion.
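The self-supervision step described in the abstract can be illustrated with a small sketch. It assumes the taxonomy is given as a list of (parent, child) edges and that negative anchors are drawn uniformly at random from concepts that are not a parent of the query; the abstract does not specify the sampling scheme, so the function name `make_self_supervision_pairs`, its parameters, and the toy taxonomy below are hypothetical and not taken from the paper.

```python
import random


def make_self_supervision_pairs(edges, num_negatives=2, seed=0):
    """Build (query, anchor, label) triples from (parent, child) edges.

    label 1: the anchor is a true parent of the query in the existing taxonomy.
    label 0: the anchor is a randomly sampled non-parent (assumed negative scheme).
    """
    rng = random.Random(seed)
    nodes = sorted({n for edge in edges for n in edge})

    # Record every known parent of each child so they are never used as negatives.
    parents_of = {}
    for parent, child in edges:
        parents_of.setdefault(child, set()).add(parent)

    triples = []
    for parent, child in edges:
        # Each existing edge yields one positive <query, anchor> pair.
        triples.append((child, parent, 1))
        # Negative candidates: any concept that is neither the query nor its parent.
        candidates = [n for n in nodes if n != child and n not in parents_of[child]]
        k = min(num_negatives, len(candidates))
        for anchor in rng.sample(candidates, k):
            triples.append((child, anchor, 0))
    return triples


# Toy taxonomy, invented for illustration only.
toy_edges = [
    ("science", "computer science"),
    ("computer science", "machine learning"),
    ("computer science", "databases"),
    ("machine learning", "deep learning"),
]
pairs = make_self_supervision_pairs(toy_edges)
```

Some sampled negatives may in fact be acceptable parents or descendants of the query, which is one source of the label noise that the noise-robust training objective is meant to tolerate. This sketch does not filter out descendants of the query.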