Paper Title
Improving Performance of Relation Extraction Algorithm via Leveled Adversarial PCNN and Database Expansion
Authors
Abstract
This study introduces database expansion based on the Minimum Description Length (MDL) algorithm to enlarge the relation database for better relation extraction. Unlike previous relation extraction research, our method improves system performance by expanding the data. The goal of database expansion, together with a robust deep learning classifier, is to reduce wrong labels caused by relation instances that are incomplete or missing in the relation database (e.g., Freebase). The study uses a deep learning method, the Piecewise Convolutional Neural Network (PCNN), as the base classifier of our proposed approach: the leveled adversarial attention neural network (LATTADV-ATT). In the database expansion process, semantic entity identification is used to generate new instances: the most similar itemsets of the most common patterns in the data are used to obtain pairs of entities. On the deep learning side, selective attention over sentences in PCNN reduces the influence of noisy sentences, and adversarial perturbation training improves the robustness of the system. Performance improves further when the leveled strategy is combined with database expansion. Two issues are addressed: 1) the database expansion method, namely rule generation that allows step sizes on selected strong semantics of the most similar itemsets, with the aim of finding entity pairs for generating instances; and 2) a better classifier model for relation extraction. Experimental results show that database expansion is beneficial: MDL database expansion improves all methods compared with the unexpanded setting. LATTADV-ATT performs well as a classifier, with a precision of P@100 = 0.842 without expansion, and performs even better on the expanded data, with P@100 = 0.891 at expansion factor k = 7.
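The P@100 figures reported in the abstract are a precision-at-N metric. As a minimal sketch, assuming the conventional definition (the fraction of correct predictions among the N highest-scoring ones), it could be computed as follows. The function name `precision_at_n`, its arguments, and the toy data are hypothetical and are not taken from the paper, and the abstract does not describe the paper's exact evaluation protocol.

```python
from typing import Sequence


def precision_at_n(scores: Sequence[float], is_correct: Sequence[bool], n: int) -> float:
    """Fraction of correct predictions among the n highest-scoring ones.

    Assumes the conventional P@N definition; hypothetical helper, not the paper's code.
    """
    if len(scores) != len(is_correct):
        raise ValueError("scores and is_correct must have the same length")
    if n <= 0:
        raise ValueError("n must be positive")
    if not scores:
        raise ValueError("at least one prediction is required")
    # Rank predictions by confidence score, highest first.
    ranked = sorted(zip(scores, is_correct), key=lambda pair: pair[0], reverse=True)
    # If fewer than n predictions exist, all of them are used.
    top = ranked[:n]
    return sum(1 for _, ok in top if ok) / len(top)


# Toy data (illustrative only): five predicted relation facts with confidence scores.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
correct = [True, False, True, True, False]
p_at_4 = precision_at_n(scores, correct, 4)  # 3 of the top 4 are correct
```

Under this definition, a higher P@N means that the classifier's most confident relation predictions are more often correct.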