Paper Title

XREF: Entity Linking for Chinese News Comments with Supplementary Article Reference

Authors

Xinyu Hua, Lei Li, Lifeng Hua, Lu Wang

Abstract

Automatic identification of mentioned entities in social media posts facilitates quick digestion of trending topics and popular opinions. Nonetheless, this remains a challenging task due to limited context and diverse name variations. In this paper, we study the problem of entity linking for Chinese news comments given mentions' spans. We hypothesize that comments often refer to entities in the corresponding news article, as well as topics involving the entities. We therefore propose a novel model, XREF, that leverages attention mechanisms to (1) pinpoint relevant context within comments, and (2) detect supporting entities from the news article. To improve training, we make two contributions: (a) we propose a supervised attention loss in addition to the standard cross entropy, and (b) we develop a weakly supervised training scheme to utilize the large-scale unlabeled corpus. Two new datasets in entertainment and product domains are collected and annotated for experiments. Our proposed method outperforms previous methods on both datasets.
