Paper Title
Detecting Ongoing Events Using Contextual Word and Sentence Embeddings
Paper Authors
Paper Abstract
This paper introduces the Ongoing Event Detection (OED) task, a specific Event Detection task whose goal is to detect only ongoing event mentions, as opposed to historical, future, hypothetical, or other forms of events that are neither fresh nor current. Any application that needs to extract structured information about ongoing events from unstructured text can take advantage of an OED system. The main contributions of this paper are the following: (1) it introduces the OED task along with a dataset manually labeled for the task; (2) it presents the design and implementation of an RNN model for the task that uses BERT embeddings to define contextual word and contextual sentence embeddings as attributes, which to the best of our knowledge had never been used before for detecting ongoing events in news; (3) it presents an extensive empirical evaluation that includes (i) an exploration of different architectures and hyperparameters, (ii) an ablation test to study the impact of each attribute, and (iii) a comparison with a replication of a state-of-the-art model. The results offer several insights into the importance of contextual embeddings and indicate that the proposed approach is effective in the OED task, outperforming the baseline models.