Coreference Resolution through a seq2seq Transition-Based System

Paper title

Coreference Resolution through a seq2seq Transition-Based System

Authors

Bernd Bohnet, Chris Alberti, Michael Collins

Abstract

Most recent coreference resolution systems use search algorithms over possible spans to identify mentions and resolve coreference. We instead present a coreference resolution system that uses a text-to-text (seq2seq) paradigm to predict mentions and links jointly. We implement the coreference system as a transition system and use multilingual T5 as the underlying language model. We obtain state-of-the-art accuracy on the CoNLL-2012 datasets, with an F1 score of 83.3 for English (2.3 higher than previous work (Dobrovolskii, 2021)) using only CoNLL data for training, 68.5 for Arabic (+4.1 over previous work), and 74.3 for Chinese (+5.3). In addition, we use the SemEval-2010 datasets for experiments in a zero-shot setting, a few-shot setting, and a supervised setting using all available training data. We obtain substantially higher zero-shot F1 scores than previous approaches for 3 out of 4 languages, and significantly exceed previous supervised state-of-the-art results for all five tested languages.
