Paper Title

Active Learning and Multi-label Classification for Ellipsis and Coreference Detection in Conversational Question-Answering

Authors

Quentin Brabant, Lina Maria Rojas-Barahona, Claire Gardent

Abstract

In human conversations, ellipsis and coreference are commonly occurring linguistic phenomena. Although these phenomena are a means of making human-machine conversations more fluent and natural, only a few dialogue corpora contain explicit indications of which turns contain ellipses and/or coreferences. In this paper, we address the task of automatically detecting ellipses and coreferences in conversational question answering. We propose a multi-label classifier based on DistilBERT. Multi-label classification and active learning are employed to compensate for the limited amount of labeled data. We show that these methods greatly enhance the performance of the classifier for detecting these phenomena on a manually labeled dataset.
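
To make the two ideas named in the abstract concrete, here is a minimal sketch of a multi-label decision rule and a pool-based active learning step. It is not the authors' implementation: the abstract does not say which acquisition strategy is used, so uncertainty sampling is an assumption here, and the logits are made-up numbers standing in for the outputs of a DistilBERT-based classifier. The names `predict_labels`, `uncertainty`, and `select_for_annotation` are hypothetical.

```python
import math

# The two phenomena named in the abstract; one independent output per label.
LABELS = ("ellipsis", "coreference")


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Multi-label decision: each label is thresholded independently,
    so a turn may carry both labels, one, or neither."""
    return {label: sigmoid(z) >= threshold for label, z in zip(LABELS, logits)}


def uncertainty(logits):
    """Smallest distance of any label probability from 0.5.
    A lower value means the model is less sure about at least one label."""
    return min(abs(sigmoid(z) - 0.5) for z in logits)


def select_for_annotation(pool, k):
    """Assumed acquisition rule (uncertainty sampling): return the indices
    of the k unlabeled items the model is least sure about."""
    ranked = sorted(range(len(pool)), key=lambda i: uncertainty(pool[i]))
    return ranked[:k]


# Made-up logits for four unlabeled questions (ellipsis, coreference).
pool_logits = [(4.0, -3.5), (0.1, 2.0), (-2.5, 0.3), (3.0, 3.0)]

print(predict_labels(pool_logits[0]))
print(select_for_annotation(pool_logits, k=2))
```

In a full loop, the selected items would be labeled by hand, added to the training set, and the classifier retrained before the next selection round.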
