DynaSent: A Dynamic Benchmark for Sentiment Analysis

Paper Title

DynaSent: A Dynamic Benchmark for Sentiment Analysis

Authors

Christopher Potts, Zhengxuan Wu, Atticus Geiger, Douwe Kiela

Abstract

We introduce DynaSent ('Dynamic Sentiment'), a new English-language benchmark task for ternary (positive/negative/neutral) sentiment analysis. DynaSent combines naturally occurring sentences with sentences created using the open-source Dynabench Platform, which facilitates human-and-model-in-the-loop dataset creation. DynaSent has a total of 121,634 sentences, each validated by five crowdworkers, and its development and test splits are designed to produce chance performance for even the best models we have been able to develop; when future models solve this task, we will use them to create DynaSent version 2, continuing the dynamic evolution of this benchmark. Here, we report on the dataset creation effort, focusing on the steps we took to increase quality and reduce artifacts. We also present evidence that DynaSent's Neutral category is more coherent than the comparable category in other benchmarks, and we motivate training models from scratch for each round over successive fine-tuning.
