Paper Title
An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation
Paper Authors
Abstract
In this study, we empirically compare two recently proposed decoding methods, Contrastive Search (CS) and Contrastive Decoding (CD), for open-ended text generation. The automatic evaluation results suggest that, while CS performs worse than CD on the MAUVE metric, it substantially surpasses CD on the diversity and coherence metrics. More notably, extensive human evaluations across three different domains show that human annotators consistently prefer CS over CD by substantial margins. The contradictory results between MAUVE and the human evaluations reveal that MAUVE does not accurately reflect human preferences. Therefore, we call upon the research community to develop better evaluation metrics for open-ended text generation. To ensure the reproducibility of our work, we have open-sourced all our code, evaluation results, and human annotations at https://github.com/yxuansu/Contrastive_Search_versus_Contrastive_Decoding.