Paper Title

Elaboration-Generating Commonsense Question Answering at Scale

Authors

Wenya Wang, Vivek Srikumar, Hannaneh Hajishirzi, Noah A. Smith

Abstract

In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge that helps improve performance. Yet the cost of working with such models is very high; in this work, we finetune smaller language models to generate useful intermediate context, referred to here as elaborations. Our framework alternates between updating two language models -- an elaboration generator and an answer predictor -- allowing each to influence the other. Using less than 0.5% of the parameters of GPT-3, our model outperforms alternatives with similar sizes and closes the gap on GPT-3 on four commonsense question answering benchmarks. Human evaluations show that the quality of the generated elaborations is high.
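The core idea in the abstract is an alternating optimization between the elaboration generator and the answer predictor, each shaping the other's training signal. Below is a minimal, hypothetical sketch of such an alternating loop. It is not the authors' implementation: the model classes, the continuous "elaboration" vector, the toy data, and the cross-entropy objective are all placeholders standing in for the paper's actual language models and training objective.

```python
# Hypothetical sketch of alternating updates between an elaboration
# generator and an answer predictor. Placeholder models and toy data;
# NOT the paper's actual method.
import torch
import torch.nn as nn

class ElaborationGenerator(nn.Module):
    """Stand-in for a small LM that maps a question to an elaboration."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Linear(dim, dim)

    def forward(self, question):
        return self.net(question)

class AnswerPredictor(nn.Module):
    """Stand-in for an LM that scores answer choices given question + elaboration."""
    def __init__(self, dim=16, num_choices=4):
        super().__init__()
        self.net = nn.Linear(2 * dim, num_choices)

    def forward(self, question, elaboration):
        return self.net(torch.cat([question, elaboration], dim=-1))

generator = ElaborationGenerator()
predictor = AnswerPredictor()
gen_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
pred_opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy batch: random features standing in for encoded questions, plus gold labels.
question = torch.randn(8, 16)
gold_answer = torch.randint(0, 4, (8,))

for step in range(100):
    if step % 2 == 0:
        # Phase 1: update the answer predictor with the generator frozen
        # (detach blocks gradients from reaching the generator).
        elaboration = generator(question).detach()
        loss = loss_fn(predictor(question, elaboration), gold_answer)
        pred_opt.zero_grad()
        loss.backward()
        pred_opt.step()
    else:
        # Phase 2: update the generator so its elaborations help the
        # predictor. Only the generator's optimizer steps here, so the
        # predictor's weights stay fixed in this phase.
        elaboration = generator(question)
        loss = loss_fn(predictor(question, elaboration), gold_answer)
        gen_opt.zero_grad()
        loss.backward()
        gen_opt.step()
```

In this toy version the gradient flows through a continuous elaboration vector; with real text elaborations, the coupling between the two models would instead go through sampling or selection of discrete generations, which the sketch deliberately simplifies away.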
