Paper Title
Elaboration-Generating Commonsense Question Answering at Scale
Paper Authors
Paper Abstract
In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge that helps improve performance. Yet the cost of working with such models is very high; in this work, we finetune smaller language models to generate useful intermediate context, referred to here as elaborations. Our framework alternates between updating two language models -- an elaboration generator and an answer predictor -- allowing each to influence the other. Using less than 0.5% of the parameters of GPT-3, our model outperforms alternatives with similar sizes and closes the gap on GPT-3 on four commonsense question answering benchmarks. Human evaluations show that the quality of the generated elaborations is high.
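The alternating scheme described above -- updating an elaboration generator and an answer predictor so that each influences the other -- can be sketched in outline. This is a minimal, non-neural illustration of the control flow only; the class and method names (`ElaborationGenerator`, `AnswerPredictor`, `update`, `alternate_training`) are hypothetical stand-ins and do not reflect the authors' actual implementation, which finetunes real language models.

```python
# Schematic sketch of alternating updates between two models.
# All classes/methods here are illustrative placeholders, not the paper's code.

class ElaborationGenerator:
    """Stands in for a small LM that produces an elaboration for a question."""

    def generate(self, question):
        # A real model would decode background text conditioned on the question.
        return f"elaboration for: {question}"

    def update(self, question, feedback):
        # A real model would take a gradient step using a signal derived from
        # the answer predictor (e.g., likelihood of the gold answer).
        return feedback


class AnswerPredictor:
    """Stands in for a small LM that answers given question + elaboration."""

    def predict(self, question, elaboration):
        return f"answer({question} | {elaboration})"

    def update(self, question, elaboration, gold_answer):
        # A real model would take a supervised step on
        # (question + elaboration, gold_answer).
        return gold_answer


def alternate_training(data, rounds=2):
    """Alternate between updating the predictor and the generator."""
    gen, pred = ElaborationGenerator(), AnswerPredictor()
    completed_rounds = []
    for r in range(rounds):
        # Phase 1: update the answer predictor with the generator held fixed.
        for question, gold in data:
            elab = gen.generate(question)
            pred.update(question, elab, gold)
        # Phase 2: update the generator using feedback from the predictor.
        for question, gold in data:
            elab = gen.generate(question)
            feedback = pred.predict(question, elab)
            gen.update(question, feedback)
        completed_rounds.append(r)
    return completed_rounds
```

The point of the alternation is that neither model is trained in isolation: the predictor learns to use the generator's elaborations, and the generator learns to produce elaborations the predictor finds useful.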