Paper Title
Elaboration-Generating Commonsense Question Answering at Scale
Paper Authors
Paper Abstract
In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge that helps improve performance. Yet the cost of working with such models is very high; in this work, we finetune smaller language models to generate useful intermediate context, referred to here as elaborations. Our framework alternates between updating two language models -- an elaboration generator and an answer predictor -- allowing each to influence the other. Using less than 0.5% of the parameters of GPT-3, our model outperforms alternatives with similar sizes and closes the gap on GPT-3 on four commonsense question answering benchmarks. Human evaluations show that the quality of the generated elaborations is high.
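The alternating scheme described above -- updating an elaboration generator and an answer predictor so that each influences the other -- can be sketched in outline. This is a minimal, non-neural illustration of the control flow only; the class and method names (`ElaborationGenerator`, `AnswerPredictor`, `update`, `alternate_training`) are hypothetical stand-ins and do not reflect the authors' actual implementation, which finetunes real language models.

```python
# Schematic sketch of alternating updates between two models.
# All classes/methods here are illustrative placeholders, not the paper's code.

class ElaborationGenerator:
    """Stands in for a small LM that produces an elaboration for a question."""

    def generate(self, question):
        # A real model would decode background text conditioned on the question.
        return f"elaboration for: {question}"

    def update(self, question, feedback):
        # A real model would take a gradient step using a signal derived from
        # the answer predictor (e.g., likelihood of the gold answer).
        return feedback


class AnswerPredictor:
    """Stands in for a small LM that answers given question + elaboration."""

    def predict(self, question, elaboration):
        return f"answer({question} | {elaboration})"

    def update(self, question, elaboration, gold_answer):
        # A real model would take a supervised step on
        # (question + elaboration, gold_answer).
        return gold_answer


def alternate_training(data, rounds=2):
    """Alternate between updating the predictor and the generator."""
    gen, pred = ElaborationGenerator(), AnswerPredictor()
    completed_rounds = []
    for r in range(rounds):
        # Phase 1: update the answer predictor with the generator held fixed.
        for question, gold in data:
            elab = gen.generate(question)
            pred.update(question, elab, gold)
        # Phase 2: update the generator using feedback from the predictor.
        for question, gold in data:
            elab = gen.generate(question)
            feedback = pred.predict(question, elab)
            gen.update(question, feedback)
        completed_rounds.append(r)
    return completed_rounds
```

The point of the alternation is that neither model is trained in isolation: the predictor learns to use the generator's elaborations, and the generator learns to produce elaborations the predictor finds useful.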