Paper Title
Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering
Paper Authors
Paper Abstract
Retrieval-augmented language models have recently become the standard for knowledge-intensive tasks. Rather than relying purely on latent semantics within the parameters of large neural models, these methods enlist a semi-parametric memory that encodes an index of knowledge for the model to retrieve over. Most prior work has employed text passages as the unit of knowledge, which offers high coverage at the cost of interpretability, controllability, and efficiency. The opposite properties arise in other methods that have instead relied on knowledge base (KB) facts. At the same time, more recent work has demonstrated the effectiveness of storing and retrieving from an index of Q-A pairs derived from text \citep{lewis2021paq}. This approach yields a high-coverage knowledge representation that maintains KB-like properties, because its representations are more atomic units of information. In this work we push this line of research further by proposing a question-answer-augmented encoder-decoder model and an accompanying pretraining strategy. This yields an end-to-end system that not only outperforms prior QA-retrieval methods on single-hop QA tasks but also enables compositional reasoning, as demonstrated by strong performance on two multi-hop QA datasets. Together, these methods improve the ability to interpret and control the model while narrowing the performance gap with passage-retrieval systems.
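To make the core idea concrete, below is a minimal sketch of retrieval over a QA-pair memory. Everything in it is a hypothetical stand-in rather than the paper's method: `embed` is a toy deterministic hash embedding (not the learned retriever trained end-to-end in the paper), the `qa_memory` contents are illustrative, and only exact-string queries retrieve reliably here, whereas a trained question encoder would generalize to paraphrases.

```python
import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic embedding (hypothetical stand-in for a learned
    question encoder): identical strings map to identical unit vectors,
    so only exact-match queries score highly in this sketch."""
    seed = int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16) % (2**32)
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

# The QA-memory: an index of (question, answer) pairs derived from text,
# each pair a more atomic unit of knowledge than a full passage.
qa_memory = [
    ("Who wrote Hamlet?", "William Shakespeare"),
    ("In what year did Apollo 11 land on the Moon?", "1969"),
]
keys = np.stack([embed(q) for q, _ in qa_memory])  # one key per stored question

def retrieve(query: str, k: int = 1) -> list[tuple[str, str]]:
    """Return the k stored Q-A pairs whose question keys score highest
    against the query under inner-product similarity."""
    scores = keys @ embed(query)
    return [qa_memory[i] for i in np.argsort(-scores)[:k]]

# In the full model, retrieved pairs would be concatenated with the input
# question and fed to the encoder-decoder to generate the final answer.
for q, a in retrieve("Who wrote Hamlet?"):
    print(f"retrieved -> Q: {q} | A: {a}")
```

At the scale of a PAQ-style index (tens of millions of pairs), the brute-force dot product above would be replaced by an approximate nearest-neighbor search; the sketch is meant only to show the atomic Q-A unit of retrieval that gives the approach its KB-like interpretability and controllability.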