Paper Title
FiE: Building a Global Probability Space by Leveraging Early Fusion in Encoder for Open-Domain Question Answering
Paper Authors
Paper Abstract
Generative models have recently started to outperform extractive models in Open-Domain Question Answering, largely by leveraging their decoder to attend over multiple encoded passages and combine their information. However, generative models tend to be larger than extractive models due to the need for a decoder, run slower during inference due to auto-regressive decoder beam search, and their generated output often suffers from hallucinations. We propose to extend transformer encoders with the ability to fuse information from multiple passages, using a global representation that provides cross-sample attention over all tokens across samples. Furthermore, we propose an alternative answer span probability calculation to better aggregate answer scores in the global space of all samples. Using our proposed method, we outperform the current state-of-the-art method by $2.5$ Exact Match score on the Natural Questions dataset while using only $25\%$ of the parameters and $35\%$ of the latency during inference, and by $4.4$ Exact Match on the WebQuestions dataset. When coupled with synthetic data augmentation, we outperform larger models on the TriviaQA dataset as well. The latency and parameter savings of our method make it particularly attractive for open-domain question answering, as these models are often compute-intensive.
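The core idea of the "global probability space" in the abstract is to normalize candidate answer-span scores across all retrieved passages at once, rather than independently within each passage. A minimal sketch of that aggregation step is shown below; the function name, the additive start+end scoring, and the span-length cap are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def global_span_probs(start_logits, end_logits, max_span_len=10):
    """Enumerate candidate spans in every passage, then apply ONE
    softmax over the union of all spans from all passages (a global
    probability space), instead of a per-passage softmax.

    start_logits / end_logits: lists of 1-D arrays, one per passage.
    Returns (spans, probs) where spans[k] = (passage, start, end).
    """
    spans, scores = [], []
    for p, (s, e) in enumerate(zip(start_logits, end_logits)):
        n = len(s)
        for i in range(n):
            for j in range(i, min(i + max_span_len, n)):
                spans.append((p, i, j))
                scores.append(s[i] + e[j])  # additive span score (assumed)
    scores = np.asarray(scores)
    probs = np.exp(scores - scores.max())   # numerically stable softmax
    probs /= probs.sum()                    # normalize globally, once
    return spans, probs
```

Because normalization happens once over all passages, evidence for the same answer string appearing in several passages competes (or can be summed) in a single distribution, which is what allows scores to be compared across samples.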