Paper Title

CoRT: Complementary Rankings from Transformers

Paper Authors

Marco Wrzalik, Dirk Krechel

Paper Abstract

Many recent approaches towards neural information retrieval mitigate their computational costs by using a multi-stage ranking pipeline. In the first stage, a number of potentially relevant candidates are retrieved using an efficient retrieval model such as BM25. Although BM25 has proven decent performance as a first-stage ranker, it tends to miss relevant passages. In this context we propose CoRT, a simple neural first-stage ranking model that leverages contextual representations from pretrained language models such as BERT to complement term-based ranking functions while causing no significant delay at query time. Using the MS MARCO dataset, we show that CoRT significantly increases the candidate recall by complementing BM25 with missing candidates. Consequently, we find subsequent re-rankers achieve superior results with fewer candidates. We further demonstrate that passage retrieval using CoRT can be realized with surprisingly low latencies.
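
To make the pipeline described in the abstract concrete, below is a minimal, self-contained sketch (not the authors' implementation): a term-based ranker and a representation-based neural ranker each retrieve their top-k candidates, and the union of both candidate sets is handed to a subsequent re-ranker. The functions `encode_query`, `encode_passage`, and `bm25_scores` are hypothetical placeholders; a real system would use a pretrained contextual encoder such as BERT and a proper BM25 index.

```python
# Sketch of complementary first-stage retrieval: merge candidates from a
# term-based ranker (BM25 stand-in) with candidates from a neural,
# representation-based ranker. All encoders/scores below are placeholders.
import numpy as np


def encode_query(query: str) -> np.ndarray:
    """Hypothetical contextual query encoder (e.g. a BERT-based model)."""
    rng = np.random.default_rng(abs(hash(query)) % (2**32))
    return rng.standard_normal(8)


def encode_passage(passage: str) -> np.ndarray:
    """Hypothetical contextual passage encoder."""
    rng = np.random.default_rng(abs(hash(passage)) % (2**32))
    return rng.standard_normal(8)


def bm25_scores(query: str, passages: list[str]) -> np.ndarray:
    """Crude term-overlap stand-in; a real system would query a BM25 index."""
    q_terms = set(query.lower().split())
    return np.array([len(q_terms & set(p.lower().split())) for p in passages],
                    dtype=float)


def complementary_candidates(query: str, passages: list[str], k: int = 3) -> list[int]:
    """Union of the top-k term-based and top-k neural candidates."""
    term_ranking = np.argsort(-bm25_scores(query, passages))[:k]

    q_vec = encode_query(query)
    p_vecs = np.stack([encode_passage(p) for p in passages])
    neural_ranking = np.argsort(-(p_vecs @ q_vec))[:k]

    # The merged candidate set is what a more expensive re-ranker would score.
    return sorted(set(term_ranking.tolist()) | set(neural_ranking.tolist()))


if __name__ == "__main__":
    docs = ["bm25 is a term based ranking function",
            "contextual representations come from pretrained language models",
            "multi stage ranking pipelines reduce computational cost"]
    print(complementary_candidates("neural ranking with language models", docs))
```

The key design point illustrated here is that the neural candidates are added alongside, rather than instead of, the BM25 candidates, which is how the abstract describes recall being increased without replacing the efficient term-based ranker.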
