RRF102: Meeting the TREC-COVID Challenge with a 100+ Runs Ensemble

Paper Title

RRF102: Meeting the TREC-COVID Challenge with a 100+ Runs Ensemble

Authors

Michael Bendersky, Honglei Zhuang, Ji Ma, Shuguang Han, Keith Hall, Ryan McDonald

Abstract

In this paper, we report the results of our participation in the TREC-COVID challenge. To meet the challenge of building a search engine for a rapidly evolving biomedical collection, we propose a simple yet effective weighted hierarchical rank fusion approach that ensembles together 102 runs from (a) lexical and semantic retrieval systems, (b) pre-trained and fine-tuned BERT rankers, and (c) relevance feedback runs. Our ablation studies demonstrate the contributions of each of these systems to the overall ensemble. The submitted ensemble runs achieved state-of-the-art performance in rounds 4 and 5 of the TREC-COVID challenge.
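
The abstract does not spell out the fusion formula, so the following is only an illustrative sketch, not the authors' implementation. It assumes the standard reciprocal rank fusion score, where each run contributes 1 / (k + rank) for a document, and it assumes the hierarchy means fusing runs within a group first and then fusing the group-level rankings with per-group weights. The function name `weighted_rrf`, the constant `k = 60`, the toy document IDs, and the weights are all hypothetical choices made for this example.

```python
from collections import defaultdict


def weighted_rrf(runs, weights=None, k=60):
    """Fuse ranked lists of document IDs with weighted reciprocal rank fusion.

    runs    -- list of rankings, each a list of doc IDs, best first
    weights -- one weight per run (assumed; defaults to equal weights)
    k       -- smoothing constant (60 is a common default, assumed here)
    """
    if weights is None:
        weights = [1.0] * len(runs)
    if len(weights) != len(runs):
        raise ValueError("need exactly one weight per run")

    scores = defaultdict(float)
    for run, weight in zip(runs, weights):
        for rank, doc_id in enumerate(run, start=1):
            scores[doc_id] += weight / (k + rank)

    # Highest fused score first; ties are broken by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Toy two-level fusion: fuse each group of runs, then fuse the group results.
lexical_runs = [["d1", "d2", "d3"], ["d2", "d1", "d4"]]
neural_runs = [["d3", "d2", "d1"]]

group_rankings = [weighted_rrf(lexical_runs), weighted_rrf(neural_runs)]
final_ranking = weighted_rrf(group_rankings, weights=[1.0, 2.0])
print(final_ranking)
```

In this toy setup the neural group is given twice the weight of the lexical group, so its top document moves to the front of the final ranking. In practice the group structure and weights would have to come from the paper itself or from tuning on held-out relevance judgments.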
