Paper Title

Disentangled Modeling of Domain and Relevance for Adaptable Dense Retrieval

Authors

Jingtao Zhan, Qingyao Ai, Yiqun Liu, Jiaxin Mao, Xiaohui Xie, Min Zhang, Shaoping Ma

Abstract

Recent advances in Dense Retrieval (DR) techniques have significantly improved the effectiveness of first-stage retrieval. Trained with large-scale supervised data, DR models can encode queries and documents into a low-dimensional dense space and conduct effective semantic matching. However, previous studies have shown that the effectiveness of DR models would drop by a large margin when the trained DR models are adopted in a target domain that is different from the domain of the labeled data. One of the possible reasons is that the DR model has never seen the target corpus and thus might be incapable of mitigating the difference between the training and target domains. In practice, unfortunately, training a DR model for each target domain to avoid domain shift is often a difficult task as it requires additional time, storage, and domain-specific data labeling, which are not always available. To address this problem, in this paper, we propose a novel DR framework named Disentangled Dense Retrieval (DDR) to support effective and flexible domain adaptation for DR models. DDR consists of a Relevance Estimation Module (REM) for modeling domain-invariant matching patterns and several Domain Adaption Modules (DAMs) for modeling domain-specific features of multiple target corpora. By making the REM and DAMs disentangled, DDR enables a flexible training paradigm in which REM is trained with supervision once and DAMs are trained with unsupervised data. Comprehensive experiments in different domains and languages show that DDR significantly improves ranking performance compared to strong DR baselines and substantially outperforms traditional retrieval methods in most scenarios.
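To make the disentangled design concrete, the following is a minimal illustrative sketch (not the paper's implementation): one shared REM is composed with a swappable per-domain DAM, so supporting a new corpus only requires registering a new DAM while the REM is reused unchanged. All names and the toy callables standing in for learned neural encoders are hypothetical.

```python
# Illustrative sketch of disentangled dense retrieval: a single shared
# Relevance Estimation Module (REM) composed with per-domain Domain
# Adaption Modules (DAMs). The callables below are toy stand-ins for
# learned neural modules; names are hypothetical, not the paper's API.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Vector = List[float]


@dataclass
class DisentangledRetriever:
    # Domain-invariant module: trained once with supervised relevance data.
    rem: Callable[[Vector], Vector]
    # Domain-specific modules: one per target corpus, trained unsupervised.
    dams: Dict[str, Callable[[str], Vector]] = field(default_factory=dict)

    def add_domain(self, name: str, dam: Callable[[str], Vector]) -> None:
        # Adapting to a new corpus only adds a DAM; the REM is untouched.
        self.dams[name] = dam

    def encode(self, text: str, domain: str) -> Vector:
        # The DAM maps raw text to domain-adapted features; the REM maps
        # those features into the shared relevance-matching space.
        return self.rem(self.dams[domain](text))

    def score(self, query: str, doc: str, domain: str) -> float:
        # Dense retrieval scores by the inner product of the embeddings.
        q = self.encode(query, domain)
        d = self.encode(doc, domain)
        return sum(qi * di for qi, di in zip(q, d))
```

The point of the separation is the training schedule the abstract describes: the supervised training cost is paid once for `rem`, while each new domain pays only the unsupervised cost of its own `dam`.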
