Paper Title


Debiasing Neural Retrieval via In-batch Balancing Regularization

Paper Authors

Yuantong Li, Xiaokai Wei, Zijian Wang, Shen Wang, Parminder Bhatia, Xiaofei Ma, Andrew Arnold

Paper Abstract


People frequently interact with information retrieval (IR) systems; however, IR models exhibit biases and discrimination towards various demographics. In-processing fair ranking methods provide a trade-off between accuracy and fairness by adding a fairness-related regularization term to the loss function. However, there has not been an intuitive objective function that depends on click probability and user engagement and can be optimized directly. In this work, we propose In-Batch Balancing Regularization (IBBR) to mitigate the ranking disparity among subgroups. In particular, we develop a differentiable \textit{normed Pairwise Ranking Fairness} (nPRF) measure and leverage the T-statistics on top of nPRF over subgroups as a regularization to improve fairness. Empirical results with BERT-based neural rankers on the MS MARCO Passage Retrieval dataset with the human-annotated non-gendered queries benchmark \citep{rekabsaz2020neural} show that our IBBR method with nPRF achieves significantly less bias with minimal degradation in ranking performance compared with the baseline.
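
For intuition, the regularized objective suggested by the abstract ("T-statistics on top of nPRF over subgroups as a regularization") can be sketched as follows. This is only an illustration inferred from the wording above; the weight $\lambda$, the subgroup labels $A$ and $B$, and the per-document nPRF scores $f_i$ are notation introduced here and are not taken from the paper.

\mathcal{L} = \mathcal{L}_{\mathrm{rank}} + \lambda \, |t|, \qquad
t = \frac{\bar{f}_A - \bar{f}_B}{\sqrt{s_A^2 / n_A + s_B^2 / n_B}},

where $\bar{f}_g$, $s_g^2$, and $n_g$ denote the in-batch mean, variance, and count of the nPRF scores for documents in subgroup $g \in \{A, B\}$. Driving $|t|$ toward zero balances the two subgroups' nPRF distributions within each batch, while $\mathcal{L}_{\mathrm{rank}}$ preserves retrieval quality.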
