Paper Title

BERT Rankers are Brittle: a Study using Adversarial Document Perturbations

Authors

Yumeng Wang, Lijun Lyu, Avishek Anand

Abstract

Contextual ranking models based on BERT are now well established for a wide range of passage and document ranking tasks. However, the robustness of BERT-based ranking models under adversarial inputs is under-explored. In this paper, we argue that BERT-rankers are not immune to adversarial attacks targeting retrieved documents given a query. Firstly, we propose algorithms for adversarial perturbation of both highly relevant and non-relevant documents using gradient-based optimization methods. The aim of our algorithms is to add/replace a small number of tokens to a highly relevant or non-relevant document to cause a large rank demotion or promotion. Our experiments show that a small number of tokens can already result in a large change in the rank of a document. Moreover, we find that BERT-rankers heavily rely on the document start/head for relevance prediction, making the initial part of the document more susceptible to adversarial attacks. More interestingly, we find a small set of recurring adversarial words that when added to documents result in successful rank demotion/promotion of any relevant/non-relevant document respectively. Finally, our adversarial tokens also show particular topic preferences within and across datasets, exposing potential biases from BERT pre-training or downstream datasets.
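The abstract's core attack idea — use the gradient of the ranker's relevance score with respect to token embeddings to pick a small number of token replacements that demote (or promote) a document — can be illustrated in miniature. The sketch below is not the paper's BERT ranker: it uses a hypothetical linear bag-of-embeddings scorer (so the first-order score estimate is exact) purely to show the gradient-guided token-swap mechanics; all names (`emb`, `score`, `demote_one_token`) are illustrative assumptions.

```python
import numpy as np

# Toy setup (assumption, not the paper's model): a linear scorer over a
# small vocabulary, so gradients w.r.t. token embeddings are exact.
rng = np.random.default_rng(0)
VOCAB, DIM = 50, 8
emb = rng.normal(size=(VOCAB, DIM))   # token embedding table
q = rng.normal(size=DIM)              # fixed query vector

def score(doc_tokens):
    """Relevance score: mean of the document's token embeddings, dotted with the query."""
    return emb[doc_tokens].mean(axis=0) @ q

def demote_one_token(doc_tokens):
    """Replace the single token whose swap most reduces the score.

    For this linear scorer, d(score)/d(e_i) = q / n for every position i,
    so the first-order estimate of swapping token t_i -> v,
    (emb[v] - emb[t_i]) @ grad, is exact. A BERT ranker would use the same
    estimate with grad obtained by backpropagation.
    """
    n = len(doc_tokens)
    grad = q / n                                        # gradient at each position
    deltas = (emb - emb[doc_tokens][:, None]) @ grad    # (n, VOCAB) score changes
    i, v = np.unravel_index(np.argmin(deltas), deltas.shape)
    perturbed = list(doc_tokens)
    perturbed[i] = int(v)
    return perturbed

doc = list(rng.integers(0, VOCAB, size=10))
adv = demote_one_token(doc)
print(score(doc), score(adv))   # the perturbed document scores lower
```

Promotion of a non-relevant document is the mirror image: take `argmax` over the same `deltas` matrix to find the swap that most increases the score. Repeating the greedy step a few times corresponds to the "small number of tokens" regime the abstract reports.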
