Transferability Ranking of Adversarial Examples

Paper Title

Transferability Ranking of Adversarial Examples

Authors

Mosh Levy, Guy Amit, Yuval Elovici, Yisroel Mirsky

Abstract

Adversarial transferability in black-box scenarios presents a unique challenge: while attackers can employ surrogate models to craft adversarial examples, they have no assurance that these examples will successfully compromise the target model. Until now, the prevalent method of ascertaining success has been trial and error: testing crafted samples directly on the victim model. This approach, however, risks detection with every attempt, forcing attackers to either perfect their first try or face exposure. Our paper introduces a ranking strategy that refines the transfer attack process, enabling the attacker to estimate the likelihood of success without repeated trials on the victim's system. By leveraging a set of diverse surrogate models, our method can predict the transferability of adversarial examples. This strategy can be used to select either the best sample to use in an attack or the best perturbation to apply to a specific sample. Using our strategy, we were able to raise the transferability of adversarial examples from a mere 20%, akin to random selection, up to near upper-bound levels, with some scenarios even witnessing a 100% success rate. This substantial improvement not only sheds light on the shared susceptibilities across diverse architectures but also demonstrates that attackers can forgo detectable trial-and-error tactics, increasing the threat of surrogate-based attacks.
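The abstract does not spell out the scoring function, so the following is only a minimal sketch of the general idea of ranking with several surrogate models. It assumes a simple heuristic that is not necessarily the authors' method: each adversarial example is scored by the fraction of surrogate models it fools, and examples are ranked by that score. The function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def transferability_scores(
    surrogate_predictions: Sequence[Sequence[int]],
    true_labels: Sequence[int],
) -> List[float]:
    """Fraction of surrogate models fooled by each adversarial example.

    surrogate_predictions[m][i] is the label that surrogate model m assigns
    to adversarial example i; true_labels[i] is the correct label.
    This agreement heuristic is an assumption made for illustration.
    """
    if not surrogate_predictions:
        raise ValueError("at least one surrogate model is required")
    n_models = len(surrogate_predictions)
    scores = []
    for i, label in enumerate(true_labels):
        # Count the surrogates whose prediction differs from the true label.
        fooled = sum(1 for preds in surrogate_predictions if preds[i] != label)
        scores.append(fooled / n_models)
    return scores


def rank_examples(scores: Sequence[float]) -> List[int]:
    """Indices of examples, highest score first (ties keep input order)."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


# Hypothetical toy data: 3 surrogate models, 4 adversarial examples.
true_labels = [0, 1, 2, 3]
surrogate_predictions = [
    [5, 1, 2, 7],  # surrogate A
    [5, 1, 4, 3],  # surrogate B
    [5, 6, 4, 3],  # surrogate C
]
scores = transferability_scores(surrogate_predictions, true_labels)
ranking = rank_examples(scores)
print(ranking)  # [0, 2, 1, 3]
```

Under this heuristic, an attacker would submit only the top-ranked example (index 0 here, which fools all three surrogates) rather than probing the victim model with every candidate.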
