Paper Title
Switching Transferable Gradient Directions for Query-Efficient Black-Box Adversarial Attacks
Authors
Abstract
We propose a simple and highly query-efficient black-box adversarial attack named SWITCH, which achieves state-of-the-art performance in the score-based setting. SWITCH makes highly efficient and effective use of the transferable gradient, i.e., the gradient $\hat{\mathbf{g}}$ of a surrogate model w.r.t. the input image. In each iteration, SWITCH first tries to update the current sample along the direction of $\hat{\mathbf{g}}$, but considers switching to the opposite direction $-\hat{\mathbf{g}}$ if the algorithm detects that the update does not increase the value of the attack objective function. We justify switching to the opposite direction by a local approximate linearity assumption. SWITCH needs only one or two queries per iteration, yet it remains effective because of the rich information provided by the transferable gradient, resulting in unprecedented query efficiency. To improve the robustness of SWITCH, we further propose SWITCH$_\text{RGF}$, in which the update follows the direction of a random gradient-free (RGF) estimate when neither $\hat{\mathbf{g}}$ nor its opposite direction can increase the objective, while retaining the query-efficiency advantage of SWITCH. Experiments on CIFAR-10, CIFAR-100 and TinyImageNet show that, compared with other methods, SWITCH achieves a satisfactory attack success rate using far fewer queries, and SWITCH$_\text{RGF}$ achieves the state-of-the-art attack success rate with fewer queries overall. Because of its simplicity, our approach can serve as a strong baseline for future black-box attacks. The PyTorch source code is released at https://github.com/machanic/SWITCH.
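The per-iteration logic described in the abstract can be illustrated with a minimal NumPy sketch. It is not the released PyTorch implementation. The function name `switch_step`, its parameters, the $\ell_\infty$ sign step, the projection, and the tie-breaking rule when neither direction improves the objective are all assumptions made for illustration; the abstract specifies only that the update first follows $\hat{\mathbf{g}}$, may switch to $-\hat{\mathbf{g}}$, and costs one or two queries.

```python
import numpy as np


def switch_step(x, x_orig, f_cur, loss_fn, surrogate_grad, step=0.01, eps=0.05):
    """One SWITCH-style iteration (illustrative sketch, not the authors' code).

    x              current adversarial sample, values assumed to lie in [0, 1]
    x_orig         clean input that defines the perturbation budget
    f_cur          attack objective at x, cached from the previous iteration
    loss_fn        black-box objective; each call counts as one query
    surrogate_grad callable returning the surrogate model's gradient at x
    step, eps      hypothetical step size and L-infinity budget
    Returns (new_x, new_objective, queries_used).
    """
    # Assumption: L-infinity attack, so the step follows the sign of the gradient.
    g = np.sign(surrogate_grad(x))

    def project(z):
        # Keep the sample inside the eps-ball around x_orig and the valid range.
        return np.clip(np.clip(z, x_orig - eps, x_orig + eps), 0.0, 1.0)

    # Query 1: try the transferable gradient direction.
    cand = project(x + step * g)
    f_cand = loss_fn(cand)
    if f_cand > f_cur:
        return cand, f_cand, 1

    # Query 2: the objective did not increase, so try the opposite direction.
    opp = project(x - step * g)
    f_opp = loss_fn(opp)

    # Assumption: if neither direction improves, keep the better candidate.
    if f_opp >= f_cand:
        return opp, f_opp, 2
    return cand, f_cand, 2
```

For a locally linear objective, a step along $\hat{\mathbf{g}}$ that lowers the objective implies that the same step along $-\hat{\mathbf{g}}$ raises it by roughly the same amount, which is why the second query is usually enough. SWITCH$_\text{RGF}$ would replace the final fallback with an RGF gradient estimate, which this sketch omits.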