Paper Title

AdvFoolGen: Creating Persistent Troubles for Deep Classifiers

Paper Authors

Yuzhen Ding, Nupur Thakur, Baoxin Li

Paper Abstract

Research has shown that deep neural networks are vulnerable to malicious attacks, where adversarial images are created to trick a network into misclassification even though human eyes would assign the images entirely different labels. To make deep networks more robust to such attacks, many defense mechanisms have been proposed in the literature, some of which are quite effective at guarding against typical attacks. In this paper, we present a new black-box attack termed AdvFoolGen, which generates attacking images from the same feature space as that of the natural images, so as to keep baffling the network even after state-of-the-art defense mechanisms have been applied. We systematically evaluate our model by comparing it with well-established attack algorithms. Through experiments, we demonstrate the effectiveness and robustness of our attack in the face of state-of-the-art defense techniques, and we unveil the potential reasons for its effectiveness through principled analysis. As such, AdvFoolGen contributes to understanding the vulnerability of deep networks from a new perspective and may, in turn, help in developing and evaluating new defense mechanisms.
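
The abstract describes AdvFoolGen only at a high level: a black-box attack whose generated images live in the same feature space as natural ones. As a rough, non-authoritative illustration of what a generator-based attack of this general kind can look like, here is a minimal PyTorch sketch. Everything in it (the `PerturbGen` name, the perturbation bound `eps`, and the frozen-surrogate training loop) is an assumption made for illustration, not the paper's actual method.

```python
# A minimal, hypothetical sketch of a generator-based attack, NOT the
# actual AdvFoolGen algorithm (its architecture and loss are defined in
# the paper). A small convolutional generator learns a bounded
# perturbation that fools a frozen surrogate classifier -- a common
# setup for transfer-style black-box attacks.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerturbGen(nn.Module):
    """Maps a natural image to an adversarial image via a bounded
    additive perturbation (name and architecture are illustrative)."""
    def __init__(self, channels: int = 3, eps: float = 8 / 255):
        super().__init__()
        self.eps = eps
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = self.eps * torch.tanh(self.net(x))  # ||delta||_inf <= eps
        return torch.clamp(x + delta, 0.0, 1.0)     # stay in valid image range

def train_step(gen, surrogate, opt, x, y):
    """One generator update: push the surrogate's prediction away from
    the true label y; the black-box target network is never queried here."""
    surrogate.eval()
    for p in surrogate.parameters():
        p.requires_grad_(False)      # surrogate stays frozen
    x_adv = gen(x)
    # Negative cross-entropy: maximize the loss of the true class.
    loss = -F.cross_entropy(surrogate(x_adv), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Once trained against a surrogate, such a generator can emit adversarial images for unseen inputs in a single forward pass, which is one plausible way an attack could "keep baffling" a defended network; whether this matches AdvFoolGen's actual construction should be checked against the full paper.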
