Paper Title
Adversarial Robustness Assessment of NeuroEvolution Approaches
Paper Authors
Paper Abstract
NeuroEvolution automates the generation of Artificial Neural Networks through the application of techniques from Evolutionary Computation. The main goal of these approaches is to build models that maximize predictive performance, sometimes with an additional objective of minimizing computational complexity. Although the evolved models achieve competitive results performance-wise, their robustness to adversarial examples, which becomes a concern in security-critical scenarios, has received limited attention. In this paper, we evaluate the adversarial robustness of models found by two prominent NeuroEvolution approaches on the CIFAR-10 image classification task: DENSER and NSGA-Net. Since the models are publicly available, we consider white-box untargeted attacks, where the perturbations are bounded by either the L2 or the L∞ norm. As with manually designed networks, our results show that when the evolved models are attacked with iterative methods, their accuracy usually drops to, or close to, zero under both distance metrics. The DENSER model is an exception to this trend, showing some resistance under the L2 threat model, where its accuracy only drops from 93.70% to 18.10% even under iterative attacks. Additionally, we analyze the impact of the pre-processing applied to the data before the first layer of the network. Our observations suggest that some of these techniques can exacerbate the perturbations added to the original inputs, potentially harming robustness. Thus, this choice should not be neglected when automatically designing networks for applications where adversarial attacks are likely to occur.
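The abstract does not specify the exact attack configuration used in the paper; the sketch below only illustrates the kind of attack described: an untargeted, white-box, iterative (PGD-style) attack bounded in the L∞ norm, assuming a PyTorch classifier that takes CIFAR-10 images in the [0, 1] range. The function name `pgd_linf` and the values of `eps`, `alpha`, and `steps` are illustrative assumptions, not settings taken from the paper.

```python
import torch
import torch.nn.functional as F


def pgd_linf(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Untargeted white-box PGD attack bounded in the L-infinity norm.

    model: classifier returning logits; x: images in [0, 1]; y: true labels.
    eps, alpha, and steps are illustrative defaults, not the paper's settings.
    """
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)       # maximize loss on the true class
        (grad,) = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()        # ascent step in the sign direction
            x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back into the eps-ball around x
            x_adv = x_adv.clamp(0.0, 1.0)              # keep pixels in the valid range
    return x_adv.detach()
```

Robust accuracy is then the fraction of adversarial examples the model still classifies correctly, e.g. `(model(pgd_linf(model, x, y)).argmax(1) == y).float().mean()`. An L2-bounded variant would replace the sign step with a gradient normalization and project onto an L2 ball instead. Note that if pre-processing such as per-channel normalization is folded into the network before its first layer, the attack operates directly in pixel space, which relates to the pre-processing effect discussed in the abstract.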