Paper Title

Local Competition and Uncertainty for Adversarial Robustness in Deep Learning

Paper Authors

Antonios Alexos, Konstantinos P. Panousis, Sotirios Chatzis

Paper Abstract

This work attempts to address the adversarial robustness of deep networks by means of novel learning arguments. Specifically, inspired by results in neuroscience, we propose a local competition principle as a means of adversarially robust deep learning. We argue that novel local winner-takes-all (LWTA) nonlinearities, combined with posterior sampling schemes, can greatly improve the adversarial robustness of traditional deep networks against difficult adversarial attack schemes. We combine these LWTA arguments with tools from the field of Bayesian nonparametrics, specifically the stick-breaking construction of the Indian Buffet Process, to flexibly account for the inherent uncertainty in data-driven modeling. As we show experimentally, the newly proposed model achieves high robustness to adversarial perturbations on the MNIST and CIFAR10 datasets. Our model achieves state-of-the-art results under powerful white-box attacks, while at the same time largely retaining its benign accuracy. Equally importantly, our approach achieves this result while requiring far fewer trainable model parameters than the existing state of the art.
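
To make the mechanism the abstract describes more concrete, below is a minimal, hypothetical PyTorch sketch of a stochastic LWTA dense layer, together with the stick-breaking construction of the Indian Buffet Process. It illustrates the general idea only and is not the authors' implementation: the names StochasticLWTA and ibp_stick_breaking, the block size U, and the Gumbel-softmax relaxation used for differentiable winner sampling are all assumptions made for this sketch.

import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticLWTA(nn.Module):
    # Hypothetical sketch: hidden units are grouped into blocks of U
    # competitors; within each block a single winner is sampled and all
    # losing units are zeroed out.
    def __init__(self, in_features, blocks, U=2):
        super().__init__()
        self.U = U
        self.linear = nn.Linear(in_features, blocks * U)

    def forward(self, x):
        h = self.linear(x)                 # (batch, blocks * U)
        h = h.view(x.size(0), -1, self.U)  # (batch, blocks, U)
        if self.training:
            # Sample a one-hot winner per block; the Gumbel-softmax
            # relaxation keeps the sampling step differentiable.
            mask = F.gumbel_softmax(h, tau=0.67, hard=True, dim=-1)
        else:
            # Simplifying choice for this sketch: deterministic argmax
            # at test time (the paper's stochasticity at inference is
            # part of what it credits for robustness).
            mask = F.one_hot(h.argmax(dim=-1), self.U).float()
        return (h * mask).flatten(1)       # losing units contribute zero

def ibp_stick_breaking(alpha, K):
    # Stick-breaking construction of the Indian Buffet Process:
    # v_i ~ Beta(alpha, 1), u_k = prod_{i<=k} v_i. The resulting u_k are
    # decreasing probabilities that can gate network components.
    v = torch.distributions.Beta(alpha, 1.0).sample((K,))
    return torch.cumprod(v, dim=0)

# Usage example (shapes only):
layer = StochasticLWTA(in_features=784, blocks=64, U=2)
out = layer(torch.randn(32, 784))  # -> (32, 128), one active unit per block

Sampling the winner, rather than always keeping the strongest unit, is what injects posterior uncertainty into the forward pass; under this reading, gradient-based white-box attacks face a stochastic, piecewise decision surface rather than a fixed one.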
