Paper Title

Adversarial Robustness Guarantees for Random Deep Neural Networks

Paper Authors

Giacomo De Palma, Bobak T. Kiani, Seth Lloyd

Paper Abstract

The reliability of deep learning algorithms is fundamentally challenged by the existence of adversarial examples, which are incorrectly classified inputs that are extremely close to a correctly classified input. We explore the properties of adversarial examples for deep neural networks with random weights and biases, and prove that for any $p\ge1$, the $\ell^p$ distance of any given input from the classification boundary scales as one over the square root of the dimension of the input times the $\ell^p$ norm of the input. The results are based on the recently proved equivalence between Gaussian processes and deep neural networks in the limit of infinite width of the hidden layers, and are validated with experiments on both random deep neural networks and deep neural networks trained on the MNIST and CIFAR10 datasets. The results constitute a fundamental advance in the theoretical understanding of adversarial examples, and open the way to a thorough theoretical characterization of the relation between network architecture and robustness to adversarial perturbations.
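
In symbols, writing $n$ for the input dimension and $\epsilon_p(x)$ for the $\ell^p$ distance from an input $x \in \mathbb{R}^n$ to the closest point of the classification boundary ($\epsilon_p$ is notation introduced here for convenience, not taken from the paper), the claimed scaling reads

$$\epsilon_p(x) \sim \frac{\|x\|_p}{\sqrt{n}} \qquad \text{for every } p \ge 1.$$

In particular, for an input with i.i.d. standard Gaussian entries, $\|x\|_2 \approx \sqrt{n}$, so the predicted $\ell^2$ distance to the boundary is $O(1)$, independent of $n$, even though the norm of the input itself grows like $\sqrt{n}$.

The sketch below illustrates this prediction numerically; it is not the authors' experimental code. It draws random bias-free ReLU networks (the paper's networks also have random biases; dropping them here keeps the manual backprop short), upper-bounds the distance to the classification boundary by bisecting along the input-gradient direction, and checks that the median distance stays roughly constant as $n$ grows. The width, depth, and trial counts are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_relu_net(n, width=256, depth=4):
    """Bias-free ReLU net with He-scaled Gaussian weights.
    The sign of the scalar output is the binary class label."""
    dims = [n] + [width] * (depth - 1) + [1]
    return [rng.normal(0.0, np.sqrt(2.0 / d_in), (d_out, d_in))
            for d_in, d_out in zip(dims[:-1], dims[1:])]

def logit(Ws, x):
    h = x
    for W in Ws[:-1]:
        h = np.maximum(W @ h, 0.0)      # ReLU hidden layers
    return float(Ws[-1] @ h)

def logit_and_grad(Ws, x):
    """Scalar logit and its gradient w.r.t. the input (manual backprop)."""
    pre, h = [], x
    for W in Ws[:-1]:
        z = W @ h
        pre.append(z)
        h = np.maximum(z, 0.0)
    f = float(Ws[-1] @ h)
    g = Ws[-1].ravel()
    for W, z in zip(Ws[-2::-1], pre[::-1]):
        g = (g * (z > 0.0)) @ W         # chain rule through ReLU + linear
    return f, g

def boundary_distance(Ws, x, iters=50):
    """Bisect along the descent direction of |logit| until the sign flips;
    this upper-bounds the true l2 distance to the classification boundary."""
    f, g = logit_and_grad(Ws, x)
    d = -np.sign(f) * g / np.linalg.norm(g)
    hi = 1.0
    while np.sign(logit(Ws, x + hi * d)) == np.sign(f):
        hi *= 2.0
        if hi > 1e9:                    # no sign flip found along this ray
            return np.inf
    lo = hi / 2.0 if hi > 1.0 else 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.sign(logit(Ws, x + mid * d)) == np.sign(f):
            lo = mid
        else:
            hi = mid
    return hi

for n in (64, 256, 1024):
    dists = [boundary_distance(random_relu_net(n), rng.normal(size=n))
             for _ in range(25)]
    # For Gaussian x, ||x||_2 ~ sqrt(n), so the predicted distance is O(1).
    print(f"n={n:4d}  median l2 distance to boundary ~ {np.median(dists):.3f}")
```

If the scaling holds, the printed medians should be of the same order for all three values of $n$, while a perturbation of that size becomes a vanishing fraction of $\|x\|_2$ as $n$ grows, which is exactly what makes such nearby boundary points adversarial.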
