Boosting Robustness Verification of Semantic Feature Neighborhoods

Title

Boosting Robustness Verification of Semantic Feature Neighborhoods

Authors

Anan Kabaha, Dana Drachsler-Cohen

Abstract

Deep neural networks have been shown to be vulnerable to adversarial attacks that perturb inputs based on semantic features. Existing robustness analyzers can reason about semantic feature neighborhoods to increase the networks' reliability. However, despite significant progress in these techniques, they still struggle to scale to deep networks and large neighborhoods. In this work, we introduce VeeP, an active learning approach that splits the verification process into a series of smaller verification steps, each of which is submitted to an existing robustness analyzer. The key idea is to build on prior steps to predict the next optimal step. The optimal step is predicted by estimating the certification velocity and sensitivity via parametric regression. We evaluate VeeP on MNIST, Fashion-MNIST, CIFAR-10, and ImageNet and show that it can analyze neighborhoods of various features: brightness, contrast, hue, saturation, and lightness. We show that, on average, given a 90-minute timeout, VeeP verifies 96% of the maximally certifiable neighborhoods within 29 minutes, while existing splitting approaches verify, on average, 73% of the maximally certifiable neighborhoods within 58 minutes.
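
The idea of splitting verification into smaller steps can be illustrated with a minimal sketch. It is not the VeeP algorithm: the abstract says VeeP predicts each step via parametric regression of certification velocity and sensitivity, whereas this sketch substitutes a plain grow-on-success, shrink-on-failure rule. The names `verify_in_steps` and `toy_analyzer`, all parameters, and the toy success condition are hypothetical and stand in for a real robustness analyzer.

```python
from typing import Callable, List, Tuple


def verify_in_steps(
    analyze: Callable[[float, float], bool],
    target: float,
    first_step: float = 0.01,
    grow: float = 1.5,
    shrink: float = 0.5,
    min_step: float = 1e-4,
) -> Tuple[float, List[Tuple[float, float]]]:
    """Try to certify the feature range [0, target] piece by piece.

    `analyze(lo, hi)` is assumed to return True when the sub-range
    [lo, hi] is certified robust. Returns the certified upper bound
    and the list of certified pieces.
    """
    lo, step = 0.0, first_step
    pieces: List[Tuple[float, float]] = []
    while lo < target and step >= min_step:
        hi = min(lo + step, target)
        if analyze(lo, hi):
            # Success: record the piece and try a larger step next time.
            pieces.append((lo, hi))
            lo = hi
            step *= grow
        else:
            # Failure: retry from the same point with a smaller step.
            step *= shrink
    return lo, pieces


def toy_analyzer(lo: float, hi: float) -> bool:
    """Hypothetical stand-in for an analyzer: certifies only narrow
    sub-ranges (width at most 0.05) that lie entirely below 0.3."""
    return hi <= 0.3 and (hi - lo) <= 0.05 + 1e-12


if __name__ == "__main__":
    bound, pieces = verify_in_steps(toy_analyzer, target=0.5)
    print(f"certified up to {bound:.4f} in {len(pieces)} pieces")
```

The step-size rule is the part a learned predictor would replace: instead of a fixed grow and shrink factor, the next step would be chosen from the outcomes and running times of the previous steps.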
