Paper Title
Adversarial Attacks on Brain-Inspired Hyperdimensional Computing-Based Classifiers
Paper Authors
Paper Abstract
As an emerging class of in-memory computing architecture, brain-inspired hyperdimensional computing (HDC) mimics brain cognition and leverages random hypervectors (i.e., vectors with a dimensionality of thousands or even more) to represent features and to perform classification tasks. The unique hypervector representation enables HDC classifiers to exhibit high energy efficiency, low inference latency, and strong robustness against hardware-induced bit errors. Consequently, they have been increasingly recognized as an appealing alternative to, or even replacement of, traditional deep neural networks (DNNs) for local on-device classification, especially on low-power Internet of Things devices. Nonetheless, unlike their DNN counterparts, state-of-the-art designs for HDC classifiers are mostly security-oblivious, casting doubt on their safety and immunity to adversarial inputs. In this paper, we study for the first time adversarial attacks on HDC classifiers and highlight that HDC classifiers can be vulnerable to even minimally perturbed adversarial samples. Concretely, using handwritten digit classification as an example, we construct an HDC classifier and formulate a grey-box attack problem, where the attacker's goal is to mislead the target HDC classifier into producing erroneous prediction labels while keeping the amount of added perturbation noise as small as possible. Then, we propose a modified genetic algorithm to generate adversarial samples within a reasonably small number of queries. Our results show that adversarial images generated by our algorithm can successfully mislead the HDC classifier into producing wrong prediction labels with a high probability (i.e., 78% when the HDC classifier uses a fixed majority rule for decision). Finally, we also present two defense strategies -- adversarial training and retraining -- to strengthen the security of HDC classifiers.
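To make the setup described in the abstract concrete, the sketch below illustrates (i) a minimal HDC classifier for handwritten digits and (ii) a query-limited genetic attack loop that searches for a small perturbation flipping the predicted label. This is a hedged illustration, not the paper's exact design: the random-projection encoder, the cosine-similarity decision rule (rather than the paper's fixed majority rule), the dimensionality, and all hyperparameters (population size, mutation scale, generation count) are assumptions made for the example.

```python
# Minimal sketch of an HDC classifier and a grey-box genetic attack.
# Assumptions (not from the paper): random-projection encoding, bipolar
# hypervectors, cosine-similarity decision, and illustrative hyperparameters.
import numpy as np

D = 10_000          # hypervector dimensionality (thousands or more)
N_PIXELS = 28 * 28  # handwritten-digit inputs, e.g. MNIST-sized images
rng = np.random.default_rng(0)

# Random projection matrix mapping pixel space into hyperdimensional space.
PROJ = rng.standard_normal((N_PIXELS, D))

def encode(image_flat: np.ndarray) -> np.ndarray:
    """Encode a flattened image into a bipolar (+1/-1) hypervector."""
    return np.sign(image_flat @ PROJ)

def train_prototypes(images: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Bundle (sum) the hypervectors of each class into a class prototype."""
    prototypes = np.zeros((10, D))
    for img, y in zip(images, labels):
        prototypes[y] += encode(img)
    return prototypes

def classify(prototypes: np.ndarray, image_flat: np.ndarray) -> int:
    """Predict the class whose prototype is most similar (cosine) to the query."""
    hv = encode(image_flat)
    sims = prototypes @ hv / (
        np.linalg.norm(prototypes, axis=1) * np.linalg.norm(hv) + 1e-12
    )
    return int(np.argmax(sims))

def genetic_attack(prototypes, image_flat, true_label,
                   pop_size=20, generations=50, mutation_std=0.05):
    """Grey-box attack sketch: evolve small perturbations that flip the label
    while keeping the L2 norm of the added noise small. Only the classifier's
    predicted labels are queried; its internals are not used."""
    pop = [rng.normal(0, mutation_std, N_PIXELS) for _ in range(pop_size)]

    def fitness(delta):
        adv = np.clip(image_flat + delta, 0.0, 1.0)
        pred = classify(prototypes, adv)           # one query per evaluation
        # Prefer label-flipping perturbations; among those, prefer smaller ones.
        return (pred != true_label, -np.linalg.norm(delta))

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        best = pop[0]
        if fitness(best)[0]:                       # label already flipped
            return np.clip(image_flat + best, 0.0, 1.0)
        # Keep the top half of the population and mutate it to refill.
        parents = pop[: pop_size // 2]
        children = [p + rng.normal(0, mutation_std, N_PIXELS) for p in parents]
        pop = parents + children
    return None                                    # no adversarial sample found
```

In this toy setting, the attacker only observes prediction labels returned by `classify`, mirroring the grey-box, query-based formulation in the abstract; the paper's modified genetic algorithm and majority-rule decision procedure may differ in their details.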