Paper Title
Adversarial Attacks against Neural Networks in Audio Domain: Exploiting Principal Components
Paper Authors
Paper Abstract
Adversarial attacks are inputs that are similar to original inputs but altered on purpose. Speech-to-text neural networks that are widely used today are prone to misclassifying adversarial attacks. In this study, we first investigate the existence of targeted adversarial attacks by altering waveforms from the Common Voice dataset. We craft adversarial waveforms via the Connectionist Temporal Classification (CTC) loss function and attack DeepSpeech, a speech-to-text neural network implemented by Mozilla. We achieve a 100% adversarial success rate (zero successful classifications by DeepSpeech) on all 25 adversarial waveforms that we crafted. Second, we investigate the use of Principal Component Analysis (PCA) as a defense mechanism against adversarial attacks. We reduce dimensionality by applying PCA to these 25 attacks and test them against DeepSpeech. We again observe zero successful classifications by DeepSpeech, which suggests that PCA is not a good defense mechanism in the audio domain. Finally, instead of using PCA as a defense mechanism, we use it to craft adversarial inputs under a black-box setting with minimal adversarial knowledge. With no knowledge of the model, its parameters, or its weights, we craft adversarial attacks by applying PCA to samples from the Common Voice dataset and again achieve 100% adversarial success against DeepSpeech under this black-box setting. We also experiment with different percentages of components necessary to result in a classification during the attack process. In all cases, the adversary is successful.
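
A minimal sketch of the PCA dimensionality-reduction step described in the abstract, assuming the waveform is split into fixed-length frames before PCA is applied and then reconstructed; the frame length, the 50% component ratio, and the load_common_voice_sample helper are illustrative assumptions, not details taken from the paper.

import numpy as np
from sklearn.decomposition import PCA

def pca_transform_waveform(waveform, frame_len=512, keep_ratio=0.5):
    """Project a 1-D waveform onto its principal components and
    reconstruct it while keeping only a fraction of the components."""
    # Zero-pad so the waveform splits evenly into frames, which PCA
    # treats as the rows of a (n_frames x frame_len) data matrix.
    n_frames = int(np.ceil(len(waveform) / frame_len))
    padded = np.zeros(n_frames * frame_len, dtype=np.float64)
    padded[:len(waveform)] = waveform
    frames = padded.reshape(n_frames, frame_len)

    # Keep only a percentage of the principal components (the quantity
    # the paper's experiments vary during the attack process).
    n_components = max(1, int(keep_ratio * min(frames.shape)))
    pca = PCA(n_components=n_components)
    reduced = pca.fit_transform(frames)             # dimensionality reduction
    reconstructed = pca.inverse_transform(reduced)  # back to waveform space

    return reconstructed.reshape(-1)[:len(waveform)]

# Usage: reconstruct a sample with 50% of its components, then transcribe
# the result with a speech-to-text model such as DeepSpeech to check
# whether classification still succeeds.
# audio = load_common_voice_sample(...)   # hypothetical loader
# perturbed = pca_transform_waveform(audio, keep_ratio=0.5)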