Paper Title

VenoMave: Targeted Poisoning Against Speech Recognition

Authors

Hojjat Aghakhani, Lea Schönherr, Thorsten Eisenhofer, Dorothea Kolossa, Thorsten Holz, Christopher Kruegel, Giovanni Vigna

Abstract

Despite remarkable improvements, automatic speech recognition is susceptible to adversarial perturbations. Compared to standard machine learning architectures, these attacks are significantly more challenging, especially since the inputs to a speech recognition system are time series that contain both acoustic and linguistic properties of speech. Extracting all recognition-relevant information requires more complex pipelines and an ensemble of specialized components. Consequently, an attacker needs to consider the entire pipeline. In this paper, we present VENOMAVE, the first training-time poisoning attack against speech recognition. Similar to the predominantly studied evasion attacks, we pursue the same goal: leading the system to an incorrect and attacker-chosen transcription of a target audio waveform. In contrast to evasion attacks, however, we assume that the attacker can only manipulate a small part of the training data without altering the target audio waveform at runtime. We evaluate our attack on two datasets: TIDIGITS and Speech Commands. When poisoning less than 0.17% of the dataset, VENOMAVE achieves attack success rates of more than 80.0%, without access to the victim's network architecture or hyperparameters. In a more realistic scenario, when the target audio waveform is played over the air in different rooms, VENOMAVE maintains a success rate of up to 73.3%. Finally, VENOMAVE achieves an attack transferability rate of 36.4% between two different model architectures.
