Paper Title

Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning

Authors

Wu, Haibin, Liu, Andy T., Lee, Hung-yi

Abstract

High-performance anti-spoofing models for automatic speaker verification (ASV) have been widely used to protect ASV by identifying and filtering spoofing audio that is deliberately generated by text-to-speech, voice conversion, audio replay, etc. However, it has been shown that high-performance anti-spoofing models are vulnerable to adversarial attacks. Adversarial attacks, which are indistinguishable from original data but result in incorrect predictions, are dangerous for anti-spoofing models, and there is no dispute that we should detect them at any cost. To explore this issue, we propose to employ Mockingjay, a self-supervised learning based model, to protect anti-spoofing models against adversarial attacks in the black-box scenario. Self-supervised learning models are effective in improving downstream task performance, such as phone classification or ASR. However, their effect in defending against adversarial attacks has not been explored yet. In this work, we explore the robustness of self-supervised learned high-level representations by using them in the defense against adversarial attacks. A layerwise noise-to-signal ratio (LNSR) is proposed to quantify and measure the effectiveness of deep models in countering adversarial noise. Experimental results on the ASVspoof 2019 dataset demonstrate that high-level representations extracted by Mockingjay can prevent the transferability of adversarial examples, and successfully counter black-box attacks.
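The abstract's layerwise noise-to-signal ratio (LNSR) compares, layer by layer, how large the perturbation induced by an adversarial input is relative to the clean activation. A minimal sketch of how such a ratio might be computed is shown below; the function name `layerwise_nsr` and the use of an L2 norm are illustrative assumptions, not the paper's exact definition.

```python
import numpy as np

def layerwise_nsr(clean_layers, adv_layers):
    """Illustrative per-layer noise-to-signal ratio (assumed L2-norm form).

    clean_layers, adv_layers: lists of per-layer hidden-representation
    arrays for a clean input and its adversarial counterpart.
    Returns ||h_adv_k - h_clean_k|| / ||h_clean_k|| for each layer k.
    """
    return [
        np.linalg.norm(adv - clean) / np.linalg.norm(clean)
        for clean, adv in zip(clean_layers, adv_layers)
    ]
```

Under the paper's claim, a defense that absorbs adversarial noise should show this ratio shrinking toward the deeper (higher-level) layers of the self-supervised model.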
