The HCCL-DKU system for fake audio generation task of the 2022 ICASSP ADD Challenge

Paper title

The HCCL-DKU system for fake audio generation task of the 2022 ICASSP ADD Challenge

Authors

Ziyi Chen, Hua Hua, Yuxiang Zhang, Ming Li, Pengyuan Zhang

Abstract

The voice conversion task is to modify the speaker identity of continuous speech while preserving the linguistic content. Naturalness and similarity are generally the two main metrics for evaluating conversion quality, and both have improved significantly in recent years. This paper presents the HCCL-DKU entry for the fake audio generation task of the 2022 ICASSP ADD challenge. We propose a novel PPG-based voice conversion model that adopts a fully end-to-end structure. Experimental results show that the proposed method outperforms other conversion models, including Tacotron-based and FastSpeech-based models, in both conversion quality and spoofing performance against anti-spoofing systems. In addition, we investigate several post-processing methods to improve spoofing power. Finally, we achieve second place in the ADD challenge with a deception success rate of 0.916.
