Paper Title

Learning in your voice: Non-parallel voice conversion based on speaker consistency loss

Paper Authors

Yoohwan Kwon, Soo-Whan Chung, Hee-Soo Heo, Hong-Goo Kang

Paper Abstract

In this paper, we propose a novel voice conversion strategy to resolve the mismatch between the training and conversion scenarios when a parallel speech corpus is unavailable for training. Based on auto-encoder and disentanglement frameworks, we design the proposed model to extract identity and content representations while reconstructing the input speech signal itself. Since we use other speakers' identity information in the training process, the training philosophy naturally matches the objective of the voice conversion process. In addition, we effectively design the disentanglement framework to reliably preserve linguistic information and to enhance the quality of the converted speech signals. The superiority of the proposed method is shown in subjective listening tests as well as objective measures.
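
The abstract describes an auto-encoder that disentangles content and speaker identity and, during training, also decodes the content with another speaker's identity so that a speaker consistency loss can be applied to the converted output. Below is a minimal, hypothetical PyTorch sketch of that training idea only; the MLP encoders, dimensions, cosine-similarity consistency term, and equal loss weighting are all assumptions for illustration and are not taken from the paper.

```python
# Illustrative sketch (assumptions throughout): an auto-encoder that separates
# content and speaker identity, plus a speaker consistency loss computed on
# speech converted with *another* speaker's embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_MELS, CONTENT_DIM, SPK_DIM = 80, 64, 128  # assumed dimensions

def mlp(d_in, d_out):
    return nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(), nn.Linear(256, d_out))

content_enc = mlp(N_MELS, CONTENT_DIM)            # frame-wise linguistic content
speaker_enc = mlp(N_MELS, SPK_DIM)                # utterance-level identity features
decoder     = mlp(CONTENT_DIM + SPK_DIM, N_MELS)  # reconstructs mel frames

params = list(content_enc.parameters()) + list(speaker_enc.parameters()) + list(decoder.parameters())
opt = torch.optim.Adam(params, lr=1e-4)

def speaker_embedding(mel):
    # Average frame-wise outputs into one normalized utterance embedding (assumption).
    return F.normalize(speaker_enc(mel).mean(dim=1), dim=-1)

def training_step(mel_src, mel_tgt):
    """mel_src, mel_tgt: (batch, frames, N_MELS) utterances from two different speakers."""
    content = content_enc(mel_src)
    spk_src = speaker_embedding(mel_src)
    spk_tgt = speaker_embedding(mel_tgt)

    # 1) Auto-encoding path: rebuild the source utterance from its own identity.
    recon = decoder(torch.cat([content, spk_src.unsqueeze(1).expand(-1, content.size(1), -1)], dim=-1))
    loss_recon = F.l1_loss(recon, mel_src)

    # 2) Conversion path: decode the same content with the other speaker's identity,
    #    then require the output to carry that identity (speaker consistency loss).
    converted = decoder(torch.cat([content, spk_tgt.unsqueeze(1).expand(-1, content.size(1), -1)], dim=-1))
    loss_spk = 1.0 - F.cosine_similarity(speaker_embedding(converted), spk_tgt, dim=-1).mean()

    loss = loss_recon + loss_spk  # equal weighting is an assumption
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with random "mel-spectrograms" standing in for two speakers' utterances.
mel_a, mel_b = torch.randn(4, 100, N_MELS), torch.randn(4, 100, N_MELS)
print(training_step(mel_a, mel_b))
```

The point of the sketch is the second path: because a different speaker's embedding is already used at training time, the training objective mirrors the actual conversion scenario rather than pure self-reconstruction.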
