Paper Title

SkipConvGAN: Monaural Speech Dereverberation using Generative Adversarial Networks via Complex Time-Frequency Masking

Authors

Kothapally, Vinay; Hansen, J. H. L.

Abstract

With the advancements in deep learning approaches, the performance of speech enhancement systems in the presence of background noise has improved significantly. However, improving robustness against reverberation is still a work in progress, as reverberation tends to cause loss of formant structure due to smearing effects in time and frequency. A wide range of deep learning-based systems either enhance the magnitude response and reuse the distorted phase, or enhance the complex spectrogram using a complex time-frequency mask. Though these approaches have demonstrated satisfactory performance, they do not directly address the formant structure lost to reverberation. We believe that retrieving the formant structure can help improve the efficiency of existing systems. In this study, we propose SkipConvGAN, an extension of our prior work SkipConvNet. The proposed system's generator network tries to estimate an efficient complex time-frequency mask, while the discriminator network aids in driving the generator to restore the lost formant structure. We evaluate the performance of our proposed system on simulated and real recordings of reverberant speech from the single-channel task of the REVERB challenge corpus. The proposed system shows a consistent improvement across multiple room configurations over other deep learning-based generative adversarial frameworks.
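The two families of approaches contrasted in the abstract (magnitude enhancement with reused phase versus complex time-frequency masking) can be illustrated with a small NumPy sketch. This is not the SkipConvGAN generator or discriminator. It only shows how each kind of mask is applied to a spectrogram, using oracle masks computed from a known clean signal. The function names, the spectrogram shape, and the random toy data are all assumptions made for illustration.

```python
import numpy as np


def apply_magnitude_mask(reverb_spec, mag_mask):
    """Scale the magnitude and reuse the (distorted) reverberant phase."""
    return mag_mask * np.abs(reverb_spec) * np.exp(1j * np.angle(reverb_spec))


def apply_complex_mask(reverb_spec, complex_mask):
    """Complex multiplication changes magnitude and phase together."""
    return complex_mask * reverb_spec


def oracle_complex_mask(clean_spec, reverb_spec, eps=1e-8):
    """Oracle mask M with M * reverb ~= clean (eps avoids division by zero)."""
    return clean_spec * np.conj(reverb_spec) / (np.abs(reverb_spec) ** 2 + eps)


# Toy data standing in for STFTs: (frequency bins, frames) is a hypothetical shape.
rng = np.random.default_rng(0)
shape = (257, 100)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
distortion = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
reverb = clean + 0.5 * distortion  # crude stand-in for a reverberant spectrogram

# Oracle magnitude mask: the best possible magnitude, but the phase stays distorted.
mag_mask = np.abs(clean) / (np.abs(reverb) + 1e-8)
est_mag = apply_magnitude_mask(reverb, mag_mask)

# Oracle complex mask: corrects magnitude and phase in one multiplication.
est_complex = apply_complex_mask(reverb, oracle_complex_mask(clean, reverb))

err_mag = np.mean(np.abs(est_mag - clean))
err_complex = np.mean(np.abs(est_complex - clean))
print(f"magnitude mask + reused phase error: {err_mag:.4f}")
print(f"complex mask error:                  {err_complex:.6f}")
```

Even with a perfect magnitude estimate, the reused phase leaves a residual error, whereas the oracle complex mask recovers the toy clean spectrogram almost exactly. In a learned system the mask would be predicted by a network rather than computed from the clean signal.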
