Audio Codec Enhancement with Generative Adversarial Networks

Paper Title

Audio Codec Enhancement with Generative Adversarial Networks

Authors

Biswas, Arijit; Jia, Dai

Abstract

Audio codecs are typically transform-domain based and code stationary audio signals efficiently, but they struggle with speech and with signals containing dense transient events such as applause. Using these two classes of signals as examples, we demonstrate a technique based on generative adversarial networks (GANs) for restoring audio degraded by coding noise. A primary advantage of the proposed GAN-based coded audio enhancer is that it operates end-to-end directly on decoded audio samples, eliminating the need to design any manually crafted front end. Furthermore, the enhancement approach described in this paper can improve the sound quality of low-bitrate coded audio without any modifications to existing standard-compliant encoders. Subjective tests show that the proposed enhancer significantly improves the quality of speech and of difficult-to-code applause excerpts.
