Source coding of audio signals with a generative model

Paper title

Source coding of audio signals with a generative model

Authors

Roy Fejgin, Janusz Klejsa, Lars Villemoes, Cong Zhou

Abstract

We consider source coding of audio signals with the help of a generative model. We use a construction where a waveform is first quantized, yielding a finite bitrate representation. The waveform is then reconstructed by random sampling from a model conditioned on the quantized waveform. The proposed coding scheme is theoretically analyzed. Using SampleRNN as the generative model, we demonstrate that the proposed coding structure provides performance competitive with state-of-the-art source coding tools for specific categories of audio signals.
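The quantize-then-sample structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the uniform quantizer, the step size, and the function names `quantize` and `reconstruct_by_sampling` are assumptions made here for illustration, and a uniform draw within each quantization cell stands in for the trained SampleRNN model, which the sketch does not attempt to reproduce.

```python
import random


def quantize(samples, step):
    """Uniform quantizer: map each sample to an integer index.

    The indices are the finite-bitrate representation in this toy setup.
    """
    return [round(x / step) for x in samples]


def reconstruct_by_sampling(indices, step, rng):
    """Draw each output sample at random, conditioned on its quantized value.

    Assumption: a uniform distribution over the quantization cell is used
    as a placeholder conditional model. A learned generative model would
    replace this draw with a sample from its own conditional distribution.
    """
    return [(i + rng.uniform(-0.5, 0.5)) * step for i in indices]


# Usage: encode a short waveform, then decode by random sampling.
rng = random.Random(0)
waveform = [0.26, -0.6, 0.0, 0.91]
step = 0.5
indices = quantize(waveform, step)
decoded = reconstruct_by_sampling(indices, step, rng)
print(indices)
print(decoded)
```

Because the decoder samples rather than returning a fixed reconstruction point, two decodes of the same indices generally differ, while each output stays within half a step of its quantized value.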
