Paper title

Towards robust music source separation on loud commercial music

Authors

Chang-Bin Jeon, Kyogu Lee

Abstract

Nowadays, commercial music has extreme loudness and a heavily compressed dynamic range compared to the past. Yet in music source separation these characteristics have not been thoroughly considered, resulting in a domain mismatch between the laboratory and the real world. In this paper, we confirm that this domain mismatch negatively affects the performance of music source separation networks. To this end, we first create out-of-domain evaluation datasets, musdb-L and XL, by mimicking the music mastering process. We then quantitatively verify that the performance of state-of-the-art algorithms deteriorates significantly on our datasets. Lastly, we propose LimitAug, a data augmentation method that reduces the domain mismatch by applying an online limiter during the training data sampling process. We confirm that it not only alleviates the performance degradation on our out-of-domain datasets, but also yields higher performance on in-domain data.
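The abstract describes LimitAug only at a high level: a limiter applied online while training data is sampled. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a crude sample-wise peak limiter (instant attack, smoothed release), a random loudness boost before limiting, and that the gain curve computed on the mixture is applied to every stem so the targets still sum to the limited mixture. The names `limiter_gain`, `limit_aug`, `max_boost_db`, and `release` are hypothetical, and only the standard library is used to keep the example portable.

```python
import random


def limiter_gain(mix, threshold=1.0, release=0.999):
    """Per-sample gain of a crude peak limiter (assumed design, not from the paper).

    Attack is instantaneous, so the output never exceeds `threshold`;
    the gain recovers toward 1.0 at a rate set by `release`.
    """
    gains = []
    g = 1.0
    for x in mix:
        # Gain that would bring this sample exactly to the threshold.
        target = min(1.0, threshold / max(abs(x), 1e-9))
        # Drop immediately if needed, otherwise recover slowly toward 1.0.
        g = min(target, release * g + (1.0 - release))
        gains.append(g)
    return gains


def limit_aug(stems, rng, max_boost_db=12.0, threshold=1.0):
    """Boost the stems by a random gain, then limit the resulting mixture.

    stems: list of equal-length lists of float samples, one list per source.
    Returns (limited_mix, limited_stems).
    """
    # Random loudness boost in decibels, converted to a linear factor.
    boost = 10.0 ** (rng.uniform(0.0, max_boost_db) / 20.0)
    loud = [[s * boost for s in stem] for stem in stems]
    mix = [sum(samples) for samples in zip(*loud)]
    gains = limiter_gain(mix, threshold)
    # Assumption: the mixture's gain curve is shared by all stems.
    limited_stems = [[s * g for s, g in zip(stem, gains)] for stem in loud]
    limited_mix = [x * g for x, g in zip(mix, gains)]
    return limited_mix, limited_stems


# Usage: four synthetic noise "stems" stand in for real training audio.
rng = random.Random(0)
stems = [[rng.gauss(0.0, 0.5) for _ in range(2000)] for _ in range(4)]
mix, targets = limit_aug(stems, rng)
```

Sharing one gain curve across stems is a design choice made here so that the separation targets remain consistent with the limited mixture; a production limiter would also typically add look-ahead and channel linking, which this sketch omits.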
