Paper Title

A Comparative Study of Glottal Source Estimation Techniques

Authors

Thomas Drugman, Baris Bozkurt, Thierry Dutoit

Abstract

Source-tract decomposition (or glottal flow estimation) is one of the basic problems of speech processing. Several techniques have been proposed for it in the literature. However, studies comparing different approaches are almost nonexistent. Moreover, experiments have systematically been performed either on synthetic speech or on sustained vowels. In this study, we compare three of the main representative state-of-the-art methods of glottal flow estimation: closed-phase inverse filtering, iterative and adaptive inverse filtering, and mixed-phase decomposition. These techniques are first submitted to an objective assessment test on synthetic speech signals, in which their sensitivity to various factors affecting the estimation quality and their robustness to noise are studied. In a second experiment, their ability to label voice quality (tensed, modal, soft) is studied on a large corpus of real connected speech. It is shown that changes in voice quality are reflected by significant modifications in glottal feature distributions. Techniques based on mixed-phase decomposition and on closed-phase inverse filtering give the best results on both clean synthetic and real speech signals. On the other hand, iterative and adaptive inverse filtering is recommended in noisy environments for its high robustness.
