German Phoneme Recognition with Text-to-Phoneme Data Augmentation

Paper Title

German Phoneme Recognition with Text-to-Phoneme Data Augmentation

Authors

Park, Dojun; Park, Seohyun

Abstract

In this study, we examine how adding the n most frequent phoneme bigrams to the basic vocabulary affects a German phoneme recognition model trained with a text-to-phoneme data augmentation strategy. Compared to the baseline model, the vowel30 and const20 models improved the BLEU score by more than 1 point, whereas the BLEU score of the total30 model dropped by more than 20 points. This shows that phoneme bigrams can have either a positive or a negative effect on model performance. In addition, through error analysis we identified the types of errors that the models made repeatedly.
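The vocabulary extension described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `top_phoneme_bigrams` and `extend_vocabulary`, the toy corpus, and the choice to represent a bigram as the concatenation of its two phoneme symbols are all assumptions made for illustration only.

```python
from collections import Counter


def top_phoneme_bigrams(sequences, n):
    """Return the n most frequent adjacent phoneme pairs as joined strings.

    `sequences` is a list of phoneme lists (hypothetical input format).
    """
    counts = Counter()
    for phonemes in sequences:
        # zip(seq, seq[1:]) yields every adjacent pair in the sequence
        counts.update(zip(phonemes, phonemes[1:]))
    return [a + b for (a, b), _ in counts.most_common(n)]


def extend_vocabulary(base_vocab, sequences, n):
    """Append the n most frequent bigrams to a copy of the base vocabulary."""
    vocab = list(base_vocab)
    for bigram in top_phoneme_bigrams(sequences, n):
        if bigram not in vocab:
            vocab.append(bigram)
    return vocab


# Toy corpus (assumed data, not taken from the paper)
corpus = [
    ["h", "a", "l", "o"],
    ["b", "a", "l"],
    ["t", "a", "l"],
]
base = sorted({p for seq in corpus for p in seq})
print(extend_vocabulary(base, corpus, 1))
# → ['a', 'b', 'h', 'l', 'o', 't', 'al']
```

In this sketch the pair ("a", "l") occurs three times, so it is the single bigram added when n = 1. Restricting the candidates to particular phoneme classes, as the model names vowel30 and const20 might suggest, would require a filter that the abstract does not specify.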
