Paper title
Naturalization of Text by the Insertion of Pauses and Filler Words
Paper authors
Abstract
In this article, we introduce a set of methods to naturalize text, modeled on natural human speech. Voice-based interactions provide a natural way of interfacing with electronic systems and have seen widespread adoption of late. The computerized voices used in these systems can be naturalized to some degree by inserting pauses and filler words at appropriate positions. The first proposed text transformation method uses the frequency of bigrams in the training data to make appropriate insertions in the input sentence. It uses a probability distribution to choose insertions from the set of all possible insertions. This method is fast and can be placed before a Text-To-Speech module. The second method uses a Recurrent Neural Network to predict the next word to be inserted. Its predictions confirm the insertions given by the bigram method. Additionally, the degree of naturalization can be controlled in both methods. Based on a blind survey, we conclude that the output of these text transformation methods is comparable to natural speech.
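The abstract does not spell out the bigram method, so the following is only a minimal sketch of one way such an approach could work, not the authors' implementation. It assumes that training transcripts already contain filler tokens, counts how often each filler follows a given word, and samples insertions in proportion to those counts. The filler set `FILLERS`, the function names, and the `degree` parameter (a multiplier on the insertion probability, standing in for the controllable degree of naturalization) are all hypothetical.

```python
import random
from collections import Counter

# Hypothetical filler and pause tokens; the paper's actual inventory is not given.
FILLERS = {"um", "uh", "<pause>"}


def train_bigram_counts(transcripts):
    """Count (word, following filler) bigrams and how often each word has a successor."""
    pair_counts = Counter()
    word_counts = Counter()
    for sentence in transcripts:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            if prev in FILLERS:
                continue  # only model insertions that follow a real word
            word_counts[prev] += 1
            if nxt in FILLERS:
                pair_counts[(prev, nxt)] += 1
    return pair_counts, word_counts


def naturalize(sentence, pair_counts, word_counts, degree=1.0, rng=None):
    """Insert fillers after words, sampling from the observed bigram frequencies.

    `degree` scales the insertion probability (assumed knob): 0 disables
    insertions, larger values insert more often.
    """
    rng = rng or random.Random()
    out = []
    for word in sentence.split():
        out.append(word)
        key = word.lower()
        total = word_counts.get(key, 0)
        if total == 0:
            continue  # word never seen with a successor in training
        candidates = [(f, c) for (p, f), c in pair_counts.items() if p == key]
        if not candidates:
            continue  # no filler ever followed this word
        # Probability that some filler follows this word, scaled and capped at 1.
        p_insert = min(1.0, degree * sum(c for _, c in candidates) / total)
        if rng.random() < p_insert:
            fillers, weights = zip(*candidates)
            out.append(rng.choices(fillers, weights=weights, k=1)[0])
    return " ".join(out)
```

Because the transformation only needs count lookups per word, a step like this could plausibly run ahead of a Text-To-Speech module with little overhead, which is consistent with the speed claim in the abstract.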