Paper title

An analysis of degenerating speech due to progressive dysarthria on ASR performance

Authors

Tomanek, Katrin; Seaver, Katie; Jiang, Pan-Pan; Cave, Richard; Harrel, Lauren; Green, Jordan R.

Abstract

Although personalized automatic speech recognition (ASR) models have recently been designed to recognize even severely impaired speech, model performance may degrade over time for persons with degenerating speech. The aims of this study were to (1) analyze the change of performance of ASR over time in individuals with degrading speech, and (2) explore mitigation strategies to optimize recognition throughout disease progression. Speech was recorded by four individuals with degrading speech due to amyotrophic lateral sclerosis (ALS). Word error rates (WER) across recording sessions were computed for three ASR models: Unadapted Speaker Independent (U-SI), Adapted Speaker Independent (A-SI), and Adapted Speaker Dependent (A-SD or personalized). The performance of all three models degraded significantly over time as speech became more impaired, but the performance of the A-SD model improved markedly when it was updated with recordings from the severe stages of speech progression. Recording additional utterances early in the disease before speech degraded significantly did not improve the performance of A-SD models. Overall, our findings emphasize the importance of continuous recording (and model retraining) when providing personalized models for individuals with progressive speech impairments.
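The abstract reports word error rate (WER) per recording session without stating how it was computed. As a minimal sketch, assuming the standard definition (word-level edit distance divided by the number of reference words) and pooling over all utterances in a session, it could look like the following. The function names `edit_distance` and `session_wer` are hypothetical, and the whitespace tokenization is an assumption; the paper's actual scoring pipeline is not described here.

```python
from typing import Iterable, List, Tuple


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def session_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """WER for one session: total word errors / total reference words.

    `pairs` is an iterable of (reference transcript, ASR hypothesis) strings.
    Tokenization by whitespace is an assumption made for this sketch.
    """
    errors = 0
    ref_words = 0
    for reference, hypothesis in pairs:
        ref_tokens = reference.split()
        errors += edit_distance(ref_tokens, hypothesis.split())
        ref_words += len(ref_tokens)
    if ref_words == 0:
        raise ValueError("session contains no reference words")
    return errors / ref_words
```

Computing this value for each recording session, once per model (U-SI, A-SI, A-SD), would give the kind of WER-over-time curves the abstract describes.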
