Paper title
Deep segmental phonetic posterior-grams based discovery of non-categories in L2 English speech
Paper authors
Paper abstract
Second language (L2) speech is often labeled with native phone categories. In many cases, however, it is difficult to decide which categorical phone an L2 segment belongs to. Such segments are regarded as non-categories. Most existing approaches to Mispronunciation Detection and Diagnosis (MDD) are concerned only with categorical errors, i.e., a phone category is inserted, deleted, or substituted by another; non-categorical errors are not considered. To model these non-categorical errors, this work explores non-categorical patterns in order to extend the categorical phone set. We apply a phonetic segment classifier to generate segmental phonetic posterior-grams (SPPGs) that represent phone segment-level information, and then discover non-categories by looking for SPPGs with more than one peak. Compared with the baseline system, this approach discovers more non-categorical patterns, and perceptual experiments show that the discovered non-categories are more accurate, with the confusion degree increased by 7.3% and 7.5% under two different measures. Finally, we give a preliminary analysis of the reasons behind these non-categories.
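The idea of looking for SPPGs with more than one peak can be illustrated with a minimal sketch. The abstract does not say how a peak is defined, so everything below is an assumption made for illustration: an SPPG is treated as a single posterior vector over phone categories for one segment, a "peak" is any category whose posterior reaches a hypothetical threshold of 0.2, and the names `find_peaks` and `is_non_categorical`, the phone labels, and the example values are all invented here and are not taken from the paper.

```python
import numpy as np


def find_peaks(sppg, phones, threshold=0.2):
    """Return (phone, posterior) pairs at or above the threshold, highest first.

    Assumption: a "peak" is simply a category whose posterior is >= threshold.
    The threshold value is hypothetical and not taken from the paper.
    """
    sppg = np.asarray(sppg, dtype=float)
    order = np.argsort(sppg)[::-1]  # category indices, highest posterior first
    return [(phones[i], float(sppg[i])) for i in order if sppg[i] >= threshold]


def is_non_categorical(sppg, phones, threshold=0.2):
    """Flag a segment whose posterior vector has more than one peak."""
    return len(find_peaks(sppg, phones, threshold)) > 1


# Illustrative phone labels and made-up posteriors (not data from the paper).
phones = ["iy", "ih", "eh", "ae"]
clear_segment = [0.90, 0.05, 0.03, 0.02]      # one dominant category
ambiguous_segment = [0.48, 0.44, 0.05, 0.03]  # two competing categories

print(is_non_categorical(clear_segment, phones))
print(is_non_categorical(ambiguous_segment, phones))
print(find_peaks(ambiguous_segment, phones))
```

Under these assumptions, a segment whose posterior mass is split between two phones is flagged as a candidate non-category, while a segment dominated by one phone is not. A real system would need a principled peak definition and a trained segment classifier to produce the posteriors.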