Paper Title
Speech Detection For Child-Clinician Conversations In Danish For Low-Resource In-The-Wild Conditions: A Case Study
Authors
Abstract
Using speech models for automatic speech processing tasks can improve the efficiency of screening, analysis, diagnosis and treatment in medicine and psychiatry. However, the performance of speech pre-processing tasks such as segmentation and diarization can drop considerably on in-the-wild clinical data, particularly when the target dataset comprises atypical speech. In this paper, we study the performance of a pre-trained speech model on a dataset of child-clinician conversations in Danish with respect to the classification threshold. Since we do not have access to sufficient labelled data, we propose few-instance threshold adaptation, wherein we use the first minutes of a conversation to obtain the optimum classification threshold. We find that the model with the default classification threshold performs worse on children from the patient group. Furthermore, the error rates of the model are directly correlated with the severity of diagnosis in the patients. Lastly, our study of few-instance adaptation shows that three minutes of clinician-child conversation is sufficient to obtain the optimum classification threshold.
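The few-instance threshold adaptation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the model, the error metric, or the search procedure, so everything below is an assumption made for illustration: the function names `detection_error` and `adapt_threshold` are hypothetical, the error is taken to be the fraction of misclassified frames (missed speech plus false alarms), and the threshold is chosen by a simple grid search over the labelled opening minutes of a conversation.

```python
import numpy as np


def detection_error(probs, labels, threshold):
    """Fraction of frames misclassified at the given threshold.

    Assumed metric: missed speech and false alarms counted equally.
    """
    preds = np.asarray(probs, dtype=float) >= threshold
    return float(np.mean(preds != np.asarray(labels, dtype=bool)))


def adapt_threshold(probs, labels, frame_rate_hz, adapt_minutes=3.0,
                    candidates=None):
    """Pick the threshold that minimises error on the first minutes.

    probs:  per-frame speech probabilities from a pre-trained model
    labels: per-frame reference labels (True = speech)
    Only the first `adapt_minutes` of the conversation are used,
    mirroring the few-instance setting. The candidate grid is an
    assumption, not taken from the paper.
    """
    if candidates is None:
        candidates = np.linspace(0.05, 0.95, 19)
    # Number of frames covered by the adaptation window.
    n = min(len(probs), int(adapt_minutes * 60 * frame_rate_hz))
    head_probs = np.asarray(probs[:n], dtype=float)
    head_labels = np.asarray(labels[:n], dtype=bool)
    errors = [detection_error(head_probs, head_labels, t) for t in candidates]
    # On ties, argmin returns the lowest candidate threshold.
    return float(candidates[int(np.argmin(errors))])
```

In use, the adapted threshold would be estimated once per conversation from its labelled opening segment and then applied to the remaining, unlabelled frames in place of the model's default threshold.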