Paper Title

Infant Crying Detection in Real-World Environments

Authors

Xuewen Yao, Megan Micheletti, Mckensey Johnson, Edison Thomaz, Kaya de Barbaro

Abstract

Most existing cry detection models have been tested with data collected in controlled settings. Thus, the extent to which they generalize to noisy and lived environments is unclear. In this paper, we evaluate several established machine learning approaches including a model leveraging both deep spectrum and acoustic features. This model was able to recognize crying events with F1 score 0.613 (Precision: 0.672, Recall: 0.552), showing improved external validity over existing methods at cry detection in everyday real-world settings. As part of our evaluation, we collect and annotate a novel dataset of infant crying compiled from over 780 hours of labeled real-world audio data, captured via recorders worn by infants in their homes, which we make publicly available. Our findings confirm that a cry detection model trained on in-lab data underperforms when presented with real-world data (in-lab test F1: 0.656, real-world test F1: 0.236), highlighting the value of our new dataset and model.
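
For readers unfamiliar with the metric, the F1 score is conventionally the harmonic mean of precision and recall. The sketch below is illustrative only and is not the authors' evaluation code; the function name `f1_score` is a hypothetical helper. Applied to the precision and recall quoted in the abstract it gives about 0.606, slightly below the reported 0.613. The abstract does not say how the scores were aggregated, so the reported F1 need not equal the harmonic mean of the reported precision and recall (for example, if each metric was averaged separately over folds or recordings, which is an assumption here).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract.
reported_precision = 0.672
reported_recall = 0.552

# F1 implied by those two numbers alone (not the paper's reported F1).
f1_from_reported = f1_score(reported_precision, reported_recall)
print(f"F1 from reported precision and recall: {f1_from_reported:.3f}")
```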
