Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones

Paper Title

Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones

Authors

Shota Horiguchi, Yusuke Fujita, Kenji Nagamatsu

Abstract

A novel framework for meeting transcription using asynchronous microphones is proposed in this paper. It consists of audio synchronization, speaker diarization, utterance-wise speech enhancement using guided source separation, automatic speech recognition, and duplication reduction. Performing speaker diarization before speech enhancement enables the system to deal with overlapped speech without considering sampling frequency mismatch between microphones. Evaluation on our real meeting datasets showed that our framework achieved a character error rate (CER) of 28.7% by using 11 distributed microphones, while a monaural microphone placed at the center of the table had a CER of 38.2%. We also showed that our framework achieved a CER of 21.8%, which is only 2.1 percentage points higher than the CER in headset microphone-based transcription.
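
The abstract reports its results as character error rates. As a minimal sketch of how that metric is commonly computed, the snippet below divides the character-level Levenshtein distance between a reference and a hypothesis transcript by the reference length. The function names `edit_distance` and `character_error_rate` are illustrative, and the abstract does not describe the paper's own scoring setup (text normalization, utterance alignment, and so on), so this should be read as the standard definition and not as the authors' evaluation code.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the first i-1 chars of ref and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference chars into an empty hypothesis costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

With this definition, a CER of 28.7% corresponds to roughly 29 character edits per 100 reference characters. Because insertions are counted, the value can exceed 1.0 for very poor hypotheses.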
