Paper Title
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational (RAMC) Speech Dataset
Paper Authors
Paper Abstract
This paper introduces MagicData-RAMC, a high-quality, richly annotated Mandarin conversational (RAMC) speech dataset. The MagicData-RAMC corpus contains 180 hours of conversational speech recorded from native speakers of Mandarin Chinese over mobile phones at a sampling rate of 16 kHz. The dialogs in MagicData-RAMC are classified into 15 diversified domains and tagged with topic labels, ranging from science and technology to ordinary life. Accurate transcriptions and precise speaker voice activity timestamps are manually labeled for each sample. Detailed speaker information is also provided. As a Mandarin speech dataset designed for dialog scenarios with high quality and rich annotations, MagicData-RAMC enriches the data diversity in the Mandarin speech community and allows extensive research on a series of speech-related tasks, including automatic speech recognition, speaker diarization, topic detection, keyword search, and text-to-speech. We also conduct experiments on several relevant tasks and report the results to help evaluate the dataset.