Paper title
A New Dataset for Amateur Vocal Percussion Analysis
Paper authors
Paper abstract
Imitating percussive instruments with the human voice is a natural way to communicate rhythmic ideas, and for this reason it attracts the interest of music makers. Specifically, automatically mapping these vocal imitations to the instruments they emulate would allow creators to prototype realistic rhythms faster. The contribution of this study is two-fold. First, a new Amateur Vocal Percussion (AVP) dataset is introduced to investigate how people with little or no beatboxing experience approach the task of vocal percussion. The end goal of this analysis is to help mapping algorithms generalise better between subjects and achieve higher performance. The dataset comprises a total of 9780 utterances recorded by 28 participants, with fully annotated onsets and labels (kick drum, snare drum, closed hi-hat and open hi-hat). Second, we conducted baseline experiments on audio onset detection with the recorded dataset, comparing the performance of four state-of-the-art algorithms in a vocal percussion context.
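To make the onset detection baseline more concrete, the following is a minimal sketch of how detected onsets are commonly scored against annotated ones. The abstract does not describe the evaluation protocol, so everything here is an assumption for illustration: the function names `match_onsets` and `f_measure`, the greedy one-to-one matching, and the 50 ms tolerance window are hypothetical choices, not the authors' method.

```python
from typing import List, Tuple


def match_onsets(reference: List[float], detected: List[float],
                 tolerance: float = 0.05) -> Tuple[int, int, int]:
    """Greedily match onset times (in seconds) one-to-one.

    A detection counts as a hit if it lies within +/- tolerance of an
    unmatched annotated onset. The 0.05 s default is an assumed value.
    Returns (true positives, false positives, false negatives).
    """
    ref = sorted(reference)
    det = sorted(detected)
    i = j = hits = 0
    while i < len(ref) and j < len(det):
        diff = det[j] - ref[i]
        if abs(diff) <= tolerance:
            hits += 1      # detection matches this annotation
            i += 1
            j += 1
        elif diff < 0:
            j += 1         # detection is too early: false positive
        else:
            i += 1         # annotation has no detection: false negative
    return hits, len(det) - hits, len(ref) - hits


def f_measure(reference: List[float], detected: List[float],
              tolerance: float = 0.05) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one recording."""
    tp, fp, fn = match_onsets(reference, detected, tolerance)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up onset times, for illustration only (not AVP data).
    annotated = [0.50, 1.00, 1.50, 2.00]
    estimated = [0.52, 1.20, 1.49, 2.30]
    print(f_measure(annotated, estimated))
```

Under these assumptions, comparing several onset detectors would amount to running each one on every recording and averaging the resulting scores per algorithm.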