Deep Autotuner: a Pitch Correcting Network for Singing Performances

Paper title

Deep Autotuner: a Pitch Correcting Network for Singing Performances

Authors

Sanna Wager, George Tzanetakis, Cheng-i Wang, Minje Kim

Abstract

We introduce a data-driven approach to automatic pitch correction of solo singing performances. The proposed approach predicts note-wise pitch shifts from the relationship between the respective spectrograms of the singing and accompaniment. This approach differs from commercial systems, where vocal track notes are usually shifted to be centered around pitches in a user-defined score, or mapped to the closest pitch among the twelve equal-tempered scale degrees. The proposed system treats pitch as a continuous value rather than relying on a set of discretized notes found in musical scores, thus allowing for improvisation and harmonization in the singing performance. We train our neural network model using a dataset of 4,702 amateur karaoke performances selected for good intonation. Our model is trained on both incorrect intonation, for which it learns a correction, and intentional pitch variation, which it learns to preserve. The proposed deep neural network with gated recurrent units on top of convolutional layers shows promising performance on the real-world score-free singing pitch correction task of autotuning.
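For contrast with the continuous, score-free approach described in the abstract, the conventional baseline of mapping a sung pitch to the closest of the twelve equal-tempered scale degrees can be sketched as follows. This is only an illustration of that baseline, not the proposed network. The function name, the A4 = 440 Hz reference, and the choice to report the correction in cents are assumptions made for this example.

```python
import math

A4_HZ = 440.0  # assumed reference tuning; not specified in the abstract


def nearest_equal_tempered(freq_hz: float) -> tuple:
    """Snap a frequency to the closest equal-tempered pitch.

    Returns (target frequency in Hz, correction in cents), where a
    positive correction means the input must be shifted upward.
    Hypothetical helper for illustration only.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Distance from A4 in (fractional) semitones.
    semitones = 12.0 * math.log2(freq_hz / A4_HZ)
    # Closest discrete scale degree.
    nearest = round(semitones)
    target_hz = A4_HZ * 2.0 ** (nearest / 12.0)
    # 100 cents per semitone.
    shift_cents = 100.0 * (nearest - semitones)
    return target_hz, shift_cents


if __name__ == "__main__":
    target, cents = nearest_equal_tempered(450.0)
    print(f"target = {target:.2f} Hz, shift = {cents:.1f} cents")
```

A rule like this always pulls a note to the nearest semitone, so an intentional bend or a harmony that sits between scale degrees would be flattened out. The abstract's approach instead predicts a continuous shift per note from the singing and accompaniment spectrograms.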
