Paper title
A Two-Stage U-Net for High-Fidelity Denoising of Historical Recordings
Paper authors
Paper abstract
Enhancing the sound quality of historical music recordings is a long-standing problem. This paper presents a novel denoising method based on a fully-convolutional deep neural network. A two-stage U-Net model architecture is designed to model and suppress the degradations with high fidelity. The method processes the time-frequency representation of audio, and is trained using realistic noisy data to jointly remove hiss, clicks, thumps, and other common additive disturbances from old analog discs. The proposed model outperforms previous methods in both objective and subjective metrics. The results of a formal blind listening test show that real gramophone recordings denoised with this method have significantly better quality than the baseline methods. This study shows the importance of realistic training data and the power of deep learning in audio restoration.
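The abstract describes the disturbances as additive and the training data as noisy versions of clean audio. A minimal sketch of that additive model, y = x + g·n, is given below, with the gain g chosen to reach a target signal-to-noise ratio. The function names `mix_at_snr` and `measure_snr_db`, and the idea of selecting a specific SNR, are illustrative assumptions; the abstract does not state how the authors built their training pairs.

```python
import math
import random


def mix_at_snr(clean, noise, snr_db):
    """Return clean + g * noise, with g set so the mixture has the given SNR (dB).

    Hypothetical helper: illustrates the additive noise model only.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean = sum(x * x for x in clean) / len(clean)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_clean == 0.0 or p_noise == 0.0:
        raise ValueError("both signals must have non-zero power")
    # SNR = 10 * log10(p_clean / (g^2 * p_noise))  =>  solve for g
    gain = math.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [x + gain * n for x, n in zip(clean, noise)]


def measure_snr_db(clean, noisy):
    """Measure the SNR (dB) of a noisy signal against its clean reference."""
    p_clean = sum(x * x for x in clean)
    p_resid = sum((y - x) ** 2 for x, y in zip(clean, noisy))
    return 10.0 * math.log10(p_clean / p_resid)


# Toy example: a 440 Hz sine as "clean" audio, Gaussian noise as a stand-in for hiss.
rng = random.Random(0)
sample_rate = 8000
clean = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]
noisy = mix_at_snr(clean, noise, snr_db=10.0)
```

In a setup like the one the abstract describes, `noise` would be taken from real recordings rather than synthesized, and a denoiser would be trained to recover `clean` from a time-frequency representation of `noisy`.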