Self-Remixing: Unsupervised Speech Separation via Separation and Remixing

Paper Title

Self-Remixing: Unsupervised Speech Separation via Separation and Remixing

Authors

Kohei Saijo, Tetsuji Ogawa

Abstract

We present Self-Remixing, a novel self-supervised speech separation method that refines a pre-trained separation model in an unsupervised manner. The proposed method consists of a shuffler module and a solver module, which grow together through separation and remixing processes. Specifically, the shuffler first separates observed mixtures and makes pseudo-mixtures by shuffling and remixing the separated signals. The solver then separates the pseudo-mixtures and remixes the separated signals back to the observed mixtures. The solver is trained using the observed mixtures as supervision, while the shuffler's weights are updated by taking a moving average with the solver's weights, so that it generates pseudo-mixtures with fewer distortions. Our experiments demonstrate that Self-Remixing outperforms existing remixing-based self-supervised methods at the same or lower training cost in the unsupervised setup. Self-Remixing also outperforms baselines in semi-supervised domain adaptation, showing its effectiveness in multiple setups.
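
The data flow described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names (`remix_step`, `ema_update`, `mse_loss`, `split_halves`), the MSE loss, and the decay value are assumptions made here, and the sketch assumes the solver returns sources in the same order as the shuffler (no permutation ambiguity). The gradient update of the solver is omitted.

```python
import numpy as np

def remix_step(shuffler, solver, mixtures, rng):
    """One forward pass: separate, shuffle, remix, separate, remix back.

    shuffler, solver: callables mapping (B, T) mixtures to (B, N, T) sources.
    Returns the reconstructed mixtures (B, T) and the pseudo-mixtures (B, T).
    """
    batch = mixtures.shape[0]
    sources = shuffler(mixtures)                       # (B, N, T)
    n_src = sources.shape[1]
    # One independent permutation of the batch per source slot.
    perms = np.stack([rng.permutation(batch) for _ in range(n_src)])
    # shuffled[b, n] = sources[perms[n, b], n]
    shuffled = np.stack(
        [sources[perms[n], n] for n in range(n_src)], axis=1
    )
    pseudo = shuffled.sum(axis=1)                      # pseudo-mixtures
    estimates = solver(pseudo)                         # (B, N, T)
    # Undo the shuffle so each estimate returns to its original mixture.
    restored = np.empty_like(estimates)
    for n in range(n_src):
        restored[perms[n], n] = estimates[:, n]
    reconstructed = restored.sum(axis=1)               # (B, T)
    return reconstructed, pseudo

def mse_loss(reconstructed, mixtures):
    """Assumed loss: the observed mixtures act as the supervision signal."""
    return float(np.mean((reconstructed - mixtures) ** 2))

def ema_update(shuffler_params, solver_params, decay=0.999):
    """Moving-average update of the shuffler weights (decay is an assumption)."""
    return {
        k: decay * shuffler_params[k] + (1.0 - decay) * solver_params[k]
        for k in shuffler_params
    }

def split_halves(x):
    """Toy stand-in for a separator: source 0 is the first half of each
    signal in time, source 1 is the second half."""
    half = x.shape[-1] // 2
    first, second = x.copy(), x.copy()
    first[..., half:] = 0.0
    second[..., :half] = 0.0
    return np.stack([first, second], axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
rec, pseudo = remix_step(split_halves, split_halves, x, rng)
loss = mse_loss(rec, x)
```

With the toy separator, the solver recovers the shuffled sources exactly, so un-shuffling and remixing reproduces the observed mixtures and the loss is zero. With a real model, the loss would be non-zero and would drive the solver's training, after which `ema_update` would pull the shuffler's weights toward the solver's.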
