Paper title
The UPC Speaker Verification System Submitted to VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20)
Authors
Abstract
This report describes the submission from the Technical University of Catalonia (UPC) to the VoxCeleb Speaker Recognition Challenge (VoxSRC-20) at Interspeech 2020. The final submission is a combination of three systems. System-1 is an autoencoder-based approach that tries to reconstruct similar i-vectors, whereas System-2 and System-3 are Convolutional Neural Network (CNN) based siamese architectures. The siamese networks have two and three branches, respectively, where each branch is a CNN encoder. The double-branch siamese performs binary classification using cross-entropy loss during training. The triple-branch siamese, by contrast, is trained to learn speaker embeddings using triplet loss. We provide results of our systems on the VoxCeleb-1 test set and on the VoxSRC-20 validation and test sets.
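The two training objectives named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance metric, the margin, or how the double-branch output is produced, so the Euclidean distance, the default margin of 1.0, and the function names `euclidean_distance`, `triplet_loss`, and `binary_cross_entropy` are all assumptions made for illustration. Embeddings are represented as plain Python lists standing in for CNN encoder outputs.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss for one (anchor, positive, negative) embedding triple.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (an assumed hyperparameter).
    """
    d_pos = euclidean_distance(anchor, positive)  # same-speaker distance
    d_neg = euclidean_distance(anchor, negative)  # different-speaker distance
    return max(0.0, d_pos - d_neg + margin)


def binary_cross_entropy(prob_same, label):
    """Cross-entropy for a same-speaker (1) vs. different-speaker (0) decision.

    `prob_same` is the assumed output of a double-branch network: the
    predicted probability that both inputs come from the same speaker.
    """
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(prob_same, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
```

In this sketch a triple-branch network would be trained by averaging `triplet_loss` over sampled triples, while a double-branch network would average `binary_cross_entropy` over labelled utterance pairs.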