The UPC Speaker Verification System Submitted to VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20)

Paper Title

The UPC Speaker Verification System Submitted to VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20)

Authors

Umair Khan, Javier Hernando

Abstract

This report describes the submission from the Technical University of Catalonia (UPC) to the VoxCeleb Speaker Recognition Challenge (VoxSRC-20) at Interspeech 2020. The final submission is a combination of three systems. System-1 is an autoencoder-based approach that tries to reconstruct similar i-vectors, whereas System-2 and System-3 are siamese architectures based on Convolutional Neural Networks (CNNs). The siamese networks have two and three branches, respectively, and each branch is a CNN encoder. The double-branch siamese network performs binary classification using cross-entropy loss during training, whereas the triple-branch siamese network is trained to learn speaker embeddings using triplet loss. We provide results of our systems on the VoxCeleb-1 test set and on the VoxSRC-20 validation and test sets.
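The two training objectives named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, and the margin value of 0.2 are assumptions chosen for illustration, since the abstract specifies neither the distance measure nor the hyperparameters. Embeddings are shown as plain Python lists rather than CNN encoder outputs.

```python
import math


def binary_cross_entropy(prob_same, label):
    """Cross-entropy for a same/different-speaker decision.

    prob_same: predicted probability that both inputs share a speaker
               (assumed to lie strictly between 0 and 1).
    label:     1 if the pair is from the same speaker, 0 otherwise.
    """
    return -(label * math.log(prob_same) + (1 - label) * math.log(1 - prob_same))


def euclidean(a, b):
    """Euclidean distance between two equal-length embeddings (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss; the margin of 0.2 is a hypothetical value.

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin`.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

In this sketch the double-branch objective scores one pair at a time, while the triple-branch objective compares an anchor against one same-speaker and one different-speaker example, which is why the latter needs three encoder branches.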
