Paper Title

Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings

Authors

Niko Brümmer, Albert Swart, Ladislav Mošner, Anna Silnova, Oldřich Plchot, Themos Stafylakis, Lukáš Burget

Abstract

In speaker recognition, where speech segments are mapped to embeddings on the unit hypersphere, two scoring backends are commonly used: cosine scoring and PLDA. Both have advantages and disadvantages, depending on the context. Cosine scoring follows naturally from the spherical geometry, but for PLDA the blessing is mixed: length normalization Gaussianizes the between-speaker distribution, but violates the assumption of a speaker-independent within-speaker distribution. We propose PSDA, an analogue to PLDA that uses von Mises-Fisher distributions on the hypersphere for both the within-class and between-class distributions. We show how the self-conjugacy of this distribution gives closed-form likelihood-ratio scores, making it a drop-in replacement for PLDA at scoring time. All kinds of trials can be scored, including single-enroll and multi-enroll verification, as well as more complex likelihood ratios that could be used in clustering and diarization. Learning is done via an EM algorithm with closed-form updates. We explain the model and present some first experiments.
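
The abstract does not spell out the scoring formula, so the following is only an illustrative sketch of how von Mises-Fisher self-conjugacy can yield a closed-form log-likelihood ratio, shown next to plain cosine scoring. It assumes a hidden speaker direction drawn from a von Mises-Fisher prior with mean `mu` and concentration `b`, and embeddings drawn around that direction with concentration `w`. These symbol names, the helper functions, and the restriction to 3 dimensions (where the normalizer is `kappa / (4*pi*sinh(kappa))`) are assumptions made for this example, not details taken from the paper.

```python
import math

def log_c3(kappa):
    """Log normalizer of the von Mises-Fisher density on the 3-D unit sphere.

    C(kappa) = kappa / (4*pi*sinh(kappa)), with limit 1/(4*pi) as kappa -> 0.
    """
    if kappa < 1e-8:
        return -math.log(4.0 * math.pi)
    # log sinh(k) = k + log(1 - exp(-2k)) - log 2, stable for large k
    log_sinh = kappa + math.log(-math.expm1(-2.0 * kappa)) - math.log(2.0)
    return math.log(kappa) - math.log(4.0 * math.pi) - log_sinh

def unit(v):
    """Length-normalize a vector onto the unit sphere."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def cosine_score(x, y):
    """Cosine scoring baseline: dot product of length-normalized vectors."""
    return sum(a * b for a, b in zip(unit(x), unit(y)))

def natural_norm(mu, b, w, embeddings):
    """Norm of the posterior natural parameter b*mu + w*sum(embeddings)."""
    total = [b * m for m in mu]
    for x in embeddings:
        total = [t + w * c for t, c in zip(total, x)]
    return math.sqrt(sum(t * t for t in total))

def psda_llr(enroll, test, mu, b, w):
    """Log LR of 'same speaker' vs 'different speakers' (toy 3-D sketch).

    Integrating out the hidden speaker direction gives, per set of
    embeddings, a marginal likelihood proportional to 1/C(natural_norm),
    so the ratio reduces to four log-normalizer evaluations.
    """
    k_e = natural_norm(mu, b, w, enroll)
    k_t = natural_norm(mu, b, w, test)
    k_both = natural_norm(mu, b, w, enroll + test)
    return log_c3(k_e) + log_c3(k_t) - log_c3(b) - log_c3(k_both)

# Hypothetical parameters and embeddings, chosen only for illustration.
mu = [0.0, 0.0, 1.0]
b, w = 1.0, 10.0
x = unit([1.0, 0.1, 0.0])
y_near = unit([1.0, 0.0, 0.1])
y_far = unit([-1.0, 0.0, 0.1])

same = psda_llr([x], [y_near], mu, b, w)          # single-enroll trial
diff = psda_llr([x], [y_far], mu, b, w)
multi = psda_llr([x, y_near], [y_near], mu, b, w)  # multi-enroll trial
```

In this sketch, multi-enroll scoring needs no special handling: enrollment embeddings are simply summed into the natural parameter. For realistic embedding dimensions the normalizer involves a modified Bessel function of the first kind rather than `sinh`, so `log_c3` would have to be replaced by a numerically stable high-dimensional equivalent.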
