Paper Title
Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation
Authors
Abstract
Simultaneous speech translation (SimulST) is a challenging task that aims to translate streaming speech before the complete input is observed. A SimulST system generally includes two components: the pre-decision, which aggregates the speech information, and the policy, which decides whether to read or write. While recent works have proposed various strategies to improve the pre-decision, they mainly adopt the fixed wait-k policy, leaving adaptive policies rarely explored. This paper proposes to model the adaptive policy by adapting the Continuous Integrate-and-Fire (CIF) mechanism. Compared with monotonic multihead attention (MMA), our method has the advantages of simpler computation, superior quality at low latency, and better generalization to long utterances. We conduct experiments on the MuST-C V2 dataset and show the effectiveness of our approach.
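To make the contrast between a fixed and an adaptive policy concrete, the following is a minimal sketch and not the paper's implementation. It assumes the standard wait-k rule (write only once the source is k units ahead of the target) and the basic integrate-and-fire step (accumulate a per-frame weight and fire when the sum reaches a threshold). The function names `wait_k_action` and `cif_fire_steps`, the threshold of 1.0, and the example weights are all hypothetical, chosen for illustration only.

```python
def wait_k_action(num_read, num_written, k, source_finished):
    """Fixed wait-k rule: WRITE only when the source leads the target by k units.

    Once the source has ended there is nothing left to read, so always WRITE.
    """
    if source_finished or num_read - num_written >= k:
        return "WRITE"
    return "READ"


def cif_fire_steps(alphas, threshold=1.0):
    """Integrate-and-fire over per-frame weights (assumed to lie in [0, 1]).

    Returns the 0-based frame indices at which the accumulated weight
    reaches the threshold. The leftover weight carries over to the next
    segment, so firing times depend on the input rather than on a fixed k.
    """
    fired = []
    accumulated = 0.0
    for t, alpha in enumerate(alphas):
        accumulated += alpha
        if accumulated >= threshold:
            fired.append(t)
            accumulated -= threshold  # keep the remainder for the next unit
    return fired


if __name__ == "__main__":
    # Hypothetical weights; in a real model they would be predicted per frame.
    print(cif_fire_steps([0.25, 0.5, 0.5, 0.25, 0.75, 0.5]))
    print(wait_k_action(num_read=2, num_written=0, k=3, source_finished=False))
```

In this sketch the wait-k decision depends only on counts, whereas the firing positions depend on the weights themselves, which is the sense in which an integrate-and-fire boundary can drive an input-dependent read/write schedule.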