Paper Title

Learning not to Discriminate: Task Agnostic Learning for Improving Monolingual and Code-switched Speech Recognition

Paper Authors

Gurunath Reddy Madhumani, Sanket Shah, Basil Abraham, Vikas Joshi, Sunayana Sitaram

Paper Abstract

Recognizing code-switched speech is challenging for Automatic Speech Recognition (ASR) for a variety of reasons, including the lack of code-switched training data. Recently, we showed that monolingual ASR systems fine-tuned on code-switched data deteriorate in performance on monolingual speech recognition, which is not desirable because ASR systems deployed in multilingual scenarios should recognize both monolingual and code-switched speech with high accuracy. Our experiments indicated that this loss in performance could be mitigated by using certain strategies for fine-tuning and regularization, leading to improvements in both monolingual and code-switched ASR. In this work, we present further improvements over our previous work by using domain adversarial learning to train task-agnostic models. We evaluate the classification accuracy of an adversarial discriminator and show that it can learn shared-layer parameters that are task agnostic. We train end-to-end ASR systems starting with a pooled model that uses monolingual and code-switched data along with the adversarial discriminator. Our proposed technique leads to reductions in Word Error Rate (WER) on monolingual and code-switched test sets across three language pairs.
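
The mechanism described in the abstract is domain adversarial training: a discriminator tries to predict whether an utterance is monolingual or code-switched from the shared encoder representations, while a gradient reversal layer pushes the shared layers toward features the discriminator cannot separate, i.e. task-agnostic features. Below is a minimal PyTorch-style sketch of this idea under assumed shapes and names (GradientReversal, TaskDiscriminator, lam are illustrative); it is not the authors' implementation, and the paper does not specify these architectural details.

```python
import torch
import torch.nn as nn


class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; negates and scales gradients in the
    backward pass, so the layers below are trained to fool the discriminator."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse the gradient flowing back into the shared encoder layers.
        return -ctx.lam * grad_output, None


class TaskDiscriminator(nn.Module):
    """Predicts whether an utterance is monolingual (0) or code-switched (1)
    from the shared encoder's hidden states."""

    def __init__(self, hidden_dim: int, lam: float = 1.0):
        super().__init__()
        self.lam = lam
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Linear(hidden_dim // 2, 2),
        )

    def forward(self, shared_features: torch.Tensor) -> torch.Tensor:
        # shared_features: (batch, time, hidden_dim); pool over time,
        # reverse gradients, then classify the task.
        pooled = shared_features.mean(dim=1)
        reversed_feats = GradientReversal.apply(pooled, self.lam)
        return self.classifier(reversed_feats)


# Hypothetical usage inside one training step of the pooled ASR model:
# logits = discriminator(encoder_states)   # encoder_states from the shared layers
# disc_loss = nn.functional.cross_entropy(logits, task_labels)
# total_loss = asr_loss + disc_loss        # gradient reversal makes this adversarial
```

In a full training loop, the discriminator's cross-entropy loss would be added to the ASR objective of the pooled model; because gradients are reversed before reaching the shared layers, minimizing the combined loss simultaneously trains the discriminator to classify the task and trains the shared encoder to make monolingual and code-switched inputs indistinguishable.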
