Improving Spoken Language Understanding By Exploiting ASR N-best Hypotheses

Paper title

Improving Spoken Language Understanding By Exploiting ASR N-best Hypotheses

Authors

Mingda Li, Weitong Ruan, Xinyue Liu, Luca Soldaini, Wael Hamza, Chengwei Su

Abstract

In a modern spoken language understanding (SLU) system, the natural language understanding (NLU) module takes interpretations of a speech from the automatic speech recognition (ASR) module as the input. The NLU module usually uses the first best interpretation of a given speech in downstream tasks such as domain and intent classification. However, the ASR module might misrecognize some speeches and the first best interpretation could be erroneous and noisy. Solely relying on the first best interpretation could make the performance of downstream tasks non-optimal. To address this issue, we introduce a series of simple yet efficient models for improving the understanding of semantics of the input speeches by collectively exploiting the n-best speech interpretations from the ASR module.
