Conversational Question Answering on Heterogeneous Sources

Paper title

Conversational Question Answering on Heterogeneous Sources

Authors

Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum

Abstract

Conversational question answering (ConvQA) tackles sequential information needs where contexts in follow-up questions are left implicit. Current ConvQA systems operate over homogeneous sources of information: either a knowledge base (KB), or a text corpus, or a collection of tables. This paper addresses the novel issue of jointly tapping into all of these together, this way boosting answer coverage and confidence. We present CONVINSE, an end-to-end pipeline for ConvQA over heterogeneous sources, operating in three stages: i) learning an explicit structured representation of an incoming question and its conversational context, ii) harnessing this frame-like representation to uniformly capture relevant evidences from KB, text, and tables, and iii) running a fusion-in-decoder model to generate the answer. We construct and release the first benchmark, ConvMix, for ConvQA over heterogeneous sources, comprising 3000 real-user conversations with 16000 questions, along with entity annotations, completed question utterances, and question paraphrases. Experiments demonstrate the viability and advantages of our method, compared to state-of-the-art baselines.
