Paper Title

Conversational Question Answering on Heterogeneous Sources

Authors

Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum

Abstract

Conversational question answering (ConvQA) tackles sequential information needs where contexts in follow-up questions are left implicit. Current ConvQA systems operate over homogeneous sources of information: either a knowledge base (KB), or a text corpus, or a collection of tables. This paper addresses the novel issue of jointly tapping into all of these together, this way boosting answer coverage and confidence. We present CONVINSE, an end-to-end pipeline for ConvQA over heterogeneous sources, operating in three stages: i) learning an explicit structured representation of an incoming question and its conversational context, ii) harnessing this frame-like representation to uniformly capture relevant evidences from KB, text, and tables, and iii) running a fusion-in-decoder model to generate the answer. We construct and release the first benchmark, ConvMix, for ConvQA over heterogeneous sources, comprising 3000 real-user conversations with 16000 questions, along with entity annotations, completed question utterances, and question paraphrases. Experiments demonstrate the viability and advantages of our method, compared to state-of-the-art baselines.
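
To make the three stages in the abstract concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the names Frame, Evidence, build_frame, retrieve, and answer are hypothetical, the structured representation is built with a simple stop-word rule instead of a learned model, and a majority vote stands in for the fusion-in-decoder model. The evidence list is hand-written toy data, not taken from ConvMix.

```python
from collections import Counter
from dataclasses import dataclass

# Small stop-word list used by the rule-based stand-in for stage (i).
STOP_WORDS = {"who", "what", "when", "where", "which", "it", "the", "a", "is", "was"}


@dataclass(frozen=True)
class Frame:
    """Hypothetical frame-like representation of a question in its context."""
    topic: str        # entity carried over from earlier turns
    keywords: tuple   # content words of the current question


@dataclass(frozen=True)
class Evidence:
    source: str   # "kb", "text", or "table"
    text: str     # evidence verbalized as plain text
    answer: str   # candidate answer carried by this evidence (toy shortcut)


def build_frame(topic: str, question: str) -> Frame:
    """Stage (i), rule-based stand-in: make the implicit context explicit."""
    words = question.lower().rstrip("?").split()
    return Frame(topic, tuple(w for w in words if w not in STOP_WORDS))


def retrieve(frame: Frame, evidences: list) -> list:
    """Stage (ii): apply the same matching rule to every source type."""
    hits = []
    for ev in evidences:
        text = ev.text.lower()
        if frame.topic.lower() in text and any(k in text for k in frame.keywords):
            hits.append(ev)
    return hits


def answer(hits: list):
    """Stage (iii), stand-in: majority vote instead of fusion-in-decoder."""
    if not hits:
        return None
    return Counter(ev.answer for ev in hits).most_common(1)[0][0]


# Hand-written toy evidence, one list covering all three source types.
TOY_EVIDENCE = [
    Evidence("kb", "Inception, directed by, Christopher Nolan", "Christopher Nolan"),
    Evidence("text", "Inception is a 2010 film directed by Christopher Nolan.", "Christopher Nolan"),
    Evidence("table", "Film: Inception, Release year: 2010", "2010"),
    Evidence("text", "Interstellar was directed by Christopher Nolan.", "Christopher Nolan"),
]

if __name__ == "__main__":
    # Follow-up question whose subject ("it") is left implicit.
    frame = build_frame("Inception", "Who directed it?")
    hits = retrieve(frame, TOY_EVIDENCE)
    print([ev.source for ev in hits], answer(hits))
```

In this sketch the follow-up question "Who directed it?" only becomes answerable once the topic from the earlier turn is attached to it. The KB fact and the text passage both support the same answer, which illustrates, in a very simplified way, how drawing on several sources can raise confidence in an answer.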
