Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing

Paper title

Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing

Authors

Lin, Xi Victoria, Socher, Richard, Xiong, Caiming

Abstract

We present BRIDGE, a powerful sequential architecture for modeling dependencies between natural language questions and relational databases in cross-DB semantic parsing. BRIDGE represents the question and DB schema in a tagged sequence where a subset of the fields are augmented with cell values mentioned in the question. The hybrid sequence is encoded by BERT with minimal subsequent layers and the text-DB contextualization is realized via the fine-tuned deep attention in BERT. Combined with a pointer-generator decoder with schema-consistency driven search space pruning, BRIDGE attained state-of-the-art performance on popular cross-DB text-to-SQL benchmarks, Spider (71.1% dev, 67.5% test with ensemble model) and WikiSQL (92.6% dev, 91.9% test). Our analysis shows that BRIDGE effectively captures the desired cross-modal dependencies and has the potential to generalize to more text-DB related tasks. Our implementation is available at https://github.com/salesforce/TabularSemanticParsing.
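
The tagged hybrid sequence described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `serialize_hybrid_sequence`, the marker tokens `[T]`, `[C]` and `[V]`, the `table.column` key format, and the toy schema are all assumptions made for illustration, and a plain case-insensitive substring check stands in for whatever value-matching procedure the paper actually uses.

```python
from typing import Dict, List


def serialize_hybrid_sequence(
    question: str,
    schema: Dict[str, List[str]],
    cell_values: Dict[str, List[str]],
) -> str:
    """Flatten a question and a DB schema into one tagged string.

    Hypothetical format: [CLS] question [SEP] [T] table [C] column [V] value ... [SEP]
    A column is followed by [V] value only when that value appears in the question.
    """
    parts = ["[CLS]", question, "[SEP]"]
    question_lower = question.lower()
    for table, columns in schema.items():
        parts += ["[T]", table]
        for column in columns:
            parts += ["[C]", column]
            # Assumption: naive substring matching replaces the paper's matching step.
            for value in cell_values.get(f"{table}.{column}", []):
                if value.lower() in question_lower:
                    parts += ["[V]", value]
    parts.append("[SEP]")
    return " ".join(parts)


# Toy example (invented data, for illustration only).
example = serialize_hybrid_sequence(
    "How many singers are from France?",
    {"singer": ["name", "country"]},
    {"singer.country": ["France", "Japan"]},
)
print(example)
```

In this sketch only the `country` field is augmented, because `France` is the only stored value that occurs in the question. That mirrors the abstract's statement that a subset of the fields is augmented with cell values mentioned in the question. The resulting string would then be tokenized and passed to a BERT encoder.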
