MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing

Paper title

MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing

Authors

Longxu Dou, Yan Gao, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Jian-Guang Lou

Abstract

Text-to-SQL semantic parsing is an important NLP task that greatly facilitates interaction between users and databases, and it has become a key component of many human-computer interaction systems. Much recent progress in text-to-SQL has been driven by large-scale datasets, but most of them are centered on English. In this work, we present MultiSpider, the largest multilingual text-to-SQL dataset, which covers seven languages (English, German, French, Spanish, Japanese, Chinese, and Vietnamese). Using MultiSpider, we further identify the lexical and structural challenges of text-to-SQL (caused by specific language properties and dialect sayings) and their intensity across different languages. Experimental results under three typical settings (zero-shot, monolingual, and multilingual) reveal a 6.1% absolute drop in accuracy for non-English languages. Qualitative and quantitative analyses are conducted to understand the reasons for the performance drop in each language. Besides the dataset, we also propose a simple schema augmentation framework, SAVe (Schema-Augmentation-with-Verification), which significantly boosts overall performance by about 1.8% and closes the 29.5% performance gap across languages.
