Bertrand-DR: Improving Text-to-SQL using a Discriminative Re-ranker

Paper title

Bertrand-DR: Improving Text-to-SQL using a Discriminative Re-ranker

Authors

Kelkar, Amol; Relan, Rohan; Bhardwaj, Vaishali; Vaichal, Saurabh; Khatri, Chandra; Relan, Peter

Abstract

To access data stored in relational databases, users need to understand the database schema and write a query in a query language such as SQL. To simplify this task, text-to-SQL models attempt to translate a user's natural language question into the corresponding SQL query. Recently, several generative text-to-SQL models have been developed. We propose a novel discriminative re-ranker that improves the performance of generative text-to-SQL models by extracting the best SQL query from the beam output predicted by the text-to-SQL generator. This improves performance in the cases where the best query was in the candidate list but not at the top of the list. We build the re-ranker as a schema-agnostic, fine-tuned BERT classifier. We analyze the relative strengths of the text-to-SQL and re-ranker models across different query hardness levels, and suggest how to combine the two models for optimal performance. We demonstrate the effectiveness of the re-ranker by applying it to two state-of-the-art text-to-SQL models, and achieve a top-4 score on the Spider leaderboard at the time of writing this article.
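
The re-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `rerank`, the `threshold` parameter, and the keyword-based `toy_score` are hypothetical names introduced here for illustration. In the paper the scorer is a fine-tuned BERT classifier; here any callable that maps a (question, SQL) pair to a score can be plugged in. The margin rule for deciding when to override the generator is likewise only an assumption about how the two models might be combined.

```python
from typing import Callable, List, Sequence


def rerank(
    question: str,
    beam: Sequence[str],
    score: Callable[[str, str], float],
    threshold: float = 0.0,
) -> List[str]:
    """Reorder beam candidates using a discriminative scorer.

    The generator's ordering is kept unless some candidate's score beats
    the generator's top candidate by more than `threshold` (a hypothetical
    combination rule, not taken from the paper).
    """
    if not beam:
        return []
    scores = [score(question, sql) for sql in beam]
    # Index of the highest-scoring candidate; ties keep the earlier one.
    best = max(range(len(beam)), key=lambda i: scores[i])
    if best != 0 and scores[best] - scores[0] > threshold:
        # Promote the best candidate, keep the rest in generator order.
        return [beam[best]] + [c for i, c in enumerate(beam) if i != best]
    return list(beam)


def toy_score(question: str, sql: str) -> float:
    """Stand-in scorer (assumption): rewards agreement between a
    'how many' question and the presence of COUNT in the query."""
    asks_count = "how many" in question.lower()
    has_count = "count(" in sql.lower()
    return 1.0 if asks_count == has_count else 0.0


question = "How many singers are there?"
beam = ["SELECT name FROM singer", "SELECT count(*) FROM singer"]
reranked = rerank(question, beam, toy_score)
print(reranked[0])
```

With a real classifier, `score` would wrap a model call that returns the probability that the SQL candidate correctly answers the question. Raising `threshold` makes the re-ranker more conservative about overriding the generator's first choice.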
