The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models

Maamari, Karime; Abubaker, Fadhil; Jaroslawicz, Daniel; Mhedhbi, Amine

arXiv:2408.07702 (cs)

[Submitted on 14 Aug 2024 (v1), last revised 18 Aug 2024 (this version, v2)]

Abstract: Schema linking is a crucial step in Text-to-SQL pipelines. Its goal is to retrieve the relevant tables and columns of a target database for a user's query while disregarding irrelevant ones. However, imperfect schema linking can often exclude required columns needed for accurate query generation. In this work, we revisit schema linking when using the latest generation of large language models (LLMs). We find empirically that newer models are adept at utilizing relevant schema elements during generation even in the presence of large numbers of irrelevant ones. As such, our Text-to-SQL pipeline entirely forgoes schema linking in cases where the schema fits within the model's context window in order to minimize issues due to filtering out required schema elements. Furthermore, instead of filtering contextual information, we highlight techniques such as augmentation, selection, and correction, and adopt them to improve the accuracy of our Text-to-SQL pipeline. Our approach ranks first on the BIRD benchmark, achieving an accuracy of 71.83%.
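
The decision rule stated in the abstract (pass the full schema when it fits in the context window, and fall back to schema linking only otherwise) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Table` type, the `estimate_tokens` heuristic of roughly four characters per token, the `keyword_link` fallback, and the prompt layout are all hypothetical names and assumptions introduced here for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Table:
    name: str
    columns: list[str]


def render_schema(tables: list[Table]) -> str:
    """Render tables as simplified CREATE TABLE statements (illustrative format)."""
    return "\n".join(
        f"CREATE TABLE {t.name} ({', '.join(t.columns)});" for t in tables
    )


def estimate_tokens(text: str) -> int:
    """Crude token estimate. Assumption: about 4 characters per token."""
    return (len(text) + 3) // 4


def keyword_link(question: str, tables: list[Table]) -> list[Table]:
    """Hypothetical fallback linker: keep tables whose name or any column
    name appears as a word in the question."""
    words = set(re.findall(r"\w+", question.lower()))
    return [
        t for t in tables
        if t.name.lower() in words or any(c.lower() in words for c in t.columns)
    ]


def build_prompt(
    question: str,
    tables: list[Table],
    context_budget: int,
    link_schema: Callable[[str, list[Table]], list[Table]],
) -> tuple[str, bool]:
    """Return (prompt, used_linking). Schema linking is skipped whenever the
    full schema plus the question fits within the token budget."""
    full_schema = render_schema(tables)
    needed = estimate_tokens(full_schema) + estimate_tokens(question)
    if needed <= context_budget:
        schema_text, used_linking = full_schema, False
    else:
        schema_text = render_schema(link_schema(question, tables))
        used_linking = True
    prompt = f"{schema_text}\n\n-- Question: {question}\nSELECT"
    return prompt, used_linking


tables = [
    Table("singers", ["id", "name"]),
    Table("concerts", ["id", "venue"]),
]
question = "How many singers are there?"

# Large budget: the whole schema is passed through, irrelevant tables included.
prompt_full, linked_full = build_prompt(question, tables, 1000, keyword_link)

# Tiny budget: the sketch falls back to filtering the schema.
prompt_small, linked_small = build_prompt(question, tables, 5, keyword_link)
```

In the large-budget call the irrelevant `concerts` table stays in the prompt, which reflects the abstract's claim that newer models tolerate irrelevant schema elements. The small-budget call shows the risk the abstract warns about: a filter like `keyword_link` could just as easily drop a table the query needs.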

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2408.07702 [cs.CL]
	(or arXiv:2408.07702v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2408.07702

Submission history

From: Karime Maamari
[v1] Wed, 14 Aug 2024 17:59:04 UTC (939 KB)
[v2] Sun, 18 Aug 2024 19:06:04 UTC (1,116 KB)
