SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL

Jimin Lee, Ingeol Baek, Byeongjeong Kim, Hwanhee Lee

2025-02-18

Summary

This paper introduces SAFE-SQL, a method that helps large language models turn plain-language questions into SQL database queries. It's like teaching a computer to translate human questions into a language that databases can understand, even when no similar, already-solved questions are available to learn from.

What's the problem?

Current methods for turning text into SQL queries work well when they have similar examples to learn from, but they struggle when faced with new or very complex questions. It's like trying to solve a math problem without seeing any similar examples in your textbook.

What's the solution?

The researchers created SAFE-SQL, which works in three steps. Instead of relying on pre-existing examples, it first asks a large language model to write its own question-and-SQL examples that resemble the new question. Then it filters these generated examples through three relevance checks, keeping only the ones that are good and relevant. Finally, it places the surviving examples in the prompt to help the model answer the original question.
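The generate, filter, and answer steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt wording, the `Q:`/`SQL:` reply format, the `threshold` value, and the names `self_augmented_prompt`, `parse_examples`, `word_overlap`, and `fake_llm` are all hypothetical. In particular, the single `score` callable stands in for the paper's three relevance assessments, which this summary does not spell out.

```python
from typing import Callable, List, NamedTuple


class Example(NamedTuple):
    question: str
    sql: str


def parse_examples(raw: str) -> List[Example]:
    """Parse 'Q: ...' / 'SQL: ...' line pairs from the model's reply."""
    examples: List[Example] = []
    question = None
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("SQL:") and question is not None:
            examples.append(Example(question, line[4:].strip()))
            question = None
    return examples


def self_augmented_prompt(
    question: str,
    schema: str,
    llm: Callable[[str], str],                # any text-in, text-out model call
    score: Callable[[str, Example], float],   # placeholder relevance score in [0, 1]
    n: int = 4,
    threshold: float = 0.5,                   # hypothetical cut-off
) -> str:
    # Step 1: ask the model to write examples similar to the test question.
    raw = llm(
        f"Schema:\n{schema}\n"
        f"Write {n} question/SQL pairs similar to: {question}\n"
        "Format each pair as 'Q: ...' then 'SQL: ...'."
    )
    # Step 2: keep only the examples judged relevant enough.
    kept = [ex for ex in parse_examples(raw) if score(question, ex) >= threshold]
    # Step 3: build the final few-shot prompt from the surviving examples.
    shots = "\n".join(f"Q: {ex.question}\nSQL: {ex.sql}" for ex in kept)
    return f"Schema:\n{schema}\n{shots}\nQ: {question}\nSQL:"


def word_overlap(question: str, ex: Example) -> float:
    """Toy relevance score: Jaccard overlap of lower-cased words."""
    a, b = set(question.lower().split()), set(ex.question.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, returning a fixed reply."""
    return (
        "Q: How many singers are there?\n"
        "SQL: SELECT COUNT(*) FROM singer\n"
        "Q: List all stadium names\n"
        "SQL: SELECT name FROM stadium\n"
    )
```

Calling `self_augmented_prompt("How many concerts are there?", schema, fake_llm, word_overlap)` keeps the singer-counting example and drops the stadium one, because only the former shares enough words with the test question. A real system would replace `fake_llm` with an actual model call and `word_overlap` with stronger relevance checks.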

Why does it matter?

This matters because it could make AI systems much better at understanding and answering complex database questions, even in situations they haven't been specifically trained for. This could help businesses, researchers, and anyone who needs to get information from databases more easily, without needing to know how to write SQL code themselves.

Abstract

Text-to-SQL aims to convert natural language questions into executable SQL queries. While previous approaches, such as skeleton-masked selection, have demonstrated strong performance by retrieving similar training examples to guide large language models (LLMs), they struggle in real-world scenarios where such examples are unavailable. To overcome this limitation, we propose Self-Augmentation in-context learning with Fine-grained Example selection for Text-to-SQL (SAFE-SQL), a novel framework that improves SQL generation by generating and filtering self-augmented examples. SAFE-SQL first prompts an LLM to generate multiple Text-to-SQL examples relevant to the test input. Then SAFE-SQL filters these examples through three relevance assessments, constructing high-quality in-context learning examples. Using self-generated examples, SAFE-SQL surpasses previous zero-shot and few-shot Text-to-SQL frameworks, achieving higher execution accuracy. Notably, our approach provides additional performance gains in extra hard and unseen scenarios, where conventional methods often fail.
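Execution accuracy, the metric named in the abstract, is commonly understood as the share of predicted queries whose results match those of the reference query when both are run against the database. The sketch below illustrates that idea with Python's built-in `sqlite3` module. It is a simplified assumption rather than the paper's evaluation protocol: rows are compared as unordered multisets (so `ORDER BY` differences are ignored), and the function names `execution_match` and `execution_accuracy` are hypothetical.

```python
import sqlite3
from collections import Counter
from typing import Iterable, Tuple


def execution_match(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """True if both queries return the same rows, ignoring row order."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as wrong.
        return False
    gold = conn.execute(gold_sql).fetchall()
    return Counter(predicted) == Counter(gold)


def execution_accuracy(
    conn: sqlite3.Connection, pairs: Iterable[Tuple[str, str]]
) -> float:
    """Fraction of (predicted, gold) query pairs whose results match."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(execution_match(conn, p, g) for p, g in pairs) / len(pairs)
```

Because the comparison is on query results rather than query text, two differently written queries (for example `COUNT(*)` versus `COUNT(id)` on a non-null column) still count as a match.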
