Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

E3-Rewrite: Learning to Rewrite SQL for Executability, Equivalence, and Efficiency

Created by
  • Haebom

Authors

Dongjie Xu, Yue Cui, Weijie Shi, Qingzhi Ma, Hanghui Guo, Jiaming Li, Yao Zhao, Ruiyuan Zhang, Shimin Di, Jia Zhu, Kai Zheng, Jiajie Xu

Outline

This paper proposes E3-Rewrite, a framework that uses a large language model (LLM) to overcome the limitations of rule-based SQL query rewriting. Existing methods rely on a fixed set of rules, so they generalize poorly to new query patterns and complex queries and fail to capture effective rewriting strategies. E3-Rewrite builds the LLM's context from execution plans and retrieved demonstrations, then applies reinforcement learning with a reward function that targets executability, equivalence, and efficiency. A staged curriculum first emphasizes executability and equivalence and then gradually introduces efficiency, which keeps multi-objective learning stable.
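The staged reward described above can be illustrated with a small sketch. This is not the paper's reward function: the `RewriteOutcome` fields, the penalty values, the linear schedule, and the `warmup_steps` and `ramp_steps` parameters are all hypothetical, chosen only to show how an efficiency term might be phased in after executability (feasibility) and equivalence are learned.

```python
from dataclasses import dataclass


@dataclass
class RewriteOutcome:
    executable: bool  # the rewritten query ran without error
    equivalent: bool  # it returned the same results as the original
    speedup: float    # original_time / rewritten_time (1.0 means no change)


def efficiency_weight(step: int, warmup_steps: int, ramp_steps: int) -> float:
    """Hypothetical linear curriculum: 0 during warm-up, then ramps up to 1."""
    if step < warmup_steps:
        return 0.0
    return min(1.0, (step - warmup_steps) / ramp_steps)


def reward(outcome: RewriteOutcome, step: int,
           warmup_steps: int = 1000, ramp_steps: int = 1000) -> float:
    """Assumed reward shape: hard penalties first, efficiency bonus later."""
    if not outcome.executable:
        return -1.0  # assumed penalty for a query that fails to run
    if not outcome.equivalent:
        return -0.5  # assumed penalty for a changed result set
    # Clip the efficiency bonus so one very fast rewrite cannot dominate.
    bonus = max(-1.0, min(1.0, outcome.speedup - 1.0))
    return 1.0 + efficiency_weight(step, warmup_steps, ramp_steps) * bonus
```

Under these assumptions, a correct rewrite that is twice as fast earns the same reward as any other correct rewrite early in training, and the speedup only starts to count once the warm-up phase ends.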

Takeaways, Limitations

Takeaways:
Using an LLM overcomes the limitations of existing rule-based methods and enables more complex and efficient SQL query rewriting.
It generates executable, equivalent, and efficient queries by combining context built from execution plans and demonstrations with reinforcement learning driven by a reward function.
Across various SQL benchmarks, it reduces query execution time by up to 25.6% and yields up to 24.4% more equivalence-preserving rewrites than existing methods.
It is also effective on complex query patterns that existing methods could not optimize well.
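To make the first two objectives concrete, here is a minimal sketch of how a rewrite could be checked for executability and equivalence. It assumes a test-based approximation: both queries are run on one in-memory SQLite database and their results are compared as multisets, which ignores row order and can only refute equivalence, not prove it. The `orders` table, the two queries, and the helper names `run_query` and `check_rewrite` are illustrative and do not come from the paper.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the result rows as a multiset, or None if execution fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def check_rewrite(conn, original_sql, rewritten_sql):
    """Return (executable, equivalent) for a candidate rewrite."""
    expected = run_query(conn, original_sql)
    actual = run_query(conn, rewritten_sql)
    executable = actual is not None
    equivalent = executable and expected is not None and actual == expected
    return executable, equivalent


# Illustrative data: a tiny table standing in for a benchmark database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO orders VALUES (1, 10, 5.0), (2, 10, 7.5), (3, 20, 1.0);
""")

original = (
    "SELECT id FROM orders WHERE customer_id IN "
    "(SELECT customer_id FROM orders WHERE total > 6)"
)
rewritten = (
    "SELECT DISTINCT o.id FROM orders o "
    "JOIN orders p ON o.customer_id = p.customer_id WHERE p.total > 6"
)

print(check_rewrite(conn, original, rewritten))
```

A rewrite that references a missing table would come back as not executable, and one that drops the filter would run but fail the comparison. Measuring efficiency would additionally require timing both queries on realistic data, which this sketch leaves out.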
Limitations:
Performance depends on the underlying LLM, so the LLM's limitations carry over to E3-Rewrite.
Reward function design and optimization of the reinforcement learning process are critical and need further research.
The model may overfit to specific datasets or query patterns.
Performance can depend heavily on the quality of the execution plans and demonstration data.