Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.

Thinkquel: A Model Dedicated to Text-to-dbt Using Synthetic Data and a Span-Aware Objective

Created by
  • Haebom

Authors

Anni Li, Aria Attar, Paul Dong

Outline

Translating natural language requests into robust, production-ready data transformations remains challenging: accuracy depends on precise schema mapping and warehouse-specific SQL dialects, while the strongest supervision available during training (execution success and result matching) arrives only at the sequence level. At the same time, assembling large execution-validated corpora is costly, and token-level objectives do not align with these global signals, leading to unstable optimization and limited portability.

Thinkquel is a fine-tuned model for generating robust, portable, and execution-validated database queries. Its recipe combines TS-SQL, a synthetic data pipeline that uses dbt as a portable intermediate representation, with TS-GRPO (Token-Sequence GRPO), a span-aware reinforcement learning objective designed to bridge the gap between token-level training signals and sequence-level execution rewards when fine-tuning LLMs.

On the 500-example TS-SQL test set, Thinkquel (32B) with a two-stage SFT curriculum achieves a 93.2% execution success rate and a 61.8% match rate, improving over the base model by 67.2% (execution) and 44.4% (match). In experiments on Spider with a 14B model, TS-GRPO improves training stability and convergence speed compared to GRPO and GSPO.
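To make the span-aware objective concrete, below is a minimal, hypothetical sketch of how a sequence-level execution reward could be turned into a token-level signal in a GRPO-style update. The function, its tensor layout, and the uniform crediting of tokens inside marked spans are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def ts_grpo_style_loss(logprobs, old_logprobs, rewards, span_mask, eps=0.2):
    """Illustrative span-aware GRPO-style loss (not the paper's exact method).

    logprobs:     (G, T) per-token log-probs of G sampled queries, current policy
    old_logprobs: (G, T) per-token log-probs under the sampling policy
    rewards:      (G,)   sequence-level rewards, e.g. 1.0 if the query executed
                         and its result matched the reference, else 0.0
    span_mask:    (G, T) 1.0 for tokens inside credited spans, 0.0 elsewhere
    """
    # Group-relative advantage, as in GRPO: normalize rewards within the
    # sampled group, so no separate value network is needed (assumes G > 1).
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)      # (G,)

    # Broadcast the sequence-level advantage to tokens, letting the span
    # mask decide which tokens actually receive the credit.
    ratio = torch.exp(logprobs - old_logprobs)                     # (G, T)
    token_adv = adv[:, None] * span_mask                           # (G, T)

    # Standard PPO-style clipping on the importance ratio.
    unclipped = ratio * token_adv
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps) * token_adv
    per_token = torch.minimum(unclipped, clipped)

    # Average over credited tokens only, then over the group.
    denom = span_mask.sum(dim=1).clamp(min=1.0)
    return -(per_token.sum(dim=1) / denom).mean()
```

The key step is the span mask multiplying the group-normalized advantage: the reward stays sequence-level, but only tokens inside credited spans receive gradient, which is one plausible way a span-aware objective can bridge the two signal granularities.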

Takeaways, Limitations

Takeaways:
  • Thinkquel presents a novel approach to translating natural language requests into executable database queries.
  • The TS-SQL data pipeline and the TS-GRPO objective together improve the model's accuracy and training stability (a sketch of the validation step follows this list).
  • Experiments show Thinkquel substantially outperforming its base model on both execution success and result matching.
  • TS-GRPO also improves training stability and convergence speed on the Spider dataset.
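As a rough sketch of the execution-validation step a pipeline like TS-SQL needs, the snippet below keeps a synthetic (request, query) pair only if the candidate query runs and reproduces the reference result. DuckDB as an in-memory warehouse stand-in, and all function and parameter names, are assumptions for illustration, not the paper's implementation.

```python
import duckdb
import pandas as pd

def keep_example(candidate_sql: str, reference_sql: str,
                 seed_tables: dict[str, pd.DataFrame]) -> bool:
    """Execution-validate one synthetic example (illustrative sketch)."""
    con = duckdb.connect()          # in-memory database as a warehouse stand-in
    for name, df in seed_tables.items():
        con.register(name, df)      # expose seed data as queryable tables
    try:
        got = con.execute(candidate_sql).fetchall()
        want = con.execute(reference_sql).fetchall()
    except duckdb.Error:
        return False                # query failed to compile or execute: drop it
    # Order-insensitive result match, since row order is rarely specified.
    return sorted(map(str, got)) == sorted(map(str, want))
```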
Limitations:
  • Model performance may still depend on the particular database schema and SQL dialect.
  • Building a large-scale, execution-validated corpus remains costly.
  • Further research on the model's portability is needed.