Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

FinStat2SQL: A Text2SQL Pipeline for Financial Statement Analysis

Created by
  • Haebom

Authors

Quang Hung Nguyen, Phuong Anh Trinh, Phan Quoc Hung Mai, Tuan Phong Trinh

Outline

This paper presents FinStat2SQL, a lightweight Text2SQL pipeline for complex, domain-specific queries over financial statements. Designed for local accounting standards such as the Vietnamese Accounting Standards (VAS), FinStat2SQL combines large and small language models in a multi-agent setup that handles entity extraction, SQL generation, and self-correction. The authors build a domain-specific database and evaluate the models on a synthetic QA dataset. A fine-tuned 7B model achieves 61.33% accuracy with a response time under 4 seconds on consumer hardware, outperforming GPT-4o-mini. FinStat2SQL offers a scalable, cost-effective way to bring AI-based querying of financial data to Vietnamese enterprises.

Takeaways, Limitations

Takeaways:
Presenting an efficient and lightweight pipeline for solving text2sql challenges in the financial sector.
Providing domain-specific solutions that take into account local standards (VAS, etc.).
Improving performance through a multi-agent approach.
Achieving better performance and faster response times than GPT-4o-mini.
Expanding accessibility to AI-based financial analytics for Vietnamese enterprises.
Limitations:
Because the evaluation used a synthetic QA dataset, performance may degrade when the pipeline is applied to real environments.
The pipeline relies on a domain-specific database, so its scalability to other financial environments or standards still needs validation.
Further research is needed on the generalization performance of the model.
A 7B model may still be too large for some deployment environments.
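
The summary does not say how accuracy is scored on the synthetic QA dataset. One common choice for text-to-SQL work is execution accuracy, where a predicted query counts as correct if it returns the same rows as a reference query. The sketch below assumes that metric; the function `execution_accuracy`, the table `t`, and the sample queries are hypothetical and are not taken from the paper.

```python
import sqlite3


def execution_accuracy(conn: sqlite3.Connection, pairs: list[tuple[str, str]]) -> float:
    """Fraction of (predicted_sql, gold_sql) pairs whose result rows match."""
    correct = 0
    for predicted_sql, gold_sql in pairs:
        gold_rows = conn.execute(gold_sql).fetchall()
        try:
            predicted_rows = conn.execute(predicted_sql).fetchall()
        except sqlite3.Error:
            continue  # a query that fails to execute counts as wrong
        # Compare as unordered collections of rows; repr gives a total order
        # even when rows contain NULLs or mixed types.
        if sorted(predicted_rows, key=repr) == sorted(gold_rows, key=repr):
            correct += 1
    return correct / len(pairs)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (year INTEGER, value REAL)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", [(2022, 1000.0), (2023, 1200.0)])
    pairs = [
        ("SELECT value FROM t WHERE year = 2023", "SELECT value FROM t WHERE year = 2023"),
        ("SELECT value FROM t WHERE year = 2022", "SELECT value FROM t WHERE year = 2023"),
        ("SELECT amount FROM t", "SELECT value FROM t"),  # invalid column name
    ]
    print(f"{execution_accuracy(conn, pairs):.2%}")
```

Running the same scoring on questions written by real analysts, rather than synthetic ones, would be one way to check the first limitation listed above.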