Daily Arxiv

This page collects papers related to artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, simply cite the source.

TableMind: An Autonomous Programmatic Agent for Tool-Augmented Table Reasoning

Created by
  • Haebom

Authors

Chuang Jiang, Mingyue Cheng, Xiaoyu Tao, Qingyang Mao, Jie Ouyang, Qi Liu

Outline

This paper presents TableMind, a novel large language model (LLM)-driven agent for table reasoning. TableMind autonomously performs multi-step tool invocation, writes and executes data-analysis code in a secure sandbox environment for precise numerical reasoning, and adaptively adjusts its strategy through higher-order capabilities such as planning and self-reflection. Built on a powerful pre-trained language model, it adopts a two-stage fine-tuning paradigm: supervised fine-tuning on high-quality reasoning trajectories, followed by reinforcement fine-tuning to optimize multi-objective strategies. In particular, the authors propose Rank-Aware Policy Optimization (RAPO), which increases the update weight when the output probability of a high-quality trajectory falls below that of a low-quality one, thereby steering the model toward more accurate answers. Extensive experiments on several mainstream benchmarks show that TableMind outperforms competitive baselines, with substantial gains in both reasoning accuracy and computational precision.
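To make the agent loop concrete, the sketch below shows what multi-step tool invocation with sandboxed code execution might look like. It is a minimal illustration, not the paper's implementation: the `run_in_sandbox` helper, the `llm` callable, and the `ANSWER:` / ```python``` response conventions are all assumptions introduced here for clarity.

```python
# Illustrative sketch of a tool-augmented table-reasoning loop (not the paper's code).
# The model alternates between free-form reasoning and emitting Python code, which is
# executed in a separate interpreter process and whose output is fed back as an observation.
import subprocess
import sys
import tempfile

def run_in_sandbox(code: str, timeout: float = 10.0) -> str:
    """Execute generated analysis code in a separate interpreter process.

    A real deployment would add stronger isolation (containers, no network,
    resource limits); a subprocess with a timeout only sketches the idea.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, timeout=timeout
        )
        return result.stdout if result.returncode == 0 else result.stderr
    except subprocess.TimeoutExpired:
        return "ERROR: execution timed out"

def table_reasoning_loop(llm, question: str, table_text: str, max_steps: int = 5) -> str:
    """Drive a plan -> code -> observe -> reflect cycle until an answer appears.

    `llm` is any callable mapping a prompt string to a model response; responses are
    assumed to contain either a ```python ...``` block or a final "ANSWER: ..." line.
    """
    context = f"Question: {question}\nTable:\n{table_text}\n"
    for _ in range(max_steps):
        response = llm(context)
        if "ANSWER:" in response:
            return response.split("ANSWER:", 1)[1].strip()
        if "```python" in response:
            code = response.split("```python", 1)[1].split("```", 1)[0]
            observation = run_in_sandbox(code)
            context += f"\n{response}\nObservation:\n{observation}\n"
        else:
            context += f"\n{response}\n"  # pure planning / self-reflection step
    return "No answer produced within the step budget."
```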
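The RAPO idea can also be sketched as a weighting rule: when the policy assigns a high-quality trajectory lower probability than a low-quality one, the update weight on that pair grows with the size of the violation. The specific functional form below is an assumption for illustration, not the formula from the paper.

```python
# Illustrative rank-aware update weight (assumed form, not the paper's exact objective).
import torch

def rank_aware_weights(logp_high: torch.Tensor,
                       logp_low: torch.Tensor,
                       base_weight: float = 1.0,
                       scale: float = 1.0) -> torch.Tensor:
    """Per-pair weights for the loss on high-quality trajectories.

    logp_high / logp_low: sequence log-probabilities of the high- and low-quality
    reasoning trajectories under the current policy. The weight stays at
    `base_weight` when the ranking is already correct and increases with the
    margin by which it is violated.
    """
    violation = torch.clamp(logp_low - logp_high, min=0.0)  # > 0 only when mis-ranked
    return base_weight + scale * violation

# Toy usage: two preference pairs; the second is mis-ranked and gets a larger weight.
logp_high = torch.tensor([-12.0, -15.0])
logp_low = torch.tensor([-14.0, -13.0])
print(rank_aware_weights(logp_high, logp_low))  # tensor([1., 3.])
```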

Takeaways, Limitations

Takeaways:
  • Demonstrates improved accuracy for LLM-based table reasoning through autonomous multi-step tool invocation and safe code execution.
  • Enhances strategic adaptability through higher-order capabilities such as planning and self-reflection.
  • The RAPO technique effectively improves learning from high-quality reasoning trajectories.
  • Experimentally verifies gains in reasoning accuracy and computational precision over existing methods.
Limitations:
  • Generalization to other types of tabular data and to more complex reasoning tasks beyond the presented benchmarks still needs to be verified.
  • Although code execution is confined to a secure sandbox, additional safeguards may be required against the possibility of malicious code execution.
  • Further analysis is needed of the effectiveness and generalizability of the RAPO algorithm.
  • If trained on a dataset biased toward a specific domain, applicability to other domains may be limited.