Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

TASER: Table Agents for Schema-guided Extraction and Recommendation

Created by
  • Haebom

Authors

Nicole Cho, Kirsty Fielding, William Watson, Sumitra Ganesh, Manuela Veloso

Outline

Real-world financial documents report information on a wide range of financial instruments, but they present it in complex, multi-page tables. This paper proposes TASER (Table Agents for Schema-guided Extraction and Recommendation), an agent-based table extraction system that continuously learns to extract this unstructured data into a normalized format. TASER performs table detection, classification, extraction, and recommendation; a Recommender Agent proposes schema modifications that shape the final result. TASER outperforms Table Transformer by 10.1%. With larger batch sizes, the authors observed a 104.3% increase in actionable schema recommendations and a 9.8% increase in extracted assets. To train TASER, the authors manually labeled 22,584 pages and 3,213 tables covering $731.6 billion in assets. They have released the TASERTab dataset publicly, giving the research community access to real-world financial tables.
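To make the idea of schema-guided extraction with a recommendation step more concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the names `Schema`, `extract`, `recommend`, and `run_batch`, the header normalization rule, and the `min_count` threshold are all hypothetical, and the detection and classification stages are omitted. It only illustrates one way a batch of tables could be mapped onto a schema while columns that repeatedly fail to match are proposed as schema additions.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Schema:
    """Hypothetical target schema: a set of normalized field names."""
    fields: set = field(default_factory=set)


def normalize(header: str) -> str:
    """Assumed normalization rule: trim, lowercase, spaces to underscores."""
    return header.strip().lower().replace(" ", "_")


def extract(table, schema):
    """Keep cells whose column is in the schema; collect other column names."""
    rows, unmatched = [], []
    for raw_row in table:
        row = {}
        for header, value in raw_row.items():
            name = normalize(header)
            if name in schema.fields:
                row[name] = value
            else:
                unmatched.append(name)
        rows.append(row)
    return rows, unmatched


def recommend(unmatched_names, min_count):
    """Toy recommender: propose columns seen at least min_count times."""
    counts = Counter(unmatched_names)
    return {name for name, n in counts.items() if n >= min_count}


def run_batch(tables, schema, min_count=2):
    """Extract a batch, update the schema once, then re-extract."""
    unmatched_all = []
    for table in tables:
        _, unmatched = extract(table, schema)
        unmatched_all.extend(unmatched)
    schema.fields |= recommend(unmatched_all, min_count)
    return [extract(table, schema)[0] for table in tables]


# Made-up example data: two single-row tables sharing an unknown column.
tables = [
    [{"Security Name": "ACME Corp", "Market Value": "1,000", "Coupon": "5%"}],
    [{"Security Name": "Beta Inc", "Market Value": "2,500", "Coupon": "3%"}],
]
schema = Schema({"security_name", "market_value"})
results = run_batch(tables, schema)
```

In this toy version the recommender sees the unmatched columns of the whole batch at once, so a column must recur within a single batch to be proposed. That is only a property of this sketch and should not be read as an explanation of the batch-size results reported in the paper.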

Takeaways, Limitations

Takeaways:
The paper demonstrates that an agent-based, schema-guided extraction system is effective at understanding real-world financial documents.
It suggests that a continuous learning process can improve both extraction performance and the schema itself.
It supports further research by making the TASERTab dataset publicly available.
Limitations:
The paper does not explicitly discuss limitations, although the complexity and irregularity of real-world financial tables may still pose challenges.