Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs

Created by
  • Haebom

Author

Gaye Colakoglu, Gürkan Solmaz, Jonathan Fürst

Outline

This paper defines and explores a design space for information extraction (IE) from layout-rich documents using large language models (LLMs). It identifies three core challenges of layout-aware IE with LLMs: data structuring, model engagement, and output refinement, and investigates the corresponding subproblems and methods for input representation, chunking, prompting, LLM selection, and multimodal models. Using LayIE-LLM, a new open-source test suite for layout-aware IE, the authors benchmark various design choices against existing fine-tuned IE models. Results on two IE datasets show that LLMs need a tuned IE pipeline to reach competitive performance: the optimized configurations found with LayIE-LLM outperform a common baseline configuration using the same LLM by 13.3 and 37.5 F1 points, respectively. The paper also develops a one-factor-at-a-time (OFAT) method that approaches the optimum with a fraction (2.8%) of the computational effort of a full factorial search, underperforming the best full factorial result by only 0.8 and 1.8 F1 points, respectively. Overall, the results demonstrate that a properly configured general-purpose LLM matches the performance of specialized models and provides a cost-effective, fine-tuning-free alternative. The test suite is available at https://github.com/gayecolakoglu/LayIE-LLM .
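To make the search strategies concrete, here is a minimal sketch of full factorial search versus one-factor-at-a-time (OFAT) search over a pipeline design space. The factor names, option values, and the toy scoring function are illustrative assumptions, not the paper's actual configuration space; a real run would replace `evaluate` with executing the IE pipeline and measuring F1 on a development set.

```python
from itertools import product

# Hypothetical design space for a layout-aware IE pipeline
# (dimension names and options are illustrative only).
DESIGN_SPACE = {
    "input_representation": ["plain_text", "text_with_boxes", "markdown"],
    "chunking": ["none", "page", "sliding_window"],
    "prompting": ["zero_shot", "few_shot", "chain_of_thought"],
}

def evaluate(config):
    """Stand-in score; a real version would run the pipeline and return F1."""
    score = {"plain_text": 0.0, "text_with_boxes": 0.2, "markdown": 0.3}[config["input_representation"]]
    score += {"none": 0.0, "page": 0.1, "sliding_window": 0.2}[config["chunking"]]
    score += {"zero_shot": 0.0, "chain_of_thought": 0.15, "few_shot": 0.2}[config["prompting"]]
    return score

def full_factorial(space):
    """Evaluate every combination (cost grows exponentially with factors)."""
    keys = list(space)
    best = max(product(*space.values()),
               key=lambda vals: evaluate(dict(zip(keys, vals))))
    return dict(zip(keys, best))

def ofat(space, baseline):
    """One-factor-at-a-time: vary one factor, keep its best value, move on."""
    config = dict(baseline)
    for factor, options in space.items():
        config[factor] = max(options, key=lambda v: evaluate({**config, factor: v}))
    return config

baseline = {"input_representation": "plain_text",
            "chunking": "none",
            "prompting": "zero_shot"}
print(full_factorial(DESIGN_SPACE))  # 27 evaluations
print(ofat(DESIGN_SPACE, baseline))  # 9 evaluations
```

OFAT evaluates only the sum of option counts (9 runs here) instead of their product (27 runs), which is the source of the cost savings the paper reports; the trade-off is that OFAT can miss interactions between factors, hence the small gap to the full factorial optimum.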

Takeaways, Limitations

Takeaways:
An efficient methodology for information extraction from layout-rich documents: properly configured LLMs can match or exceed existing fine-tuned models.
A cost-effective alternative: information extraction with a general-purpose LLM, without fine-tuning.
Release of the open-source test suite LayIE-LLM: enables performance comparison and further research across LLMs and methodologies.
An efficient parameter search method (OFAT): approaches optimal performance at a fraction of the computational cost.
Limitations:
Limited datasets: generalizability is validated on only two datasets.
No optimality guarantee for the OFAT method: it slightly underperforms a full factorial search.
Dependence on LLM performance: results may change as underlying LLMs improve.