This is a page that curates AI-related papers published worldwide. All content here is summarized using Google Gemini and operated on a non-profit basis. Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.
Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs
Created by
Haebom
Author
Gaye Colakoglu, Gürkan Solmaz, Jonathan Fürst
Outline
This paper defines and explores a design space for information extraction (IE) from layout-rich documents using large language models (LLMs). The three core challenges of layout-aware IE with LLMs are data structuring, model engagement, and output refinement. We investigate subproblems and methods for input representation, chunking, prompting, LLM selection, and multimodal models. Using LayIE-LLM, a novel open-source layout-aware IE test suite, we benchmark the effectiveness of different design choices against existing fine-tuned IE models. Results on two IE datasets show that LLMs require careful pipeline tuning to achieve competitive performance: the optimized configurations found with LayIE-LLM outperform a common baseline configuration using the same LLM by 13.3 and 37.5 F1 points, respectively. We develop a one-factor-at-a-time (OFAT) method that approaches the optimal result while requiring a fraction (2.8%) of the computational effort, underperforming the best full factorial search by only 0.8 and 1.8 points, respectively. Overall, we demonstrate that a properly configured general-purpose LLM matches the performance of specialized models and provides a cost-effective, fine-tuning-free alternative. The test suite is available at https://github.com/gayecolakoglu/LayIE-LLM.
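The OFAT idea can be sketched in a few lines: instead of evaluating the full factorial grid of pipeline choices, tune one factor at a time while keeping the current best values for the others. The factor names and the toy `evaluate` function below are illustrative assumptions, not the paper's actual LayIE-LLM API.

```python
def ofat_search(design_space, evaluate, baseline):
    """One-factor-at-a-time search: sweep each factor once, greedily
    keeping the best value found before moving to the next factor."""
    best = dict(baseline)
    best_score = evaluate(best)
    for factor, options in design_space.items():
        for option in options:
            if option == best[factor]:
                continue  # baseline value already evaluated
            candidate = {**best, factor: option}
            score = evaluate(candidate)
            if score > best_score:
                best_score = score
                best[factor] = option
    return best, best_score


# Toy design space and scoring, standing in for a real IE pipeline evaluation.
design_space = {
    "input": ["plain", "layout"],
    "chunking": ["none", "page"],
    "prompt": ["zero-shot", "few-shot"],
}

def evaluate(cfg):
    # Hypothetical: each preferred choice independently adds to the score.
    return (4 * (cfg["input"] == "layout")
            + 2 * (cfg["chunking"] == "page")
            + 3 * (cfg["prompt"] == "few-shot"))

baseline = {"input": "plain", "chunking": "none", "prompt": "zero-shot"}
best, score = ofat_search(design_space, evaluate, baseline)
```

With three binary factors, OFAT needs 4 evaluations (baseline plus one alternative per factor) instead of the 8 a full factorial search would take; when factors interact, the greedy sweep can miss the global optimum, which matches the small gap (0.8 and 1.8 points) the paper reports.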
•
Takeaways:
◦
An efficient methodology for information extraction from layout-rich documents: a properly configured LLM can match or exceed existing fine-tuned models.
◦
A cost-effective alternative: information extraction with a general-purpose LLM, no fine-tuning required.
◦
Open-source test suite LayIE-LLM released: enables performance comparison and further research across LLMs and methodologies.
◦
An efficient parameter search method (OFAT): approaches optimal performance at a fraction (2.8%) of the computational cost.
•
Limitations:
◦
Limited datasets: generalizability is validated on only two datasets.
◦
OFAT does not guarantee optimality: it slightly underperforms the full factorial search.
◦
Dependency on the underlying LLM: results may shift as LLM capabilities improve.
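Among the design-space dimensions the paper explores, input representation and chunking are the most concrete. A minimal sketch of one plausible approach, assuming OCR output as (word, bounding-box) pairs; the serialization format and function names are illustrative, not the paper's exact scheme:

```python
def serialize_words(words, max_words_per_chunk=200):
    """Serialize OCR words with their bounding boxes into prompt-sized chunks.

    words: list of (text, (x0, y0, x1, y1)) tuples.
    Returns a list of newline-joined strings, one per chunk.
    """
    chunks = []
    for i in range(0, len(words), max_words_per_chunk):
        lines = [
            f"{text} [{x0},{y0},{x1},{y1}]"
            for text, (x0, y0, x1, y1) in words[i:i + max_words_per_chunk]
        ]
        chunks.append("\n".join(lines))
    return chunks


def build_prompt(chunk, fields):
    """Wrap one serialized chunk in an extraction instruction."""
    return (
        "Extract the following fields from the document words below.\n"
        f"Fields: {', '.join(fields)}\n"
        "Each word is followed by its [x0,y0,x1,y1] bounding box.\n\n"
        + chunk
        + "\n\nReturn a JSON object mapping each field to its value."
    )
```

Chunking keeps each prompt within the model's context budget, while the inline bounding boxes give a text-only LLM access to layout cues that a multimodal model would read directly from the page image.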