Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Automatic Demonstration Selection for LLM-based Tabular Data Classification

Created by
  • Haebom

Author

Shuchu Han, Wolfgang Bruckner

Outline

This paper presents an algorithm that automatically determines the optimal number of demonstrations for In-Context Learning (ICL) in tabular data classification. Unlike existing random selection approaches, the method accounts for the distribution of the tabular data, the user-selected prompt template, and the specific Large Language Model (LLM). Drawing on Spectral Graph Theory, the authors define a new metric to quantify the similarity between demonstrations, construct a similarity graph, and analyze the eigenvalues of its Laplacian to derive the minimum number of demonstrations that can represent the data in the LLM's internal representation space. Experiments on various datasets and LLMs verify the effectiveness of the proposed method.
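The spectral step can be sketched as follows. This is a minimal illustration of the general idea, not the authors' exact algorithm: it assumes demonstrations are already embedded as vectors, uses an RBF kernel as the similarity metric (an assumed choice; the paper defines its own metric), and picks the demonstration count with the standard eigengap heuristic on the normalized Laplacian spectrum.

```python
import numpy as np

def min_demo_count(embeddings, sigma=1.0):
    """Estimate a representative demonstration count via the Laplacian eigengap.

    Sketch only: RBF similarity and the eigengap heuristic stand in for the
    paper's own similarity metric and selection rule.
    """
    X = np.asarray(embeddings, dtype=float)
    # Pairwise squared distances -> RBF similarity graph (no self-loops).
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    evals = np.sort(np.linalg.eigvalsh(L))
    # Eigengap heuristic: k small eigenvalues ~ k well-separated groups,
    # so take k at the largest jump in the sorted spectrum.
    gaps = np.diff(evals)
    return int(np.argmax(gaps)) + 1 if len(gaps) else 1
```

For well-separated groups of near-duplicate demonstrations, the count matches the number of groups, i.e. roughly one representative demonstration per distinct region of the data.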

Takeaways, Limitations

Takeaways:
Presents an efficient method for automatically selecting the optimal number of demonstrations in ICL for tabular data classification.
Enables more accurate estimation of the demonstration count by jointly accounting for the data distribution, the prompt template, and the LLM.
Introduces a novel similarity metric and analysis method based on Spectral Graph Theory.
Limitations:
The computational complexity of the proposed algorithm is not analyzed.
Generalization to other types of tabular data and other LLMs requires further validation.
The dependence on specific prompt templates, and applicability to other templates, needs further study.
👍