Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Robustness is Important: Limitations of LLMs for Data Fitting

Created by
  • Haebom

Author

Hejia Liu, Mochen Yang, Gediminas Adomavicius

Outline

This paper examines the vulnerabilities of large language models (LLMs) when used for data fitting and prediction. While LLMs achieve competitive predictive performance across a variety of tasks, the authors find that their predictions are vulnerable to task-irrelevant changes in data representation, such as variable renaming. This phenomenon occurs under both in-context learning and supervised fine-tuning, and in both closed-weight and open-weight LLMs. An analysis of the attention mechanism in open-weight LLMs reveals that they over-focus on tokens at specific positions. Even state-of-the-art models such as TabPFN, which are trained specifically for data fitting, are not immune to these vulnerabilities. The authors conclude that current LLMs lack even the basic level of robustness required to serve as a principled data fitting tool.
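To make the kind of perturbation concrete, here is a minimal sketch (not the paper's actual setup; the serialization format and feature names are illustrative assumptions) of how the same tabular examples can be rendered into two in-context learning prompts that differ only in variable names. A robust model should produce the same prediction for both.

```python
def serialize_example(row, feature_names, label=None):
    """Render one tabular row as a single line of an ICL prompt."""
    parts = [f"{name}={value}" for name, value in zip(feature_names, row)]
    line = ", ".join(parts)
    return line + (f" -> label={label}" if label is not None else " -> label=?")

def build_prompt(X_train, y_train, x_query, feature_names):
    """Stack labeled examples, then append the unlabeled query row."""
    lines = [serialize_example(x, feature_names, y) for x, y in zip(X_train, y_train)]
    lines.append(serialize_example(x_query, feature_names))
    return "\n".join(lines)

# Toy data (hypothetical, for illustration only).
X_train = [[5.1, 3.5], [6.7, 3.1]]
y_train = ["A", "B"]
x_query = [5.0, 3.4]

# Same data, two representations: meaningful names vs. arbitrary renames.
original = build_prompt(X_train, y_train, x_query, ["sepal_length", "sepal_width"])
renamed = build_prompt(X_train, y_train, x_query, ["var_1", "var_2"])

print(original)
print(renamed)
```

The two prompts encode identical numerical information; only the feature names differ. The paper's finding is that LLM predictions can change substantially between such semantically equivalent prompts.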

Takeaways, Limitations

Takeaways: Highlights that LLMs used for data fitting are vulnerable to changes in data representation that are irrelevant to the task. This suggests further research is needed to address this vulnerability and build confidence in LLMs' predictive performance, and motivates new LLM designs and training methods that account for robustness to data representation.
Limitations: The study is based on experimental results for specific LLMs and datasets, so further research is needed to determine whether the findings generalize to all LLMs and settings. It does not fully explain the root causes of the vulnerability to task-irrelevant changes, and it does not propose concrete solutions for mitigating it.