Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized by Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Large Language Models Do Not Simulate Human Psychology

Created by
  • Haebom

Author

Sarah Schröder, Thekla Morgenroth, Ulrike Kuhl, Valerie Vaquet, Benjamin Paaßen

Outline

This paper critically examines the claim that large language models (LLMs), such as ChatGPT, can replace human participants in psychological research. We present a conceptual argument against the hypothesis that LLMs simulate human psychology, and support it empirically by showing that subtle semantic changes to test items produce discrepancies between LLM and human responses. Specifically, we demonstrate that several LLMs, including the Centaur model fine-tuned on psychological data, respond unreliably to novel items. We therefore conclude that while LLMs are useful tools, they do not simulate human psychology and must be validated against human responses in any new application.
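To make the recommended validation step concrete, below is a minimal sketch (assuming Python 3.10+ for statistics.correlation) of how one might probe an LLM with original and paraphrased questionnaire items and compare its ratings against human data. query_llm, the items, and the human scores are illustrative placeholders, not the paper's actual method or data.

```python
# Minimal sketch of the validation workflow the paper calls for:
# (1) check whether LLM ratings track human ratings, and
# (2) check whether meaning-preserving rewordings shift the LLM.
from statistics import correlation  # Pearson's r, Python 3.10+

def query_llm(item: str) -> float:
    """Hypothetical stand-in for a real model API call.
    Returns a fake 1-7 Likert rating derived from the text so the
    sketch runs end to end; replace with an actual LLM query."""
    return float(1 + len(item) % 7)

# Paired original / paraphrased items (illustrative examples).
items = [
    ("I enjoy meeting new people.", "Meeting new people is something I enjoy."),
    ("I often feel anxious.",       "Anxiety is a frequent feeling for me."),
]

# Mean human ratings for the same items (placeholder numbers).
human_scores = [5.8, 3.2]

llm_original   = [query_llm(orig) for orig, _ in items]
llm_paraphrase = [query_llm(para) for _, para in items]

# 1) Validity check: do LLM ratings correlate with human ratings?
print("LLM-human correlation:", correlation(llm_original, human_scores))

# 2) Robustness check: does a paraphrase shift the LLM's rating?
for (orig, _), a, b in zip(items, llm_original, llm_paraphrase):
    print(f"{orig!r}: original={a:.1f}, paraphrase={b:.1f}, drift={abs(a - b):.1f}")
```

In practice, the paraphrases should preserve the item's meaning, so that any rating drift reflects model unreliability rather than a genuine change in the question.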

Takeaways, Limitations

Takeaways: By empirically demonstrating that LLMs do not accurately simulate human psychology, we urge caution in using LLMs as substitutes for human participants in psychological research. We emphasize the importance of validating LLM responses against human data in every new application.
Limitations: This study presents results based on a specific set of LLMs and a limited dataset, so caution is needed when generalizing to other models or broader datasets. Given the rapid pace of LLM development, further research is needed to determine whether these conclusions hold for future models.