Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Large Language Models Do Not Simulate Human Psychology

Created by
  • Haebom

Authors

Sarah Schröder, Thekla Morgenroth, Ulrike Kuhl, Valerie Vaquet, Benjamin Paaßen

Outline

This paper critically examines the claim that large language models (LLMs), such as ChatGPT, can replace human participants in psychological research. It presents conceptual arguments against the assumption that LLMs simulate human psychology and supports them with empirical evidence from several LLMs, including CENTAUR, a model specifically fine-tuned on psychological responses. The authors show that LLM and human responses diverge sharply when subtle wording changes produce large semantic shifts, and that different LLMs answer novel items in very different ways. They conclude that LLMs do not simulate human psychology, and that psychological researchers should treat them as useful but fundamentally unreliable tools whose output must be validated against human responses in every new application.
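
To make the wording-sensitivity probe concrete, here is a minimal, hypothetical sketch in Python. The `query_llm` helper and the example items are placeholders for illustration, not the paper's actual materials or code:

```python
# Minimal sketch of the reworded-item probe described above. `query_llm`
# is a hypothetical stand-in for a real chat-completion call; here it is
# stubbed so the script runs, and the items are illustrative only.

def query_llm(prompt: str) -> int:
    """Stub: replace with a real LLM call that returns a 1-5 rating."""
    return 4  # a fixed answer mimics the insensitivity the paper reports

original = "I enjoy spending time with other people."
reworded = "I enjoy spending time away from other people."  # meaning flipped

r_orig = query_llm(f"Rate your agreement from 1 to 5: {original}")
r_new = query_llm(f"Rate your agreement from 1 to 5: {reworded}")

# A human who agrees with the original item should tend to disagree with
# the reworded one; the paper finds LLM ratings often barely shift.
print(f"original: {r_orig}, reworded: {r_new}, shift: {r_orig - r_new}")
```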

Takeaways, Limitations

Takeaways: While LLMs can be useful tools for psychological research, they cannot fully replace human participants. LLM results should always be compared against and validated with human responses (as sketched below), and researchers should recognize the limits of LLMs and exercise caution in study design and interpretation.
Limitations: The study's generalizability may be limited by the specific LLMs and datasets examined; further research across more types of psychological research and more models is needed. Given the rapid pace of LLM development, it is also uncertain whether these conclusions will hold in the long term.
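
As a concrete illustration of the validation step recommended in the takeaways, the following sketch compares per-item LLM averages against human norms with a simple correlation. The numbers are illustrative placeholders, not data from the study:

```python
# Minimal sketch of the recommended validation step: compare LLM-generated
# item responses against human norms before trusting them in a new setting.
import numpy as np
from scipy.stats import pearsonr

human_means = np.array([4.1, 2.3, 3.8, 1.9, 4.5])  # per-item human averages
llm_means = np.array([3.9, 2.8, 3.6, 3.2, 4.4])    # per-item LLM averages

r, p = pearsonr(human_means, llm_means)
print(f"human-LLM correlation: r = {r:.2f} (p = {p:.3f})")

# A high r on one item set does not transfer: the paper finds agreement
# breaks down on novel items, so this check must be rerun per application.
```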