Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Established Psychometric vs. Ecologically Valid Questionnaires: Rethinking Psychological Assessments in Large Language Models

Created by
  • Haebom

Authors

Dongmin Choi, Woojung Song, Jongwook Han, Eun-Ju Lee, Yohan Jo

Outline

This paper critically examines prior studies that apply established psychometric questionnaires, such as the Big Five Inventory (BFI) and the Portrait Values Questionnaire (PVQ), to measure personality traits and values in large language models (LLMs). It argues that these questionnaires lack ecological validity and compares the results they produce with those from ecologically valid questionnaires. The analysis shows that established questionnaires (1) produce LLM profiles that are inconsistent with the psychological traits expressed in the context of user queries, (2) contain too few items for reliable measurement, (3) create the misleading impression that LLMs possess stable constructs, and (4) yield exaggerated profiles for persona-prompted LLMs. The authors therefore urge caution in applying established psychological questionnaires to LLMs.

Takeaways, Limitations

Takeaways: This paper identifies the limitations of applying established psychological questionnaires to LLMs and emphasizes the need to develop ecologically valid ones. It also points out weaknesses in current methods for measuring personality traits and values in LLMs and suggests directions for developing more appropriate assessment methods.
Limitations: The ecologically valid questionnaire presented in this study and its validation process are not described in sufficient detail. Further research is needed to determine whether the findings generalize to different types of LLMs. Clear criteria for defining and measuring an "ecologically valid questionnaire" are also needed.