Daily Arxiv

This page curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Mind the Gap: The Divergence Between Human and LLM-Generated Tasks

Posted by
  • Haebom

Authors

Yi-Long Lu, Jiajun Song, Chunhui Zhang, Wei Wang

Outline

This paper reports a task-generation experiment comparing humans with GPT-4o to investigate whether generative agents built on large language models (LLMs) generate tasks in a human-like manner. The results show that human task generation is consistently influenced by personal values such as openness to experience and by psychological drivers such as cognitive style, whereas the LLM fails to reflect these behavioral patterns even when it is explicitly given the same psychological drivers. LLM-generated tasks were less social, less physically demanding, and more focused on abstract topics. Although the LLM-generated tasks were rated as more engaging and novel, this contrast points to a gap between LLMs' linguistic abilities and their ability to generate human-like, concrete goals. The authors conclude that there is a fundamental difference between the value-driven, concrete nature of human cognition and the statistical patterns of LLMs, and that designing more human-centric agents requires integrating intrinsic motivation and physical grounding.

Takeaways, Limitations

Takeaways:
Human task generation is significantly influenced by personal values and cognitive style.
Compared with human-generated tasks, LLM-generated tasks are less social, less physically demanding, and more abstract.
There is a gap between LLMs' linguistic abilities and their ability to generate human-like goals.
Integrating intrinsic motivation and physical grounding is essential for developing human-centered agents.
Limitations:
The experiments used only one LLM (GPT-4o), which limits generalizability.
The LLM's performance may not have been assessed properly because of limitations in how the psychological drivers were provided to it.
The reliability of the results needs further review because too little information is given about the number and diversity of study participants.