Daily Arxiv

This page organizes papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please cite the source when sharing.

Assessing Consciousness-Related Behaviors in Large Language Models Using the Maze Test

Created by
  • Haebom

Authors

Rui A. Pimenta, Tim Schlippe, Kristina Schaaff

Outline

This paper investigates consciousness-related behavior in large language models (LLMs) using the Maze Test. The Maze Test challenges a model to navigate a maze from a first-person perspective, probing several features associated with consciousness at once, including spatial awareness, perspective-taking, goal-directed behavior, and temporal sequencing. Twelve leading LLMs were evaluated under zero-shot, one-shot, and few-shot scenarios, with the evaluation covering 13 consciousness-related features. The results show that LLMs with reasoning capabilities consistently outperform their standard counterparts, with Gemini 2.0 Pro achieving 52.9% complete path accuracy and DeepSeek-R1 achieving 80.5% partial path accuracy. The gap between partial and complete path accuracy suggests that LLMs struggle to maintain a consistent self-model throughout the solution process, a capacity considered fundamental to consciousness. While reasoning mechanisms improve consciousness-related behaviors, LLMs still lack the integrated, sustained self-awareness characteristic of consciousness.
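The two headline metrics, complete path accuracy and partial path accuracy, can be read as exact-match versus prefix-credit scoring over the model's move sequence. Below is a minimal sketch of how such metrics could be computed; the move encoding and the partial-credit definition are assumptions for illustration, not the authors' exact protocol.

```python
# Hypothetical sketch of the two path-accuracy metrics described above.
# The maze/move representation and scoring details are assumptions;
# the paper's actual evaluation protocol may differ.

def complete_path_accuracy(predicted_paths, reference_paths):
    """Fraction of mazes where the model's full move sequence matches the solution exactly."""
    exact = sum(pred == ref for pred, ref in zip(predicted_paths, reference_paths))
    return exact / len(reference_paths)

def partial_path_accuracy(predicted_paths, reference_paths):
    """Average fraction of correct moves before the first deviation from the solution."""
    scores = []
    for pred, ref in zip(predicted_paths, reference_paths):
        correct = 0
        for p_move, r_move in zip(pred, ref):
            if p_move != r_move:
                break
            correct += 1
        scores.append(correct / len(ref))
    return sum(scores) / len(scores)

# Example: each path is a sequence of first-person moves.
preds = [["forward", "left", "forward"], ["forward", "right"]]
refs  = [["forward", "left", "forward"], ["forward", "left", "forward"]]
print(complete_path_accuracy(preds, refs))  # 0.5
print(partial_path_accuracy(preds, refs))   # (1.0 + 1/3) / 2 ≈ 0.67
```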

Takeaways, Limitations

Takeaways:
  • LLMs with reasoning abilities performed better on the Maze Test, suggesting a link between reasoning ability and consciousness-related behavior.
  • The Maze Test is introduced as a new benchmark for assessing consciousness-related behavior in LLMs.
  • Some LLMs, such as Gemini 2.0 Pro and DeepSeek-R1, show substantial maze-navigation capability.
Limitations:
  • Performance on the Maze Test does not indicate true consciousness; it shows only that reasoning ability can mimic consciousness-related behavior.
  • The difficulty LLMs have in maintaining a consistent self-model throughout the problem-solving process points to a lack of genuine self-awareness.
  • The Maze Test may not comprehensively assess all aspects of consciousness.