This is a page that curates AI-related papers published worldwide. All content here is summarized using Google Gemini and operated on a non-profit basis. Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.
Towards Understanding the Cognitive Habits of Large Reasoning Models
Created by
Haebom
Author
Jianshuo Dong, Yujia Fu, Chuanrui Hu, Chao Zhang, Han Qiu
Outline
This paper presents a method for interpreting and monitoring the behavior of large reasoning models (LRMs) by exploiting the way they autonomously generate a chain of thought (CoT) before producing a final response. Motivated by the observation that certain CoT patterns (e.g., "Wait, did I miss something?") consistently appear across a variety of tasks, the authors investigate whether LRMs exhibit human-like cognitive habits. Building on Habits of Mind, a well-established framework of cognitive habits associated with successful human problem solving, they propose CogTest, a principled benchmark for assessing the cognitive habits of LRMs. CogTest covers 16 cognitive habits, each instantiated in 25 diverse tasks, and employs evidence-first extraction to ensure reliable habit identification. Using CogTest, the authors comprehensively evaluate 16 widely used LLMs (13 LRMs and 3 non-reasoning models). The results show that, unlike conventional LLMs, LRMs not only exhibit human-like habits but also deploy them adaptively across tasks. Detailed analysis reveals patterns of similarity and difference in the cognitive habit profiles of LRMs, particularly within specific model families (e.g., Qwen-3 and DeepSeek-R1). Extending the study to safety-related tasks, the authors observe that certain habits, such as responsible risk-taking, are strongly associated with the generation of harmful responses. These findings suggest that studying persistent behavioral patterns in LRMs' CoTs is a valuable step toward a deeper understanding of LLM misbehavior. The code is available at https://github.com/jianshuod/CogTest .
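As a rough illustration of the habit-identification idea (not the paper's actual method — CogTest uses a more principled evidence-first extraction), one could imagine tallying cue phrases for cognitive habits in a CoT trace. The habit names and cue patterns below are hypothetical examples, not the paper's 16-habit taxonomy:

```python
import re
from collections import Counter

# Hypothetical cue phrases per habit; the real CogTest taxonomy and
# extraction pipeline are far richer than this keyword sketch.
HABIT_CUES = {
    "metacognition": [r"\bwait\b", r"did i miss", r"let me reconsider"],
    "checking_accuracy": [r"let me verify", r"double-check"],
    "persisting": [r"try another approach", r"try again"],
}

def tally_habits(cot_text: str) -> Counter:
    """Count naive keyword hits per habit in a chain-of-thought trace."""
    text = cot_text.lower()
    counts = Counter()
    for habit, patterns in HABIT_CUES.items():
        for pat in patterns:
            counts[habit] += len(re.findall(pat, text))
    return counts

trace = "Wait, did I miss something? Let me verify the arithmetic first."
print(tally_habits(trace))  # metacognition: 2, checking_accuracy: 1
```

A keyword tally like this would conflate surface phrasing with genuine habits, which is precisely why the paper grounds identification in extracted evidence rather than pattern matching.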
We demonstrate that LRMs exhibit cognitive habits similar to those of humans and adaptively utilize them depending on the task.
◦ We found similarities in the cognitive habit profiles of LRMs within specific model families (e.g., Qwen-3 and DeepSeek-R1).
◦ The finding that specific habits (e.g., responsible risk-taking) are associated with the generation of harmful responses contributes to understanding LLM misbehavior.
◦ We provide a foundation for assessing the cognitive habits of LLMs by introducing a new benchmark called CogTest.
• Limitations:
◦ Since CogTest covers 16 cognitive habits and 25 tasks, additional habits and tasks may be needed for a comprehensive assessment.
◦ Given the limited number and variety of LLMs evaluated, further research on more diverse models is needed.
◦ Further research is needed on the precise correspondence between human cognitive habits and LRM cognitive habits.
◦ The definition of, and the criteria for judging, a "harmful response" may lack clarity.