Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Hierarchical Reasoning Model

Created by
  • Haebom

Author

Guan Wang, Jin Li, Yuhao Sun, Xing Chen, Changling Liu, Yue Wu, Meng Lu, Sen Song, Yasin Abbasi Yadkori

Outline

In this paper, the authors propose the Hierarchical Reasoning Model (HRM) to address a core difficulty in artificial intelligence: reasoning over the design and execution of complex, goal-oriented action sequences. Inspired by the hierarchical, multi-timescale processing of the human brain, HRM aims to overcome the brittle task decomposition, massive data requirements, and high latency of the chain-of-thought (CoT) technique used in existing large language models (LLMs). HRM performs sequential reasoning in a single forward pass through two interdependent recurrent modules: a high-level module responsible for slow, abstract planning and a low-level module that handles fast, detailed computations. Using only 27 million parameters and 1,000 training samples, and without explicit supervision of intermediate steps, it achieves outstanding performance on complex reasoning tasks such as solving hard Sudoku puzzles and finding optimal paths in large mazes. It also outperforms much larger models with much longer context windows on the Abstraction and Reasoning Corpus (ARC), a key benchmark for measuring artificial general intelligence capabilities, without pretraining or CoT data.
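The two-module recurrence described above can be sketched in a toy form: a fast low-level loop runs several steps per update of a slow high-level state, all within one forward pass. This is a minimal illustrative sketch, not the paper's actual architecture; the function names, update rule, and cycle counts (`n_cycles`, `t_low_steps`) are hypothetical simplifications.

```python
import math

def tanh_vec(v):
    """Elementwise tanh nonlinearity over a plain Python list."""
    return [math.tanh(x) for x in v]

def recurrent_step(state, inp, w=0.5):
    """Toy recurrent update: new_state = tanh(w * (state + input)).
    Stands in for a learned recurrent module (hypothetical)."""
    return tanh_vec([w * (s + i) for s, i in zip(state, inp)])

def hrm_forward(x, n_cycles=3, t_low_steps=4):
    """One forward pass of the toy hierarchy: the low-level state is
    updated t_low_steps times (fast timescale) for every single update
    of the high-level state (slow timescale)."""
    dim = len(x)
    z_high = [0.0] * dim  # slow, abstract-planning state
    z_low = [0.0] * dim   # fast, detailed-computation state
    for _ in range(n_cycles):
        for _ in range(t_low_steps):
            # the low-level module conditions on input plus the current plan
            conditioned = [xi + zh for xi, zh in zip(x, z_high)]
            z_low = recurrent_step(z_low, conditioned)
        # the high-level module updates once per cycle from the low-level result
        z_high = recurrent_step(z_high, z_low)
    return z_high
```

The key property the sketch preserves is the timescale separation: the high-level state changes only `n_cycles` times while the low-level state changes `n_cycles * t_low_steps` times, mirroring the slow-planner / fast-worker split the summary describes.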

Takeaways, Limitations

Takeaways:
Hierarchical, multi-timescale processing greatly improves the efficiency and stability of reasoning.
It achieves outstanding performance on complex reasoning tasks with few parameters and little training data.
It performs well even without pretraining or CoT data.
It outperforms existing large-scale models on benchmarks such as ARC.
It demonstrates potential as a step toward general-purpose computation and reasoning systems.
Limitations:
Further validation of the model's generalization performance is needed.
Evaluation on more complex and diverse reasoning tasks is needed.
A deeper analysis of the model's hierarchical structure and module interactions is needed.
The possibility of overfitting to certain problem types should be examined.