
Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content on this page is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

HyperCLOVA X THINK Technical Report

Created by
  • Haebom

Author

NAVER Cloud HyperCLOVA X Team

Outline

HyperCLOVA X THINK is the first reasoning-focused large language model in the HyperCLOVA X family, pre-trained on approximately 6 trillion high-quality Korean and English tokens augmented with targeted synthetic Korean data. It is implemented as a compute-memory-balanced Peri-LN Transformer scaled with μP, pre-trained with a three-stage curriculum that extends the context window to 128K tokens, and post-trained with supervised fine-tuning followed by reinforcement learning from verifiable rewards (RLVR). It supports both a detailed-rationale mode and a concise-answer mode, and shows competitive performance against similarly sized models on Korean-centric benchmarks such as KMMLU, CSAT, KoBALT-700, HAERAE-1.0, and KoBigBench. It also maintains good bilingual consistency and translation quality, and the vision-augmented variant performs on par with or better than GPT-4.1 on the KCSAT STEM benchmark. It achieves this with substantially less training compute than existing models of similar size, and the report also outlines pruning and distillation techniques intended for an open-source, business-friendly base model.
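For readers unfamiliar with the Peri-LN block structure mentioned above, below is a minimal PyTorch sketch of a Peri-LN Transformer block. This is an illustrative assumption based on the general Peri-LN design (layer normalization applied at both the input and the output of each sub-layer), not code from the HyperCLOVA X THINK report; the class name PeriLNBlock and all hyperparameters are hypothetical.

```python
# Minimal sketch of a Peri-LN Transformer block (illustrative only; not the
# actual HyperCLOVA X THINK implementation). Peri-LN normalizes both the input
# and the output of each sub-layer before the residual addition, in contrast to
# Pre-LN (input-side norm only) and Post-LN (norm after the residual add).
import torch
import torch.nn as nn

class PeriLNBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int, d_ff: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        # Input- and output-side norms for each sub-layer ("peripheral" LN).
        self.attn_ln_in = nn.LayerNorm(d_model)
        self.attn_ln_out = nn.LayerNorm(d_model)
        self.mlp_ln_in = nn.LayerNorm(d_model)
        self.mlp_ln_out = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Attention sub-layer: x + LN(Attn(LN(x)))
        h = self.attn_ln_in(x)
        h, _ = self.attn(h, h, h, need_weights=False)
        x = x + self.attn_ln_out(h)
        # Feed-forward sub-layer: x + LN(MLP(LN(x)))
        h = self.mlp_ln_in(x)
        x = x + self.mlp_ln_out(self.mlp(h))
        return x

# Quick smoke test (hypothetical dimensions):
# block = PeriLNBlock(d_model=512, n_heads=8, d_ff=2048)
# y = block(torch.randn(2, 16, 512))   # -> shape (2, 16, 512)
```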

Takeaways, Limitations

Takeaways:
A successful case study of developing a large-scale Korean language model focused on reasoning capabilities.
Achieves competitive performance with a lower training compute budget than existing models of similar size.
Strong performance on Korean-centric benchmarks.
The vision-augmented variant is competitive on STEM benchmarks.
An open-source, business-friendly model is planned.
Provides a strong foundation model for Korean AI innovation.
Limitations:
The model is not yet open-sourced (planned for the future).
Details of the specific pruning and distillation techniques are limited.
The use of synthetic data is not explained in detail.
Performance on languages other than Korean and English is not evaluated.