Daily Arxiv

This page curates papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and institutions; when sharing, simply cite the source.

ENIGMA: The Geometry of Reasoning and Alignment in Large-Language Models

Created by
  • Haebom

Author

Gareth Seneque, Lap-Hang Ho, Nafise Erfanian Saeedi, Jeffrey Molendijk, Ariel Kuperman, Tim Elson

Outline

ENIGMA is a novel approach to training large language models (LLMs) that jointly improves reasoning, alignment, and robustness by treating an organization's policies/principles as directions of movement on the model's information manifold. Its single-loop trainer combines Group-Relative Policy Optimization (GRPO), an on-policy, critic-free RL method using chain-of-thought (CoT) format rewards only; a SAMI-style symmetric InfoNCE auxiliary objective; and an entropic Sinkhorn optimal-transport regularizer on hidden-state distributions that limits geometric drift. Furthermore, to measure how strongly the model's CoT encodes these principles, the authors introduce InfoNCE-based metrics built on the standard MI lower bound under matched negatives. Among these is the Sufficiency Index (SI), which enables principles to be selected and generated to maximize downstream performance before training begins.
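The paper's code is not reproduced here; the sketch below is only a minimal illustration of how a single-loop objective of this shape could be assembled in PyTorch. The function names, the embedding inputs, and the weights lam_mi and lam_ot are illustrative assumptions, not the authors' implementation.

```python
import math
import torch
import torch.nn.functional as F

def grpo_loss(logprobs, rewards):
    """Critic-free GRPO surrogate: group-relative advantages weight log-probs."""
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    return -(logprobs * adv.detach()).mean()

def symmetric_infonce(z_cot, z_principle, tau=0.1):
    """SAMI-style symmetric InfoNCE between CoT and principle embeddings.

    In-batch pairings act as matched negatives; the negated loss
    lower-bounds the mutual information between the two views.
    """
    z1 = F.normalize(z_cot, dim=-1)
    z2 = F.normalize(z_principle, dim=-1)
    logits = z1 @ z2.T / tau
    labels = torch.arange(z1.size(0), device=z1.device)
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.T, labels))

def sinkhorn_ot(h_cur, h_ref, eps=0.05, n_iters=50):
    """Entropic OT cost between hidden-state samples via log-domain Sinkhorn."""
    cost = torch.cdist(h_cur, h_ref) ** 2
    n, m = cost.shape
    log_a = torch.full((n,), -math.log(n))  # uniform source marginal
    log_b = torch.full((m,), -math.log(m))  # uniform target marginal
    f, g = torch.zeros(n), torch.zeros(m)   # dual potentials
    for _ in range(n_iters):
        f = -eps * torch.logsumexp((g[None, :] - cost) / eps + log_b[None, :], dim=1)
        g = -eps * torch.logsumexp((f[:, None] - cost) / eps + log_a[:, None], dim=0)
    plan = torch.exp((f[:, None] + g[None, :] - cost) / eps
                     + log_a[:, None] + log_b[None, :])
    return (plan * cost).sum()

def enigma_objective(logprobs, rewards, z_cot, z_principle, h_cur, h_ref,
                     lam_mi=0.1, lam_ot=0.01):
    """Single-loop objective: GRPO term + InfoNCE auxiliary + OT regularizer."""
    return (grpo_loss(logprobs, rewards)
            + lam_mi * symmetric_infonce(z_cot, z_principle)
            + lam_ot * sinkhorn_ot(h_cur, h_ref))
```

Under this reading, the Sinkhorn term penalizes drift of the current hidden-state distribution away from a reference (e.g., pre-update) distribution, while the InfoNCE term pulls CoT representations toward the active principle.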

Takeaways, Limitations

Takeaways:
Presents a novel approach that improves reasoning, alignment, and robustness in LLM training by projecting all three into a single information-geometric objective.
Develops a methodology that enables principle-grounded reasoning without a separate reward model.
Introduces the Sufficiency Index (SI) metric, which predicts downstream performance before training.
Demonstrates improved benchmark performance and more stable training dynamics than GRPO ablations on a small LLM (1B).
Validates, via information-geometric analysis of the trained models, that the manifold undergoes the desired structural changes.
Limitations:
The experiments presented are limited to a small LLM (1B).
Performance on large-scale models remains to be verified.
Beyond this, the paper does not explicitly discuss its Limitations.