Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Discovering Hidden Algebraic Structures via Transformers with Rank-Aware Beam GRPO

Created by
  • Haebom

Author

Jaeha Lee, Gio Huh, Ning Su, Tony Yue YU

Outline

This paper explores the solution of multivariable polynomial decomposition problems using the Transformer model. Polynomial decomposition, while widely applied in science and engineering, is known to be NP-hard and requires high accuracy and insight. This study develops a synthetic data generation pipeline that allows for fine-grained control of problem complexity and trains a Transformer model using supervised learning to evaluate scaling behavior and generalization performance. Furthermore, we propose Beam Grouped Relative Policy Optimization (BGRPO), a hierarchy-aware reinforcement learning method suitable for difficult algebraic problems. Fine-tuning using BGRPO improves accuracy and reduces the beam width by up to half, reducing the inference workload by approximately 75%. Furthermore, the proposed model outperforms Mathematica in polynomial simplification.

Takeaways, Limitations

Takeaways:
We show that the Transformer model can be applied to nonlinear latent pattern discovery and multivariate polynomial decomposition, which is an NP-hard problem.
We present a method to efficiently reduce the amount of inference computation through the BGRPO algorithm.
Achieves superior performance over traditional Mathematica in polynomial simplification tasks.
Provides a synthetic data generation pipeline that can control the complexity of polynomial decomposition problems.
Limitations:
Performance evaluations have mainly been conducted on synthetic data, and generalization performance on real data requires additional verification.
The performance of the BGRPO algorithm may depend on the specific problem type and settings.
Performance on very complex polynomials requires further study.
👍