Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

MathBode: Frequency-Domain Fingerprints of LLM Mathematical Reasoning

Created by
  • Haebom

Author

Charles L. Wang

MathBode: A Dynamic Diagnostic for Mathematical Reasoning

Outline

This paper presents MathBode, a dynamic diagnostic for mathematical reasoning in large language models (LLMs). Instead of measuring one-shot accuracy, MathBode treats each parametric problem as a system: it drives a single parameter sinusoidally and fits the first-harmonic responses of the model outputs and of the exact solutions. This yields interpretable, frequency-resolved metrics, namely gain (amplitude tracking) and phase (lag), which together form a Bode-style fingerprint. Across five closed-form problem families (linear equations, ratio/saturation, compound interest, 2x2 linear systems, and similar triangles), the diagnostic reveals systematic low-pass behavior and growing phase lag that accuracy alone does not detect. Several models are compared against a symbolic baseline that calibrates the instrument ($G \approx 1$, $\phi \approx 0$). The results separate frontier models from mid-tier models on dynamics, giving a compact, reproducible protocol that complements standard benchmarks with actionable measures of reasoning fidelity and consistency. The dataset and code are publicly available to support further research and adoption.
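The gain and phase measurement described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes uniform sampling over a whole number of drive cycles, in which case projecting onto a complex exponential is equivalent to a least-squares first-harmonic fit. The function names (`first_harmonic`, `gain_and_phase`, `demo`) and the synthetic "lagging" response are hypothetical and chosen only for illustration.

```python
import cmath
import math


def first_harmonic(signal, cycles):
    """Complex first-harmonic amplitude of `signal` at `cycles` per window.

    Assumes uniform samples spanning a whole number of cycles, with
    0 < cycles < len(signal) / 2, so the projection below matches a
    least-squares fit of a*sin(theta) + b*cos(theta).
    """
    n = len(signal)
    mean = sum(signal) / n
    acc = 0j
    for k, y in enumerate(signal):
        theta = 2.0 * math.pi * cycles * k / n
        acc += (y - mean) * cmath.exp(-1j * theta)
    return 2.0 * acc / n


def gain_and_phase(model_out, exact_out, cycles):
    """Gain G and phase phi (degrees) of the model relative to the exact solution.

    G < 1 means the model under-tracks the amplitude; phi < 0 means it lags.
    Raises ZeroDivisionError if the exact solution has no first harmonic.
    """
    ratio = first_harmonic(model_out, cycles) / first_harmonic(exact_out, cycles)
    return abs(ratio), math.degrees(cmath.phase(ratio))


def demo(n=64, cycles=4):
    """Synthetic check: an exact baseline and an attenuated, delayed response."""
    thetas = [2.0 * math.pi * cycles * k / n for k in range(n)]
    # Exact solution of a toy family y = 3p + 2 with p = 1 + 0.5*sin(theta).
    exact = [5.0 + 1.5 * math.sin(t) for t in thetas]
    # Hypothetical model output: 80% amplitude, 0.3 rad behind the drive.
    lagging = [5.0 + 0.8 * 1.5 * math.sin(t - 0.3) for t in thetas]
    baseline = gain_and_phase(exact, exact, cycles)
    model = gain_and_phase(lagging, exact, cycles)
    return baseline, model


if __name__ == "__main__":
    (g0, p0), (g1, p1) = demo()
    print(f"baseline: G={g0:.3f}, phi={p0:.2f} deg")
    print(f"lagging:  G={g1:.3f}, phi={p1:.2f} deg")
```

Repeating the measurement over several drive frequencies and plotting gain and phase against frequency would give the Bode-style fingerprint; a response that tracks the exact solution perfectly sits at $G \approx 1$, $\phi \approx 0$, like the symbolic baseline.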

Takeaways, Limitations

Takeaways:
It provides a dynamic analysis of LLMs' mathematical reasoning that identifies issues not revealed by accuracy alone.
It uses Bode-style fingerprints to compare and evaluate models visually.
It complements standard benchmarks with actionable measures of reasoning fidelity and consistency.
Its dataset and code are open source to facilitate further research and use.
Limitations:
It covers only five closed-form families of mathematical problems, so its general applicability may be limited.
It characterizes how a model behaves, but may not identify the root cause of that behavior.