Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Loss-Complexity Landscape and Model Structure Functions

Created by
  • Haebom

Author

Alexander Kolpakov

Outline

We develop a framework for dualizing the Kolmogorov structure function $h_x(\alpha)$, which makes it possible to use computable complexity proxies. We establish a mathematical analogy between information-theoretic constructs and statistical mechanics, and introduce suitable partition functions and free energy functionals. We explicitly prove the Legendre-Fenchel duality between the structure function and the free energy, show detailed balance of the Metropolis kernel, and interpret acceptance probabilities as information-theoretic scattering amplitudes. A susceptibility-like variance of model complexity is shown to peak precisely at loss-complexity trade-offs, which are interpreted as phase transitions. Practical experiments with linear and tree-based regression models verify these theoretical predictions and explicitly demonstrate the interplay between model complexity, generalization, and overfitting thresholds.
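The duality and the variance peak described above can be illustrated on a toy example. The following is a minimal sketch under stated assumptions, not the paper's construction: the three (complexity, loss) pairs are hypothetical, the "free energy" is taken in its zero-temperature form as the minimum of loss + lam * complexity, and the Gibbs weights exp(-beta * (loss + lam * complexity)) are one plausible choice of partition function. The summary does not give the paper's exact definitions, so the function names and parameters here are illustrative only.

```python
import math

# Hypothetical toy model class: (complexity, loss) pairs, not taken from the paper.
MODELS = [(1, 5.0), (4, 2.0), (9, 1.5)]


def structure_function(alpha):
    """Smallest loss among models with complexity <= alpha (inf if none qualify)."""
    losses = [loss for c, loss in MODELS if c <= alpha]
    return min(losses) if losses else math.inf


def free_energy(lam):
    """Assumed zero-temperature free energy: min over models of loss + lam * complexity."""
    return min(loss + lam * c for c, loss in MODELS)


def dual_of_structure_function(lam):
    """Legendre-Fenchel-type transform of h: min over alpha of h(alpha) + lam * alpha."""
    alphas = [c for c, _ in MODELS]
    return min(structure_function(a) + lam * a for a in alphas)


def complexity_variance(lam, beta=5.0):
    """Variance of complexity under Gibbs weights exp(-beta * (loss + lam * c))."""
    energies = [loss + lam * c for c, loss in MODELS]
    e_min = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    mean = sum(w * c for w, (c, _) in zip(weights, MODELS)) / z
    second = sum(w * c * c for w, (c, _) in zip(weights, MODELS)) / z
    return second - mean * mean


if __name__ == "__main__":
    for lam in (0.05, 0.1, 0.5, 1.0, 2.0):
        print(lam, free_energy(lam), dual_of_structure_function(lam),
              complexity_variance(lam))
```

In this toy setting the two transforms agree for every lam >= 0, and the complexity variance is largest near the values of lam at which the minimizing model switches (lam = 0.1 and lam = 1.0 for these pairs), which is the behavior the summary describes as a phase transition.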

Takeaways, Limitations

Takeaways: We show that the duality framework for Kolmogorov structure functions can be used to analyze and quantify the relationship between model complexity, generalization performance, and overfitting from both information-theoretic and statistical-mechanics perspectives. We present a novel perspective that interprets the loss-complexity trade-off as a phase transition, and support it rigorously by proving detailed balance of the Metropolis kernel and interpreting its acceptance probabilities as information-theoretic scattering amplitudes. Experiments confirm the theoretical predictions.
Limitations: The experiments are limited to linear and tree-based regression models, so generalizability to other model types requires further study. The study does not analyze how the characteristics of the datasets used for validation affect the results. Experimental validation on more diverse datasets and models is needed.