Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Deep neural networks have an inbuilt Occam's razor

Created by
  • Haebom

Author

Chris Mingard, Henry Rees, Guillermo Valle-Perez, Ard A. Louis

Outline

The remarkable performance of overparameterized deep neural networks (DNNs) arises from the interplay between the network architecture, the training algorithm, and the structure of the data. This paper disentangles these three components by adopting a Bayesian perspective on supervised learning. The prior over functions is determined by the network architecture and can be varied by exploiting transitions between ordered and chaotic regimes. For Boolean function classification, the likelihood is approximated using the error spectrum of each function on the data. Combined with the prior, this accurately predicts the measured posterior probabilities for DNNs trained with stochastic gradient descent (SGD). The analysis reveals that an inherent Occam's-razor-like inductive bias toward (Kolmogorov) simple functions, strong enough to counteract the exponential growth in the number of functions with complexity, together with the structure of the data, is a critical factor in the success of DNNs.
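To make the Bayesian decomposition concrete, the following is a minimal, hypothetical Python sketch (not the authors' code): it estimates a prior P(f) over small Boolean functions by sampling random parameters of a toy MLP, then forms a posterior P(f | D) ∝ P(D | f) P(f) using a simple 0-1 likelihood as a crude stand-in for the error-spectrum approximation described above. The network sizes, sample counts, target function, and observed inputs are all illustrative assumptions.

```python
# A minimal, hypothetical sketch (not the authors' code): estimate a prior P(f)
# over n-bit Boolean functions by sampling random parameters of a toy MLP, then
# form a Bayesian posterior P(f | D) ∝ P(D | f) P(f) using a simple 0-1
# likelihood in place of the paper's error-spectrum approximation.

import itertools
from collections import Counter

import numpy as np

N_BITS = 3          # input dimension (assumption)
N_SAMPLES = 20000   # number of random parameter draws (assumption)
HIDDEN = 16         # hidden width of the toy MLP (assumption)

rng = np.random.default_rng(0)
inputs = np.array(list(itertools.product([0.0, 1.0], repeat=N_BITS)))  # all 2^n inputs


def sample_function():
    """Draw random weights for a one-hidden-layer ReLU MLP and return the
    Boolean function it realizes as a tuple of 0/1 outputs over all inputs."""
    w1 = rng.normal(size=(N_BITS, HIDDEN))
    b1 = rng.normal(size=HIDDEN)
    w2 = rng.normal(size=(HIDDEN, 1))
    b2 = rng.normal(size=1)
    hidden = np.maximum(0.0, inputs @ w1 + b1)
    return tuple(((hidden @ w2 + b2).ravel() > 0.0).astype(int))


# Empirical prior: how often each Boolean function appears under random sampling.
prior = Counter(sample_function() for _ in range(N_SAMPLES))

# Training data D: labels of a simple target function on a subset of inputs
# (both the target and the observed indices are illustrative assumptions).
target = tuple(int(x[0]) for x in inputs)   # "output = first input bit"
train_idx = [0, 1, 2, 3, 4]

# 0-1 likelihood: keep only functions that agree with D on the observed inputs.
posterior = {f: p for f, p in prior.items()
             if all(f[i] == target[i] for i in train_idx)}
Z = sum(posterior.values())

if Z == 0:
    print("No sampled function is consistent with D; draw more samples.")
else:
    posterior = {f: p / Z for f, p in posterior.items()}
    # Functions with higher prior mass (typically simpler ones) dominate.
    for f, p in sorted(posterior.items(), key=lambda kv: -kv[1])[:5]:
        print(f, round(p, 4))
```

In this toy setup, the sampled prior tends to concentrate on a small number of simple functions, so the posterior is dominated by the simplest functions consistent with the data, which is the qualitative behavior the paper's Occam's-razor argument describes.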

Takeaways, Limitations

Takeaways: Provides a new understanding of why overparameterized DNNs succeed. Shows that the structure of the data and the network's inherent simplicity bias play a central role. Presents a Bayesian methodology for analyzing the learning behavior of DNNs.
Limitations: The analysis is restricted to Boolean function classification, so generalization to complex real-world datasets may be limited. The approximation of the posterior probabilities may depend on the specific DNN architecture and training algorithm.