The remarkable performance of overparameterized deep neural networks (DNNs) arises from the interplay between the network architecture, the training algorithm, and the structure of the data. This paper disentangles these three components by applying a Bayesian perspective to supervised learning. The prior over functions is determined by the network architecture and is varied by exploiting transitions between ordered and chaotic regimes. For Boolean function classification, the likelihood is approximated using the error spectrum of functions on the data. Combined with the prior, this accurately predicts the measured posterior probabilities for DNNs trained with stochastic gradient descent. The analysis reveals that structured data, together with an intrinsic Occam's-razor-like inductive bias toward (Kolmogorov) simple functions that is strong enough to counteract the exponential growth of the number of functions with complexity, is a critical factor in the success of DNNs.
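As a minimal sketch of the Bayesian picture invoked above (the symbols $f$, $D$, and $\varepsilon_D(f)$ are illustrative placeholders, not necessarily the paper's own notation): the posterior over Boolean functions follows from Bayes' theorem, with the prior $P(f)$ set by the network architecture and the likelihood approximated as a function of the errors that $f$ makes on the training data $D$,
\[
  P(f \mid D) \;=\; \frac{P(D \mid f)\, P(f)}{\sum_{f'} P(D \mid f')\, P(f')},
  \qquad
  P(D \mid f) \;\approx\; \mathcal{L}\!\big(\varepsilon_D(f)\big),
\]
where $\varepsilon_D(f)$ denotes the error spectrum of $f$ on $D$ and $\mathcal{L}$ is an assumed error-dependent likelihood model; the claim in the abstract is that this posterior tracks the empirical probabilities of functions found by SGD-trained DNNs.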