Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Precise Bayesian Neural Networks

Created by
  • Haebom

Author

Carlos Stein Brito

Outline

This paper argues that Bayesian neural networks (BNNs) remain under-used because the standard Gaussian posterior is mismatched with the network's geometry, the KL term is unstable in high dimensions, and uncertainty calibration stays unreliable despite the added implementation complexity. The authors revisit the problem from a regularization perspective and model uncertainty with a von Mises-Fisher posterior that depends only on the weight direction. This yields a single interpretable scalar per layer, the effective regularized noise ($\sigma_{\mathrm{eff}}$), which corresponds to simple additive Gaussian noise in the forward pass and admits a compact, closed-form, dimension-aware KL correction. By deriving a closed-form relation between the concentration $\kappa$, the activation variance, and $\sigma_{\mathrm{eff}}$, they obtain a lightweight, easily implemented variational unit that fits modern regularized architectures and improves calibration without sacrificing accuracy. Dimension awareness proves crucial for stable optimization in high dimensions, and the results show that aligning the variational posterior with the network's intrinsic geometry makes BNNs principled, practical, and accurate.
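
The sketch below illustrates the general idea described above, not the authors' actual code: a per-layer "variational unit" that carries one learnable effective-noise scalar ($\sigma_{\mathrm{eff}}$), injects it as additive Gaussian noise on the layer's pre-activations during training, and contributes a dimension-scaled regularization penalty. The class name `NoisyLinear`, the pre-activation scaling, and the placeholder penalty are assumptions of this sketch; the paper's exact $\kappa \leftrightarrow \sigma_{\mathrm{eff}}$ mapping and closed-form vMF KL term are not reproduced here.

```python
# Hypothetical sketch of a per-layer variational unit with a single
# effective-noise scalar; NOT the paper's reference implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyLinear(nn.Module):
    """Linear layer with one learnable effective-noise scale per layer."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Parameterize sigma_eff through its log to keep it positive.
        self.log_sigma_eff = nn.Parameter(torch.tensor(-3.0))

    @property
    def sigma_eff(self) -> torch.Tensor:
        return self.log_sigma_eff.exp()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.linear(x)
        if self.training:
            # Uncertainty enters the forward pass as simple additive Gaussian
            # noise; scaling by the pre-activation standard deviation keeps the
            # scalar comparable across layers (an assumption of this sketch).
            scale = out.detach().std() + 1e-8
            out = out + self.sigma_eff * scale * torch.randn_like(out)
        return out

    def kl_penalty(self) -> torch.Tensor:
        # Placeholder dimension-aware penalty: grows with fan-in and with
        # smaller noise (i.e., higher concentration); it stands in for the
        # paper's closed-form vMF KL correction.
        d = self.linear.in_features
        return -d * torch.log(self.sigma_eff + 1e-8)


# Hypothetical usage: task loss plus a weighted sum of per-layer penalties.
layer = NoisyLinear(128, 64)
x = torch.randn(32, 128)
loss = F.mse_loss(layer(x), torch.zeros(32, 64)) + 1e-4 * layer.kl_penalty()
loss.backward()
```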

Takeaways, Limitations

Takeaways:
Shows that a von Mises-Fisher posterior over weight directions enables stable and efficient Bayesian neural network training even in high dimensions.
Improves interpretability by expressing per-layer uncertainty as a single interpretable scalar, the effective regularized noise ($\sigma_{\mathrm{eff}}$).
Provides lightweight variational units applicable to modern regularized neural network architectures.
Improves calibration without degrading accuracy.
Limitations:
Further verification is needed to determine whether the modeling assumptions behind the von Mises-Fisher posterior hold for all types of neural network architectures.
Further experiments are needed to determine how well the proposed method generalizes across different datasets and tasks.
Further analysis is needed of the accuracy of the closed-form approximation relating $\kappa$, the activation variance, and $\sigma_{\mathrm{eff}}$.