Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Geometric-Aware Variational Inference: Robust and Adaptive Regularization with Directional Weight Uncertainty

Created by
  • Haebom

Author

Carlos Stein Brito

Outline

This paper proposes Concentration-Adapted Perturbations (CAP), a variational framework that models weight uncertainty directly on the unit hypersphere using the von Mises-Fisher (vMF) distribution. It addresses the mismatch between the isotropic Gaussian approximations used by existing variational inference methods and the intrinsic geometry of neural network weight space. Building on recent work on radial-directional posterior decomposition and spherical weight constraints, CAP provides the first complete theoretical framework connecting directional statistics to practical noise regularization in neural networks. The key contribution is an analytical derivation linking the vMF concentration parameter to activation noise variance, which lets each layer learn an optimal uncertainty level through a novel closed-form KL divergence regularizer. In experiments on CIFAR-10, CAP significantly improves model calibration, reducing expected calibration error by a factor of 5.6, while providing interpretable layer-wise uncertainty profiles. CAP adds minimal computational overhead and integrates seamlessly into standard architectures, offering a theoretically grounded and practical approach to uncertainty quantification in deep learning.
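
As a rough illustration of the mechanism described above, the following minimal PyTorch sketch keeps each weight vector on the unit hypersphere and injects activation noise whose variance shrinks with a learnable per-layer concentration parameter. The layer name `CAPLinear`, the `log_kappa` parameterization, and the large-concentration approximation (tangent noise variance on the order of 1/kappa, hence activation noise standard deviation roughly ||x||/sqrt(kappa)) are assumptions for illustration, not the paper's actual implementation.

```python
# Illustrative sketch only, NOT the paper's implementation.
# Assumes the analytical kappa -> activation-noise link behaves like the
# large-kappa vMF approximation (tangent variance ~ 1/kappa).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CAPLinear(nn.Module):  # hypothetical name
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        # One learnable concentration per layer, log-parameterized for positivity.
        self.log_kappa = nn.Parameter(torch.tensor(4.0))

    def forward(self, x):
        # Constrain each output unit's weight vector to the unit hypersphere.
        w_dir = F.normalize(self.weight, dim=1)
        out = F.linear(x, w_dir, self.bias)
        if self.training:
            kappa = self.log_kappa.exp()
            # Assumed link: vMF directional jitter with concentration kappa
            # induces activation noise with std ~ ||x|| / sqrt(kappa).
            noise_std = x.norm(dim=-1, keepdim=True) / kappa.sqrt()
            out = out + noise_std * torch.randn_like(out)
        return out
```

In training, the paper's closed-form KL term (a reference form is given after the Takeaways list below) would be added to the loss, pulling each layer's concentration toward an appropriate uncertainty level rather than letting the noise collapse to zero.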

Takeaways, Limitations

Takeaways:
Presents CAP, a novel variational inference framework for effectively modeling weight uncertainty in neural networks.
Respects the geometric structure of neural network weights by using the von Mises-Fisher distribution.
Derives a closed-form KL divergence regularizer that lets each layer learn its optimal uncertainty level (a reference form is given after this list).
Substantially improves model calibration and yields interpretable layer-wise uncertainty profiles.
Integrates easily into standard architectures with minimal computational overhead.
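
For reference, one standard closed form in directional statistics, which the paper's layer-wise regularizer presumably relates to (an assumption; the exact regularizer in the paper may differ), is the KL divergence between a d-dimensional vMF distribution and the uniform distribution on the sphere S^{d-1}:

```latex
\mathrm{KL}\!\left(\mathrm{vMF}(\mu,\kappa)\,\|\,\mathcal{U}(S^{d-1})\right)
  = \kappa\,A_d(\kappa) + \log C_d(\kappa) + \log\frac{2\pi^{d/2}}{\Gamma(d/2)},
\qquad
A_d(\kappa) = \frac{I_{d/2}(\kappa)}{I_{d/2-1}(\kappa)},
\qquad
C_d(\kappa) = \frac{\kappa^{d/2-1}}{(2\pi)^{d/2}\, I_{d/2-1}(\kappa)},
```

where I_v is the modified Bessel function of the first kind. As kappa goes to 0 the vMF approaches the uniform distribution and the KL vanishes, while larger kappa (more confident weight directions) incurs a larger penalty.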
Limitations:
Experiments are reported only on CIFAR-10; validation on other datasets and more complex models is still needed.
The generalization performance of the proposed method requires further study.
More thorough comparison with other uncertainty quantification methods is needed.