In this paper, we present a novel self-supervised latent acoustic mapping (LAM) model, an acoustic mapping technique for direction-of-arrival estimation (DoAE) in spatial audio processing that overcomes the limitations of both conventional beamforming techniques and recent supervised deep learning approaches. The LAM model combines the interpretability of conventional methods with the adaptability and efficiency of deep learning, generating high-resolution acoustic maps and operating efficiently across diverse acoustic conditions and microphone array configurations. We evaluate its robustness for DoAE on the LOCATA and STARSS benchmarks, where it achieves localization performance equivalent or superior to that of supervised deep learning methods; moreover, the generated acoustic maps can serve as input features for supervised learning models, further improving DoAE accuracy.