This paper addresses visual place recognition (VPR), which is essential for the safe navigation of mobile robots. A deep learning model is fine-tuned on panoramic images with a triplet loss function combined with a curriculum learning strategy: by progressively presenting more challenging examples during training, the model learns more discriminative and robust feature representations, overcoming the limitations of conventional contrastive loss functions. After training, VPR is performed in two stages: coarse (room retrieval) and fine (localization within the retrieved room). The proposed method is evaluated in various indoor and outdoor environments and tested against common challenges encountered in real-world operating conditions, such as severe lighting changes, dynamic visual effects including noise and occlusion, and limited training data.
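The curriculum-driven triplet training amounts to scheduling the difficulty of the negatives presented to the loss. Below is a minimal PyTorch sketch of that idea; the function name, the margin value, the linear easy-to-hard schedule, and the batch shapes are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def curriculum_triplet_loss(anchor, positives, negatives, epoch, total_epochs, margin=0.5):
    """Triplet loss with a curriculum over negative difficulty (illustrative sketch).

    anchor:    (B, D) embeddings of the query images
    positives: (B, D) embeddings of same-place images
    negatives: (B, K, D) embeddings of K candidate different-place images
    Early epochs use the easiest negative (farthest from the anchor);
    later epochs progressively move toward the hardest (closest) one.
    """
    d_pos = F.pairwise_distance(anchor, positives)                   # (B,)
    d_neg = torch.cdist(anchor.unsqueeze(1), negatives).squeeze(1)   # (B, K)

    # Curriculum progress in [0, 1]: 0 at the first epoch, 1 at the last.
    progress = epoch / max(total_epochs - 1, 1)

    # Rank negatives from easiest (largest distance) to hardest (smallest),
    # then pick the one matching the current difficulty level.
    sorted_d_neg, _ = torch.sort(d_neg, dim=1, descending=True)
    idx = int(round(progress * (sorted_d_neg.size(1) - 1)))
    selected_neg = sorted_d_neg[:, idx]                              # (B,)

    # Standard triplet hinge on the selected negatives.
    return F.relu(d_pos - selected_neg + margin).mean()

if __name__ == "__main__":
    torch.manual_seed(0)
    B, K, D = 8, 10, 256
    anchor = torch.randn(B, D)
    positives = torch.randn(B, D)
    negatives = torch.randn(B, K, D)
    for epoch in (0, 15, 29):
        loss = curriculum_triplet_loss(anchor, positives, negatives, epoch, 30)
        print(f"epoch {epoch:2d}: loss {loss.item():.4f}")
```

In this sketch the schedule selects a single negative per anchor at the current difficulty level; other schedules (e.g. widening the pool of admissible negatives over time) would realize the same easy-to-hard principle.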
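The two-stage retrieval can likewise be sketched as a hierarchical nearest-neighbour search over the learned global descriptors: a coarse comparison against one representative descriptor per room, followed by a fine match among that room's map images. The room-centroid representation and Euclidean matching below are assumptions for illustration, not necessarily the paper's exact procedure.

```python
import numpy as np

def two_stage_localization(query_desc, room_centroids, room_maps):
    """Coarse-to-fine VPR sketch under assumed representations.

    query_desc:     (D,) global descriptor of the query image
    room_centroids: dict room_id -> (D,) representative descriptor per room
    room_maps:      dict room_id -> (N_i, D) descriptors of that room's map images
    Returns the retrieved room and the index of the best-matching map image.
    """
    # Coarse stage: pick the room whose representative descriptor is closest.
    best_room = min(
        room_centroids,
        key=lambda r: np.linalg.norm(room_centroids[r] - query_desc),
    )
    # Fine stage: nearest neighbour among the map images of that room.
    dists = np.linalg.norm(room_maps[best_room] - query_desc, axis=1)
    return best_room, int(np.argmin(dists))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rooms = [f"room{i}" for i in range(3)]
    centroids = {r: rng.normal(size=256) for r in rooms}
    maps = {r: rng.normal(size=(20, 256)) for r in rooms}
    query = rng.normal(size=256)
    print(two_stage_localization(query, centroids, maps))
```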