We present YOLO-DCAP, an enhanced version of YOLOv5 designed to address three challenges of object localization in satellite imagery: high variability of objects, low spatial resolution, and interference from noise such as clouds and city lights. YOLO-DCAP integrates two components: a Multi-scale Dilated Residual Convolution (MDRC) block, which captures multi-scale features using several dilation rates, and an Attention-aided Spatial Pooling (AaSP) module, which focuses on globally relevant spatial regions. Experiments on three satellite datasets (mesospheric bores, upper-atmosphere gravity waves, and ocean eddies) show that YOLO-DCAP improves mAP50 by an average of 20.95% and IoU by an average of 32.23% compared to the baseline YOLO model and state-of-the-art models. The consistent gains across the three datasets indicate that the proposed method is robust and generalizes across these domains. The code is publicly available.
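
For readers unfamiliar with the two reported metrics, the following minimal sketch illustrates how IoU is computed for a pair of boxes and how the 0.5 threshold implied by mAP50 decides whether a prediction counts as a match. It is an illustration under stated assumptions, not the evaluation code used in this work: the function names `box_iou` and `is_match_at_50` are hypothetical, and boxes are assumed to be axis-aligned and given as `(x1, y1, x2, y2)` corner coordinates.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes.

    Assumption: each box is (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])

    # Clamp at zero so disjoint boxes give zero intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate zero-area inputs.
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred, truth, threshold=0.5):
    """True if the predicted box overlaps the ground truth with IoU >= 0.5,
    the matching criterion behind mAP50 (hypothetical helper)."""
    return box_iou(pred, truth) >= threshold
```

mAP50 then averages precision over recall levels and classes using this matching criterion, while the IoU figure measures how tightly predicted boxes align with the ground truth.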