This paper systematically evaluates nine state-of-the-art deep denoising models (including Neighbor2Neighbor, Blind2Unblind, and DSPNet) as a preprocessing step for sonar images. The goal is to address the loss of object detection accuracy that complex noise such as speckle, echo, and non-Gaussian noise causes for underwater robots engaged in autonomous navigation and resource exploration. Using five public sonar datasets and four representative object detection algorithms (YOLOX, Faster R-CNN, SSD300, and SSDMobileNetV2), we examine three questions: how effective denoising models designed for optical images are on sonar data, which model is best suited to sonar noise, and whether denoising improves detection accuracy in real pipelines. The experimental results show that denoising generally improves detection performance, but the size of the effect varies because each model is inherently biased toward particular noise types. Motivated by this variation, we propose a cross-supervised multi-source denoising fusion framework, in which the outputs of multiple denoisers supervise one another at the pixel level to produce cleaner images.
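
To make the idea of pixel-level mutual supervision concrete, the following is a minimal sketch and not the framework proposed in this paper. It assumes that each denoiser's output is compared against the pixel-wise mean of the other denoisers' outputs using a mean squared error, and that the final image is a plain pixel-wise average. The names `peer_mean`, `cross_supervision_loss`, and `fuse_mean` are hypothetical, as are the loss and the fusion rule.

```python
from typing import List

# A grayscale image as rows of pixel intensities (illustrative representation).
Image = List[List[float]]


def peer_mean(outputs: List[Image], k: int) -> Image:
    """Pixel-wise mean of every denoiser output except the k-th."""
    peers = [img for i, img in enumerate(outputs) if i != k]
    rows, cols = len(peers[0]), len(peers[0][0])
    return [
        [sum(p[r][c] for p in peers) / len(peers) for c in range(cols)]
        for r in range(rows)
    ]


def cross_supervision_loss(outputs: List[Image], k: int) -> float:
    """Assumed loss: mean squared difference between output k and its peers' mean."""
    target = peer_mean(outputs, k)
    rows, cols = len(target), len(target[0])
    total = sum(
        (outputs[k][r][c] - target[r][c]) ** 2
        for r in range(rows)
        for c in range(cols)
    )
    return total / (rows * cols)


def fuse_mean(outputs: List[Image]) -> Image:
    """Assumed fusion rule: pixel-wise mean across all denoiser outputs."""
    rows, cols = len(outputs[0]), len(outputs[0][0])
    n = len(outputs)
    return [
        [sum(o[r][c] for o in outputs) / n for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    # Toy 2x2 outputs from three hypothetical denoisers; the third disagrees
    # with the other two at the top-left pixel.
    outputs = [
        [[1.0, 2.0], [3.0, 4.0]],
        [[1.0, 2.0], [3.0, 4.0]],
        [[2.0, 2.0], [3.0, 4.0]],
    ]
    print(cross_supervision_loss(outputs, 2))  # 0.25
    print(fuse_mean(outputs))
```

In this toy setting the denoiser that disagrees with its peers receives the largest loss, which is the intuition behind letting several models with different noise biases correct one another. In a real pipeline the loss would be computed on tensors inside a training loop, and the fusion rule could be learned rather than fixed.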