In this paper, we propose USP-Gaussian, a novel method for 3D reconstruction with spike cameras. Existing spike-based 3D reconstruction methods use a cascaded pipeline that sequentially performs image reconstruction, camera pose estimation, and 3D reconstruction from spike streams, so the errors of each stage accumulate in the next. USP-Gaussian addresses this problem with an end-to-end framework that unifies spike-based image reconstruction, pose correction, and Gaussian splatting. By leveraging the multi-view consistency of 3D Gaussian splatting (3DGS) and the motion capture capability of spike cameras, it performs an iterative optimization that seamlessly exchanges information between the spike-to-image network and 3DGS. Experimental results on synthetic datasets show that our method outperforms existing methods, and in real-world scenes it achieves robust 3D reconstruction even when the initial poses are inaccurate.