This paper presents a novel source separation model specialized for accurate vocal separation. To overcome the difficulty Transformer-based models have in capturing intermittent vocals, we leverage Mamba2, a state-of-the-art state-space model that better captures long-range temporal dependencies. To process long input sequences efficiently, we combine a band-splitting strategy with a dual-path architecture. Experimental results show that the proposed model outperforms current state-of-the-art models, achieving a best-in-class cSDR of 11.03 dB and delivering significant gains in uSDR as well. Furthermore, it maintains stable and consistent performance across a wide range of input lengths and vocal occurrence patterns. These results confirm the effectiveness of the Mamba-based model for high-resolution audio processing and suggest new directions for broader applications in audio research.