This paper proposes Stylus, a novel training-free framework that performs musical style transfer by directly manipulating the self-attention layers of a pre-trained latent diffusion model (LDM). Operating in the mel-spectrogram domain, Stylus transfers musical style by replacing the key and value representations of the content audio with those of a style reference, without any fine-tuning. It further integrates query preservation, CFG-inspired guidance scaling, multi-style interpolation, and phase-preserving reconstruction to enhance stylization quality and controllability. Stylus significantly improves perceptual quality and structure preservation over prior work while remaining lightweight and easy to deploy. This study highlights the potential of diffusion-based attention manipulation for efficient, high-fidelity, and interpretable music generation without additional training.
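To make the core mechanism concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of the attention manipulation described above: queries are kept from the content latent (query preservation), keys/values are taken from a weighted mix of style references (multi-style interpolation), and the result is extrapolated from the unmodified output in a CFG-inspired fashion. All names, shapes, and the guidance scale are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def kv_swap_attention(
    content_h: torch.Tensor,          # (B, N, C) hidden states of the content audio
    style_hs: list[torch.Tensor],     # list of (B, N, C) hidden states, one per style reference
    style_weights: list[float],       # interpolation weights (assumed to sum to 1)
    w_q: torch.Tensor,                # (C, C) query projection
    w_k: torch.Tensor,                # (C, C) key projection
    w_v: torch.Tensor,                # (C, C) value projection
) -> torch.Tensor:
    """Self-attention where Q comes from the content latent and K/V come from
    a weighted combination of style-reference latents."""
    q = content_h @ w_q                                              # queries preserved from content
    k = sum(w * (h @ w_k) for w, h in zip(style_weights, style_hs))  # keys from style references
    v = sum(w * (h @ w_v) for w, h in zip(style_weights, style_hs))  # values from style references
    return F.scaled_dot_product_attention(q, k, v)


def guided_output(plain_out: torch.Tensor, styled_out: torch.Tensor, scale: float) -> torch.Tensor:
    """CFG-inspired guidance: push the attention output from the unmodified
    result toward the style-injected result by a user-chosen scale."""
    return plain_out + scale * (styled_out - plain_out)


if __name__ == "__main__":
    B, N, C = 1, 64, 128
    content = torch.randn(B, N, C)
    styles = [torch.randn(B, N, C), torch.randn(B, N, C)]
    w_q, w_k, w_v = (torch.randn(C, C) * 0.02 for _ in range(3))

    plain = kv_swap_attention(content, [content], [1.0], w_q, w_k, w_v)   # vanilla self-attention
    styled = kv_swap_attention(content, styles, [0.7, 0.3], w_q, w_k, w_v)  # KV replaced by two styles
    out = guided_output(plain, styled, scale=1.5)
    print(out.shape)  # torch.Size([1, 64, 128])
```

In an actual LDM, such a swap would be applied inside selected self-attention blocks at each denoising step; the sketch above only isolates the per-layer computation.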