This paper studies parameter-efficient fine-tuning (PEFT) techniques, particularly adapter-based methods, for large-scale music generation models such as MusicGen and Mustango. We explore optimal adapter designs by comparing various adapter configurations (architecture, layout, and size) for two resource-sparse music genres: Hindustani classical music and Turkish Makam music. We find that convolution-based adapters excel at capturing fine-grained musical details, while transformer-based adapters better preserve long-term dependencies. Furthermore, we find that a medium-sized adapter (40M parameters) offers the best balance between expressiveness and quality. Mustango (a diffusion-based model) offers excellent diversity but suffers from instability, while MusicGen (an autoregressive model) trains quickly and yields high-quality, albeit somewhat redundant, artifacts.