This paper presents an effective method for fine-tuning pre-trained generative models with reinforcement learning (RL) to match complex human preferences. In particular, we focus on fine-tuning a next-generation visual autoregressive (VAR) model using group relative policy optimization (GRPO). Experimental results show that aligning the model to complex reward signals derived from an aesthetic predictor and CLIP embeddings significantly improves image quality and enables precise control over the generated style. Leveraging CLIP helps the VAR model generalize beyond its initial ImageNet distribution, and through RL-based exploration the model can generate images tailored to prompts that refer to styles not present during pre-training. Overall, we demonstrate that RL-based fine-tuning is efficient and effective for VAR models and is advantageous over diffusion-based alternatives, especially thanks to its fast inference speed and its suitability for efficient online sampling.
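To make the setup concrete, the following is a minimal, illustrative sketch (not the authors' code) of a GRPO-style group-relative advantage computation using a combined reward from an aesthetic predictor and CLIP image-text similarity. The callables `aesthetic_score` and `clip_similarity`, as well as the weighting `clip_weight`, are hypothetical placeholders standing in for whichever scorers and hyperparameters are actually used.

```python
# Illustrative sketch, assuming a reward of the form
#   r_i = aesthetic(image_i) + clip_weight * clip_sim(image_i, prompt)
# for a group of G images sampled from the VAR model for one prompt.
import torch


def group_relative_advantages(
    images: torch.Tensor,      # (G, C, H, W): G samples drawn for one prompt
    prompt: str,
    aesthetic_score,           # placeholder: images -> (G,) aesthetic rewards
    clip_similarity,           # placeholder: (images, prompt) -> (G,) CLIP similarities
    clip_weight: float = 1.0,
    eps: float = 1e-8,
) -> torch.Tensor:
    """Return per-sample advantages normalized within the sampled group."""
    rewards = aesthetic_score(images) + clip_weight * clip_similarity(images, prompt)
    # GRPO-style baseline: center and scale rewards by the group's own statistics,
    # so no learned value function is required.
    return (rewards - rewards.mean()) / (rewards.std() + eps)
```

These advantages would then weight the policy-gradient update of the VAR model's token probabilities; the fast sampling of VAR models is what keeps drawing fresh groups of images online during training inexpensive.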