This paper addresses two challenges faced by existing methods that directly align diffusion models with human preferences: computational cost and the need for continuous offline adaptation of the reward model. First, existing methods compute gradients through multistep denoising, which is expensive and restricts optimization to a small number of diffusion steps. Second, they require continuous offline adaptation of the reward model to achieve the desired quality, such as photorealistic images and accurate lighting effects. To overcome the limitations of multistep denoising, the paper proposes Direct-Align, a method that predefines a noise prior and recovers the original image from any time step through interpolation. It further introduces Semantic Relative Preference Optimization (SRPO), which formulates rewards as text-conditioned signals. This allows rewards to be adjusted online through positive and negative prompt augmentation, reducing the reliance on offline reward fine-tuning. Fine-tuning the FLUX model with optimized denoising and online reward adjustment improves human-rated realism and aesthetic quality by more than a factor of three.
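
The two ideas summarized above can be illustrated with a minimal sketch. It assumes that a noisy state is a linear blend of the clean image and a predefined noise sample, x_t = (1 - t) * x_0 + t * eps, so that knowing eps lets the image be recovered from any time step in a single step. It also assumes that a relative preference score can be written as the difference between an image's similarity to a positive prompt and to a negative prompt. The linear schedule, the cosine-similarity score, and all function names are hypothetical choices made for illustration, not details taken from the paper.

```python
import numpy as np


def add_known_noise(x0, eps, t):
    """Blend a clean image x0 with a predefined noise sample eps.

    Assumed schedule (illustrative): x_t = (1 - t) * x0 + t * eps, t in [0, 1).
    """
    return (1.0 - t) * x0 + t * eps


def recover_image(x_t, eps, t):
    """Invert the blend in one step, using the same eps and t that produced x_t."""
    if not 0.0 <= t < 1.0:
        raise ValueError("t must lie in [0, 1); at t = 1 the image is fully lost")
    return (x_t - t * eps) / (1.0 - t)


def relative_preference_score(image_emb, pos_text_emb, neg_text_emb):
    """Hypothetical relative score: similarity to the positive prompt minus
    similarity to the negative prompt (cosine similarity is an assumption)."""

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    return cosine(image_emb, pos_text_emb) - cosine(image_emb, neg_text_emb)


# Round trip: noise an image with a fixed eps, then recover it from t = 0.7.
rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(3, 8, 8))   # stand-in for a clean image
eps = rng.standard_normal(x0.shape)           # the predefined noise sample
x_t = add_known_noise(x0, eps, t=0.7)
x0_hat = recover_image(x_t, eps, t=0.7)
max_error = float(np.max(np.abs(x0 - x0_hat)))  # only floating-point error remains
```

Because the recovery is a closed-form inversion rather than an iterative denoising loop, a score computed on the recovered image needs no gradient through multiple denoising steps, which is the cost the summary identifies in earlier methods. In an actual model the noise would only be approximately known from the network's prediction, so the recovery would not be exact as it is in this toy round trip.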