Despite advances in aligning image and video generation with human preferences via Group Relative Policy Optimization (GRPO), existing approaches remain inefficient due to sequential rollouts, excessive sampling steps, and sparse terminal rewards. In this paper, we propose BranchGRPO, which restructures the rollout process into a branching tree to share computation and eliminate low-value paths and redundant depths. BranchGRPO introduces a branching scheme that amortizes rollout cost across a shared prefix, a reward-fusion mechanism and depth-specific advantage estimator that convert sparse terminal rewards into dense step-level signals, and a pruning strategy that reduces gradient computation. On HPDv2.1 image alignment, BranchGRPO improves alignment scores by up to 16% over DanceGRPO while cutting per-iteration training time by roughly 55%. A hybrid variant, BranchGRPO-Mix, trains 4.7x faster than DanceGRPO without degrading alignment. On WanX video generation, BranchGRPO achieves higher Video-Align scores and produces sharper, more temporally consistent frames than DanceGRPO.
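
To make the reward-fusion and depth-specific advantage idea concrete, the following is a minimal sketch, not the authors' implementation: it assumes a rollout tree where only leaves carry terminal rewards, fuses rewards upward by averaging over children, and normalizes fused rewards within each depth (a GRPO-style group baseline) to obtain dense step-level advantages. The `Node` structure, the mean-fusion rule, and the z-score normalization are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import statistics

@dataclass
class Node:
    depth: int
    children: List["Node"] = field(default_factory=list)
    reward: Optional[float] = None      # set only at leaves (terminal reward)
    advantage: Optional[float] = None   # filled in by depthwise_advantages

def fuse_rewards(node: Node) -> float:
    """Fuse sparse leaf rewards upward: an internal node's reward is the
    mean of its children's fused rewards (assumed fusion rule)."""
    if not node.children:
        assert node.reward is not None, "leaf must carry a terminal reward"
        return node.reward
    node.reward = statistics.mean(fuse_rewards(c) for c in node.children)
    return node.reward

def depthwise_advantages(root: Node) -> None:
    """Normalize fused rewards within each depth, yielding a dense
    advantage signal for every intermediate denoising step."""
    by_depth: Dict[int, List[Node]] = {}
    stack = [root]
    while stack:
        n = stack.pop()
        by_depth.setdefault(n.depth, []).append(n)
        stack.extend(n.children)
    for nodes in by_depth.values():
        rewards = [n.reward for n in nodes]
        mu = statistics.mean(rewards)
        sigma = statistics.pstdev(rewards) or 1.0  # avoid division by zero
        for n in nodes:
            n.advantage = (n.reward - mu) / sigma

# Example: a root that branches into two paths, each ending in two leaves.
leaves = [Node(depth=2, reward=r) for r in (0.9, 0.4, 0.7, 0.1)]
mid = [Node(depth=1, children=leaves[:2]), Node(depth=1, children=leaves[2:])]
root = Node(depth=0, children=mid)
fuse_rewards(root)
depthwise_advantages(root)
```

In this sketch, every node at a given depth is compared against its depth-level group, so intermediate steps receive non-zero advantages even though only terminal samples are scored by the reward model.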