This paper presents a method for using differentially private (DP) synthetic data in privacy-conscious federated learning (DP-FL). Existing DP synthetic data generation algorithms require careful prompt engineering based on public information and/or iterative private client feedback. We propose POPri, an algorithm that treats the private client feedback collected by existing methods as a reinforcement learning (RL) reward and fine-tunes an LLM with a policy optimization algorithm (e.g., DPO) to generate high-quality DP synthetic data. We evaluate POPri on LargeFedBench, a new federated text benchmark, and find that it substantially improves the utility of DP synthetic data over existing methods, closing the gap between fully private and non-private settings by up to 58%. The code and data are available on GitHub.
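To make the feedback-as-reward idea concrete, the sketch below shows one plausible way DP-aggregated client scores on synthetic candidates could be turned into chosen/rejected pairs and scored with the standard DPO objective. This is a minimal illustration, not the authors' implementation: the names `FeedbackItem`, `build_preference_pairs`, and `dpo_loss`, and the example numbers, are hypothetical.

```python
import math
from dataclasses import dataclass

# Hypothetical sketch: turning DP-aggregated client feedback into DPO
# preference pairs. Not the POPri reference implementation.

@dataclass
class FeedbackItem:
    prompt_id: str
    sample: str    # synthetic candidate generated by the LLM
    score: float   # DP-aggregated client feedback (higher = better)

def build_preference_pairs(items):
    """Group candidates by prompt and pair the highest-scored against the lowest."""
    by_prompt = {}
    for it in items:
        by_prompt.setdefault(it.prompt_id, []).append(it)
    pairs = []
    for prompt_id, cands in by_prompt.items():
        if len(cands) < 2:
            continue
        cands.sort(key=lambda c: c.score, reverse=True)
        # (prompt, chosen, rejected)
        pairs.append((prompt_id, cands[0].sample, cands[-1].sample))
    return pairs

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO objective for one pair:
    -log sigmoid(beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l)))."""
    margin = beta * ((logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

if __name__ == "__main__":
    feedback = [
        FeedbackItem("p0", "candidate A", 0.81),
        FeedbackItem("p0", "candidate B", 0.12),
        FeedbackItem("p1", "candidate C", 0.55),
        FeedbackItem("p1", "candidate D", 0.40),
    ]
    for prompt_id, chosen, rejected in build_preference_pairs(feedback):
        print(prompt_id, "chosen:", chosen, "| rejected:", rejected)
    # Illustrative log-probabilities for a single pair under the policy and
    # a frozen reference model.
    print("loss:", dpo_loss(-10.0, -12.0, -10.5, -11.5))
```

In this reading, each round of private client feedback only needs to rank or score synthetic candidates; the resulting pairs drive a standard preference-optimization fine-tuning step on the generator LLM.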