This paper presents a novel methodology based on large language models (LLMs) for automating dynamic programming (DP) modeling. Traditional DP modeling demands expert knowledge, and LLMs offer the potential to automate this process. However, the stochastic nature of DP problems and the scarcity of training data make direct application of existing LLM-based approaches difficult. We therefore introduce DP-Bench, a benchmark covering a diverse range of DP problems, and present DPLM, a specialized 7-billion-parameter model. DPLM expands its training data beyond a limited set of initial examples via DualReflect, a synthetic data generation pipeline that combines forward generation for diversity with backward generation for reliability. Our results show that backward generation is more effective in low-data regimes, while forward generation becomes more effective as the data scale grows. DPLM achieves performance comparable to state-of-the-art LLMs such as OpenAI's o1 and DeepSeek-R1, and surpasses them on difficult problems.