This paper focuses on agentic workflows, in which multiple AI agents collaborate to perform complex tasks such as reasoning and planning. The performance of these workflows depends heavily on the prompts that define each agent's role, and poorly specified prompts can degrade overall system performance. To address this issue, we present ProRefine, a novel inference-time prompt optimization method. ProRefine dynamically improves prompts for multi-step reasoning tasks by generating and applying textual feedback through a loop of LLM agents, without requiring additional training or ground-truth labels. On five mathematical reasoning benchmarks, ProRefine outperforms a zero-shot Chain-of-Thought baseline by 3-37%, and it also enables smaller models to approach the performance of larger ones. These results suggest its potential for building cost-effective yet powerful hybrid AI systems and for broadening access to high-performance AI.
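To make the feedback loop described above concrete, the following is a minimal sketch of one plausible realization: an actor agent attempts the task, a critic agent produces textual feedback, and an optimizer agent rewrites the prompt using that feedback. The function name `prorefine_style_loop`, the `llm` callable, `max_iters`, and the critic/optimizer prompt wording are illustrative assumptions, not the paper's actual implementation.

```python
from typing import Callable


def prorefine_style_loop(
    task: str,
    initial_prompt: str,
    llm: Callable[[str], str],
    max_iters: int = 3,
) -> str:
    """Sketch of an inference-time prompt-refinement loop.

    `llm` is any text-in/text-out model call (e.g., a thin wrapper around an
    LLM API). No training or ground-truth labels are used; only natural-
    language feedback guides the prompt updates.
    """
    prompt = initial_prompt
    for _ in range(max_iters):
        # Actor: attempt the task with the current prompt.
        answer = llm(f"{prompt}\n\nTask: {task}\n\nAnswer step by step.")

        # Critic: produce textual feedback on the attempt.
        feedback = llm(
            "You are a critic. Point out reasoning errors or missing steps "
            f"in the answer below, or reply 'LOOKS GOOD'.\n\nTask: {task}\n"
            f"Answer: {answer}"
        )
        if "LOOKS GOOD" in feedback.upper():
            break

        # Optimizer: rewrite the prompt so the next attempt avoids the issues.
        prompt = llm(
            "Rewrite the following prompt so that a model following it would "
            f"avoid these issues.\n\nPrompt: {prompt}\nFeedback: {feedback}\n"
            "Return only the improved prompt."
        )
    return prompt
```

In this sketch the same `llm` callable plays all three roles; in practice the critic and optimizer could be a larger model guiding a smaller actor, which is one way the reported small-to-large performance gains could be realized.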