This paper proposes an agentic retrieval-augmented generation (RAG) framework for radiology question answering (QA). To overcome the limitations of conventional single-step retrieval, the framework enables a large language model (LLM) to autonomously decompose radiology questions, iteratively retrieve targeted clinical evidence from Radiopaedia.org, and dynamically synthesize evidence-grounded responses. The framework was evaluated with twenty-five LLMs spanning diverse architectures, parameter scales (0.5B to >670B), and training paradigms (general-purpose, reasoning-optimized, and clinically fine-tuned) on 104 expert-curated radiology questions from the RSNA-RadioQA and ExtendedQA datasets and 65 real radiology examination questions. The results show that agentic retrieval significantly improves mean diagnostic accuracy over both zero-shot prompting and conventional online RAG, with the largest gains observed for small models. It also significantly strengthens factual grounding by reducing hallucinations and retrieving clinically relevant context. These benefits extended to clinically fine-tuned models as well. All datasets, code, and the full agentic framework are publicly available.
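Conceptually, the agentic loop alternates between sub-query generation, targeted retrieval, and a sufficiency check before final synthesis. The following is a minimal Python sketch of that loop under stated assumptions, not the authors' implementation: `call_llm` and `search_radiopaedia` are hypothetical stand-ins for an arbitrary chat-completion model and a search backend over Radiopaedia.org articles.

```python
# Minimal sketch of the agentic retrieval loop described in the abstract.
# Both helpers below are hypothetical placeholders, not part of the paper's
# released code: replace them with a real LLM call and a real retriever.

def call_llm(prompt: str) -> str:
    """Hypothetical LLM wrapper; substitute any chat-completion backend."""
    raise NotImplementedError

def search_radiopaedia(query: str, k: int = 3) -> list[str]:
    """Hypothetical retriever returning k relevant Radiopaedia passages."""
    raise NotImplementedError

def agentic_rag(question: str, max_steps: int = 4) -> str:
    evidence: list[str] = []

    # Step 1: the LLM decomposes the question into targeted sub-queries.
    sub_queries = call_llm(
        "Decompose this radiology question into focused search queries, "
        f"one per line:\n{question}"
    ).splitlines()

    for query in sub_queries[:max_steps]:
        # Step 2: iteratively retrieve clinical evidence for each sub-query.
        evidence.extend(search_radiopaedia(query.strip()))

        # Step 3: let the model judge whether the evidence now suffices,
        # stopping early instead of retrieving for every sub-query.
        verdict = call_llm(
            "Given the evidence below, answer SUFFICIENT or INSUFFICIENT "
            "for answering the question.\n" + "\n".join(evidence)
        )
        if "SUFFICIENT" in verdict.upper():
            break

    # Step 4: synthesize an evidence-grounded answer from what was retrieved.
    return call_llm(
        f"Question: {question}\nEvidence:\n" + "\n".join(evidence)
        + "\nAnswer using only the evidence above, citing its sources."
    )
```

The key contrast with conventional single-step RAG is that retrieval here is model-driven and iterative: the LLM itself decides what to search for and when the accumulated evidence is sufficient, rather than answering from one fixed retrieval pass.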