To address the lack of a robust medical machine translation (MT) evaluation dataset for low-resource languages, this paper introduces OpenWHO, a document-level parallel corpus of 2,978 documents and 26,824 sentences extracted from the World Health Organization (WHO) e-learning platform. OpenWHO spans more than 20 diverse languages, nine of which are low-resource. Leveraging this new resource, we evaluate state-of-the-art large language models (LLMs) against existing MT models. Our results show that LLMs consistently outperform existing MT models, particularly on the low-resource language test set, where Gemini 2.5 Flash outperforms NLLB-54B by 4.79 ChrF points. Furthermore, we investigate how the use of context affects LLM translation accuracy, demonstrating that the benefits of document-level translation are particularly pronounced in specialized domains such as healthcare. Finally, we release the OpenWHO corpus to encourage further research on MT for low-resource languages in healthcare.
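
For concreteness, the ChrF comparison reported above is a corpus-level character n-gram F-score. The following is a minimal sketch of how such a score can be computed with the sacrebleu library; the hypothesis and reference sentences here are placeholders for illustration, not OpenWHO data.

```python
import sacrebleu  # pip install sacrebleu

# Placeholder system outputs and references (not from the OpenWHO test set).
hypotheses = [
    "Wash your hands with soap and clean water.",
    "Vaccines are safe and effective.",
]
# sacrebleu expects a list of reference streams, each aligned to the hypotheses.
references = [[
    "Wash your hands with soap and clean running water.",
    "Vaccines are safe and effective.",
]]

# Corpus-level ChrF, the metric used for the system comparison above.
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f"ChrF: {chrf.score:.2f}")
```

A difference between two systems, such as the 4.79-point gap cited above, is simply the difference of their corpus-level ChrF scores on the same test set.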