Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please cite the source when sharing.

OpenWHO: A Document-Level Parallel Corpus for Health Translation in Low-Resource Languages

Created by
  • Haebom

Author

Raphael Merx, Hanna Suominen, Trevor Cohn, Ekaterina Vylomova

Outline

To address the lack of robust medical machine translation (MT) evaluation datasets for low-resource languages, this paper introduces OpenWHO, a document-level parallel corpus of 2,978 documents and 26,824 sentences extracted from the World Health Organization (WHO) e-learning platform. OpenWHO covers more than 20 languages, nine of which are low-resource. Using this new resource, the authors evaluate state-of-the-art large language models (LLMs) against traditional MT models. The results show that LLMs consistently outperform traditional MT models, particularly on the low-resource test set, where Gemini 2.5 Flash beats NLLB-54B by 4.79 ChrF points. The paper also investigates how context affects LLM translation accuracy, showing that the benefits of document-level translation are especially pronounced in specialized domains such as healthcare. Finally, the authors release the OpenWHO corpus to encourage research on low-resource MT in healthcare.
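For reference, ChrF (the metric behind the 4.79-point gap above) scores a translation by character n-gram overlap with a reference, which makes it more forgiving of morphological variation than word-level BLEU and thus popular for low-resource languages. The paper presumably uses a standard implementation such as sacreBLEU; the sketch below is a simplified sentence-level version for illustration only (uniform averaging over n-gram orders, no smoothing).

```python
from collections import Counter

def chrf(hypothesis: str, reference: str, max_order: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF: average character n-gram F-beta score.

    Illustrative sketch, not the official sacreBLEU implementation.
    """
    # chrF removes whitespace before extracting character n-grams.
    hyp = hypothesis.replace(" ", "")
    ref = reference.replace(" ", "")
    scores = []
    for n in range(1, max_order + 1):
        hyp_ngrams = Counter(hyp[i:i + n] for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(ref[i:i + n] for i in range(len(ref) - n + 1))
        hyp_total = sum(hyp_ngrams.values())
        ref_total = sum(ref_ngrams.values())
        if hyp_total == 0 or ref_total == 0:
            continue  # n-gram order longer than one of the strings
        # Clipped overlap: count each n-gram at most as often as it occurs in both.
        overlap = sum((hyp_ngrams & ref_ngrams).values())
        precision = overlap / hyp_total
        recall = overlap / ref_total
        if precision + recall == 0:
            scores.append(0.0)
            continue
        # F-beta with beta=2 weights recall twice as heavily as precision.
        f = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
        scores.append(f)
    return 100 * sum(scores) / len(scores) if scores else 0.0
```

An exact match scores 100, fully disjoint strings score 0, and a near-miss (e.g. a single dropped character) still earns substantial partial credit, which is why ChrF is considered robust for morphologically rich low-resource languages.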

Takeaways, Limitations

Takeaways:
  • OpenWHO is a new high-quality parallel corpus for medical machine translation in low-resource languages.
  • LLMs are experimentally shown to outperform traditional MT models on low-resource medical translation.
  • The benefits of document-level translation are even greater in specialized domains, especially medicine.
  • Releasing the OpenWHO corpus helps stimulate MT research on low-resource languages in healthcare.
Limitations:
  • The OpenWHO corpus may be relatively small compared to other large corpora.
  • The range of LLMs and traditional MT models evaluated may be limited.
  • In-depth analysis of medical-domain vocabulary and grammatical features may be lacking.
  • The linguistic diversity covered by the OpenWHO corpus may be incomplete.