Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

BioPars: A Pretrained Biomedical Large Language Model for Persian Biomedical Text Mining

Created by
  • Haebom

Authors

Baqer M. Merzah, Tania Taami, Salman Asoudeh, Saeed Mirzaee, Amir Reza Hossein pour, Amir Ali Bengari

Outline

This paper introduces BIOPARS-BENCH, a dataset extracted from over 10,000 scientific papers, textbooks, and medical websites, and BioParsQA, a dataset of 5,231 Persian medical question-answer pairs, to evaluate the potential of large language models (LLMs) in bioinformatics. It proposes BioPars, a new metric that assesses an LLM's ability to acquire domain expertise, interpret and synthesize that knowledge, and present supporting evidence, and it compares ChatGPT, Llama, and Galactica on these abilities. The authors describe BioPars as the first application of an LLM to Persian medical question answering, particularly long-answer generation. On BioParsQA, BioPars achieves a ROUGE-L score of 29.99, a BERTScore of 90.87 (using the MMR method), a MoverScore of 60.43, and a BLEURT score of 50.78, an improvement over GPT-4 1.0. BioPars is under continuous development, and related materials will be made available via the GitHub repository (https://github.com/amirap80/BioPars).
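To make the ROUGE-L figure above more concrete, here is a minimal sketch of how a sentence-level ROUGE-L F-score can be computed from the longest common subsequence (LCS) of a reference answer and a generated answer. It assumes whitespace tokenization and a single reference per answer; the function names `lcs_length` and `rouge_l` are illustrative, and the paper's actual tokenizer, preprocessing, and scoring library are not stated in this summary. Multiplying the result by 100 would give a value on the same apparent scale as the reported 29.99.

```python
def lcs_length(a, b):
    """Return the length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # DP row for the previous token of `a`
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """Sentence-level ROUGE-L F-score in [0, 1].

    Assumption: tokens are obtained by whitespace splitting, which is a
    simplification and may not match the tokenization used in the paper.
    """
    ref = reference.split()
    cand = candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens on the LCS
    recall = lcs / len(ref)      # share of reference tokens on the LCS
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    # Hypothetical reference/generated pair, for illustration only.
    score = rouge_l("the cat sat on the mat", "the cat is on the mat")
    print(f"ROUGE-L F1: {100 * score:.2f}")
```

Because ROUGE-L rewards only in-order token overlap, long free-form answers can score low even when they are correct, which may be one reason the summary also reports embedding-based metrics such as BERTScore, MoverScore, and BLEURT.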

Takeaways, Limitations

Takeaways:
Presents BioPars, a new evaluation metric, to assess the potential of LLMs in bioinformatics.
Applies an LLM to Persian medical Q&A and reports strong performance.
Analyzes the strengths and weaknesses of existing LLMs and points to the need for further fine-tuning to address bioinformatics challenges.
Provides a comprehensive performance evaluation using several metrics (ROUGE-L, BERTScore, MoverScore, BLEURT).
Limitations:
BioPars is specialized for Persian medical Q&A, so generalizability to other languages or fields may be limited.
The evaluated LLMs perform poorly on higher-level, real-world problems and on fine-grained reasoning.
The project is still under development and will require further improvement and expansion.