Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Grounding DINO-US-SAM: Text-Prompted Multi-Organ Segmentation in Ultrasound with LoRA-Tuned Vision-Language Models

Created by
  • Haebom

Author

Hamza Rasaee, Taha Koleilat, Hassan Rivaz

Outline

This paper notes that accurate and generalizable object segmentation in ultrasound images is challenging due to anatomical variation, diverse imaging protocols, and limited annotated data. To address this, the authors propose a prompt-driven vision-language model (VLM) that integrates Grounding DINO with SAM2. Grounding DINO is adapted to the ultrasound domain with Low-Rank Adaptation (LoRA), and is fine-tuned and validated on 15 of 18 publicly available ultrasound datasets spanning breast, thyroid, liver, prostate, kidney, and paraspinal muscle; the remaining three datasets are held out for testing to evaluate performance on unseen distributions. Experiments show that the proposed method outperforms state-of-the-art segmentation methods, including UniverSeg, MedSAM, MedCLIP-SAM, BiomedParse, and SAMUS, on most of the evaluated datasets, and maintains robust performance on the unseen datasets without additional fine-tuning. These results highlight the promise of VLMs for scalable and robust ultrasound image analysis and suggest that they can reduce the reliance on large-scale, organ-specific annotated data. The code will be published at code.sonography.ai after acceptance.
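As a rough illustration of the detect-then-segment pipeline described above, the sketch below chains a LoRA-adapted Grounding DINO (via Hugging Face transformers and peft) with a SAM 2 image predictor (Meta's sam2 package). This is not the authors' code: the checkpoint names, LoRA rank and target modules, thresholds, and the text prompt are assumptions for illustration only.

```python
# Minimal sketch (not the authors' implementation): a text prompt drives a
# LoRA-adapted Grounding DINO detector, whose boxes then prompt SAM 2 for masks.
import numpy as np
import torch
from PIL import Image
from peft import LoraConfig, get_peft_model
from transformers import AutoProcessor, GroundingDinoForObjectDetection
from sam2.sam2_image_predictor import SAM2ImagePredictor

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1) Grounding DINO wrapped with LoRA adapters.
#    Rank, alpha, and target_modules are illustrative guesses, not the paper's values.
processor = AutoProcessor.from_pretrained("IDEA-Research/grounding-dino-tiny")
dino = GroundingDinoForObjectDetection.from_pretrained("IDEA-Research/grounding-dino-tiny")
lora_cfg = LoraConfig(r=16, lora_alpha=32, target_modules=["query", "value"])
dino = get_peft_model(dino, lora_cfg).to(device).eval()
# ... LoRA fine-tuning on ultrasound detection data would happen here ...

# 2) Text-prompted detection on one ultrasound frame (file name and prompt are examples).
image = Image.open("ultrasound_frame.png").convert("RGB")
inputs = processor(images=image, text="breast tumor.", return_tensors="pt").to(device)
with torch.no_grad():
    outputs = dino(**inputs)
detections = processor.post_process_grounded_object_detection(
    outputs, inputs.input_ids,
    box_threshold=0.3, text_threshold=0.25,
    target_sizes=[image.size[::-1]],
)
boxes = detections[0]["boxes"].cpu().numpy()  # (N, 4) boxes in xyxy pixel coordinates

# 3) SAM 2 converts each detected box into a segmentation mask.
predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")
predictor.set_image(np.array(image))
masks, scores, _ = predictor.predict(box=boxes, multimask_output=False)
```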

Takeaways, Limitations

Takeaways:
Improved object segmentation across diverse ultrasound organs using a VLM that integrates Grounding DINO and SAM2.
Performance surpassing state-of-the-art segmentation methods on most evaluated datasets.
Robust performance on unseen datasets without additional fine-tuning.
Reduced reliance on large-scale, organ-specific annotation data.
Demonstrates the potential of VLMs for scalable and robust ultrasound image analysis.
Limitations:
Lack of a detailed description of the types and distributions of the datasets used.
Lack of detailed information on the LoRA fine-tuning process and hyperparameters.
The code is scheduled to be published on code.sonography.ai, but is not yet public.
Further verification of the generalizability of the experimental results is needed.