Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions. When sharing, please cite the source.

VoxelPrompt: A Vision Agent for End-to-End Medical Image Analysis

Created by
  • Haebom

Authors

Andrew Hoopes, Neel Dey, Victor Ion Butoi, John V. Guttag, Adrian V. Dalca


Overview

VoxelPrompt is an end-to-end image analysis agent that takes arbitrary volumetric medical images and a natural language prompt as input and performs free-form radiological tasks. It integrates a language model that generates executable code, which in turn calls a jointly trained adaptive vision network. The generated code also carries out the analysis steps needed to reach practical quantitative goals, such as measuring tumor growth. In this way, VoxelPrompt automates analyses that medical professionals currently perform by combining multiple specialized vision and statistical tools. The authors evaluate VoxelPrompt on a variety of neuroimaging tasks and show that it can identify hundreds of anatomical and pathological features, measure complex morphological characteristics, and perform open-language analysis of lesion characteristics. VoxelPrompt accomplishes these tasks with accuracy comparable to specialized single-task image analysis models, while supporting a wide range of configurable biomedical workflows.
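To make the "tumor growth measurement" goal concrete, here is a minimal sketch of the kind of quantitative step such generated code might perform once segmentation masks are available. It is not VoxelPrompt's actual code or API: the function names `lesion_volume_mm3` and `volume_change`, the voxel spacing, and the synthetic masks are all assumptions made for illustration, and the masks simply stand in for whatever a vision network would produce.

```python
from typing import Sequence, Tuple

import numpy as np


def lesion_volume_mm3(mask: np.ndarray, voxel_size_mm: Sequence[float]) -> float:
    """Volume of a binary mask in cubic millimetres (hypothetical helper)."""
    voxel_volume = float(np.prod(voxel_size_mm))  # mm^3 per voxel
    return float(np.count_nonzero(mask)) * voxel_volume


def volume_change(
    mask_before: np.ndarray,
    mask_after: np.ndarray,
    voxel_size_mm: Sequence[float],
) -> Tuple[float, float]:
    """Absolute (mm^3) and relative (%) volume change between two visits."""
    before = lesion_volume_mm3(mask_before, voxel_size_mm)
    after = lesion_volume_mm3(mask_after, voxel_size_mm)
    if before == 0:
        raise ValueError("baseline mask is empty; relative change is undefined")
    return after - before, 100.0 * (after - before) / before


# Synthetic masks standing in for segmentation output (assumption, not real data).
before = np.zeros((10, 10, 10), dtype=bool)
before[2:4, 2:4, 2:4] = True   # 8 voxels
after = np.zeros((10, 10, 10), dtype=bool)
after[2:5, 2:4, 2:4] = True    # 12 voxels

abs_change, pct_change = volume_change(before, after, voxel_size_mm=(1.0, 1.0, 2.0))
print(f"change: {abs_change:.1f} mm^3 ({pct_change:.1f}%)")
```

The sketch assumes both masks share the same voxel spacing; in practice, scans from different visits would usually need to be resampled or registered before their volumes are compared.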

Takeaways, Limitations

Achieves accuracy comparable to specialized single-task models across a variety of neuroimaging tasks.
Facilitates diverse, configurable biomedical workflows.
Provides an end-to-end solution for free-form radiological tasks.
Integrates executable code generation with an adaptive vision network.
(Specific limitations are not stated in the paper.)