
Does artificial intelligence dream of radiology?

Haebom
The title is an homage to 'Do Androids Dream of Electric Sheep?'. Ever since GPT-4V was introduced, we've started to see some fascinating examples of what it can do with images. However, because image inputs consume an enormous number of tokens, cost will likely limit their use to high value-added services for now, and the field that naturally comes to mind is radiology.
Put simply, radiology is the field that diagnoses conditions by imaging part or all of the body with techniques such as CT, MRI, and X-ray. It is commonly used where direct examination is difficult, for example in health screenings, orthopedics, and obstetrics and gynecology.
Additionally, according to a Microsoft study, <Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine>, the general-purpose model GPT-4 demonstrated expert-level performance in the medical field, surpassing even models specialized for medical applications. The study found that GPT-4 excelled across a variety of medical problem-solving benchmarks, with notably better results than existing models on tasks requiring medical knowledge.
What's impressive is that GPT-4 was reported to achieve top performance using a prompting technique called 'Medprompt', without any fine-tuning. With it, GPT-4 surpassed 90% accuracy on the MedQA dataset for the first time and reduced the error rate on that dataset by 27% compared to the best results from specialist models such as Google's Med-PaLM 2.
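The 27% figure is a relative reduction in error rate, not a 27-point jump in accuracy, and the difference is easy to misread. The short sketch below works through the arithmetic. The baseline accuracy of 86.5% is an assumption chosen for illustration and is not taken from this post; the function name is likewise hypothetical.

```python
def accuracy_after_relative_error_reduction(baseline_accuracy: float,
                                            reduction: float) -> float:
    """Return the accuracy after cutting the error rate by a relative fraction."""
    baseline_error = 1.0 - baseline_accuracy        # share of questions answered wrong
    new_error = baseline_error * (1.0 - reduction)  # e.g. 27% fewer wrong answers
    return 1.0 - new_error


# Assumed baseline accuracy of 86.5% (illustrative, not stated in this post).
new_accuracy = accuracy_after_relative_error_reduction(0.865, 0.27)
print(f"{new_accuracy:.3f}")  # 0.901
```

Under that assumption, a 13.5% error rate shrinks to a little under 10%, which is how a 27% relative reduction lines up with crossing the 90% accuracy mark.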
Many AI experts believe that domain-focused fine-tuning is needed to get general-purpose foundation models to perform well in a specific field. But fine-tuning is costly: it requires professionally labeled datasets, domain experts, and compute resources for parameter updates. That makes it a tough challenge, especially for small and medium-sized businesses.
This research highlights the value of digging deeper into how far prompting alone can take a general model toward expert-level performance. What's even more interesting is that the prompting techniques shown here also proved useful across a wide range of professional qualification exams, without relying on domain expertise or expert-curated content.
💡
Simply put, fine-tuning is expensive, but you can achieve solid performance with prompting alone. In other words, if the base model is good enough, well-crafted prompts can bring it to expert-level performance in a specific domain without any fine-tuning.
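As one illustration of what "prompting alone" can mean, the Medprompt paper describes, among other components, a choice-shuffling ensemble: the same multiple-choice question is asked several times with the answer options in a different order, and the most frequent answer wins. The sketch below is a minimal version of that idea, not the authors' implementation. `ask_model` is a hypothetical callable standing in for a real model call, and `fake_model` is a toy stand-in so the example runs without any API.

```python
import random
from collections import Counter
from typing import Callable, Sequence


def shuffled_vote(
    question: str,
    choices: Sequence[str],
    ask_model: Callable[[str, Sequence[str]], int],
    rounds: int = 5,
    seed: int = 0,
) -> str:
    """Ask the same question with shuffled choices and return the majority answer."""
    rng = random.Random(seed)
    votes: Counter = Counter()
    for _ in range(rounds):
        shuffled = list(choices)
        rng.shuffle(shuffled)
        picked_index = ask_model(question, shuffled)  # index into the shuffled list
        votes[shuffled[picked_index]] += 1            # count by text, not by position
    return votes.most_common(1)[0][0]


# Hypothetical stand-in for a model: picks the option that mentions "fracture".
def fake_model(question: str, shuffled: Sequence[str]) -> int:
    for i, choice in enumerate(shuffled):
        if "fracture" in choice:
            return i
    return 0


answer = shuffled_vote(
    "Which finding best explains a cortical break described in the report?",
    ["pneumonia", "rib fracture", "normal study", "pleural effusion"],
    fake_model,
)
print(answer)  # rib fracture
```

Counting votes by option text rather than by position is the point of the design: a model that tends to favor, say, the first option gets its positional bias averaged out across shuffles, and none of this requires touching the model's weights.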
The recently released research suggests that general-purpose models like GPT-4 can play the role of an expert in specific fields. This opens up new possibilities, allowing even small businesses or organizations with limited resources to leverage advanced AI capabilities. AI technology will continue to develop, and it is expected to bring innovative changes across many industries.
미미공주
Wow, this is really interesting. It makes me think I should focus on prompt research too!