Author
Listed:
- Juan Manuel Zambrano Chaves
(Microsoft Research; Stanford University)
- Shih-Cheng Huang
(Stanford University)
- Yanbo Xu
(Microsoft Research)
- Hanwen Xu
(University of Washington)
- Naoto Usuyama
(Microsoft Research)
- Sheng Zhang
(Microsoft Research)
- Fei Wang
(University of Southern California)
- Yujia Xie
(Microsoft Research)
- Mahmoud Khademi
(Microsoft Research)
- Ziyi Yang
(Microsoft Research)
- Hany Awadalla
(Microsoft Research)
- Julia Gong
(Microsoft Research)
- Houdong Hu
(Microsoft Research)
- Jianwei Yang
(Microsoft Research)
- Chunyuan Li
(Microsoft Research)
- Jianfeng Gao
(Microsoft Research)
- Yu Gu
(Microsoft Research)
- Cliff Wong
(Microsoft Research)
- Mu Wei
(Microsoft Research)
- Tristan Naumann
(Microsoft Research)
- Muhao Chen
(University of California)
- Matthew P. Lungren
(Microsoft Research; Stanford University; University of California)
- Akshay Chaudhari
(Stanford University)
- Serena Yeung-Levy
(Stanford University)
- Curtis P. Langlotz
(Stanford University)
- Sheng Wang
(University of Washington)
- Hoifung Poon
(Microsoft Research)
Abstract
Large foundation models show promise in biomedicine but face challenges in clinical use due to performance gaps, accessibility, cost, and lack of scalable evaluation. Here we show that open-source small multimodal models can bridge these gaps in radiology by generating free-text findings from chest X-ray images. Our data-centric approach leverages 697K curated radiology image-text pairs to train a specialized, domain-adapted chest X-ray encoder. We integrate this encoder with pre-trained language models via a lightweight adapter that aligns image and text modalities. To enable robust, clinically relevant evaluation, we develop and validate CheXprompt, a GPT-4-based metric for assessing factual accuracy aligned with radiologists’ evaluations. Benchmarked with CheXprompt and other standard factuality metrics, LLaVA-Rad (7B) achieves state-of-the-art performance, outperforming much larger models like GPT-4V and Med-PaLM M (84B). While not immediately ready for real-time clinical deployment, LLaVA-Rad is a scalable, privacy-preserving and cost-effective step towards clinically adaptable multimodal AI for radiology.
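To make the architecture described above concrete, the following is a minimal sketch of a LLaVA-style "lightweight adapter": a small MLP that projects patch embeddings from a domain-adapted chest X-ray encoder into the token-embedding space of a pretrained language model, so image tokens can be prepended to the text prompt. This is an illustrative common instantiation of the approach, not the authors' released code; all class names, dimensions, and token counts below are assumptions.

    # Sketch (PyTorch): project vision features into the LM embedding space.
    import torch
    import torch.nn as nn

    class MultimodalProjector(nn.Module):
        """Two-layer MLP mapping vision-encoder patch features to LM token embeddings."""
        def __init__(self, vision_dim: int = 1024, lm_dim: int = 4096):  # assumed dims
            super().__init__()
            self.proj = nn.Sequential(
                nn.Linear(vision_dim, lm_dim),
                nn.GELU(),
                nn.Linear(lm_dim, lm_dim),
            )

        def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
            # patch_feats: (batch, num_patches, vision_dim)
            return self.proj(patch_feats)  # -> (batch, num_patches, lm_dim)

    # Usage: prepend projected image tokens to the embedded text prompt and
    # feed the combined sequence to the language model for findings generation.
    image_tokens = MultimodalProjector()(torch.randn(1, 196, 1024))  # 196 patches (assumed)
    text_embeds = torch.randn(1, 32, 4096)  # stand-in for embedded prompt tokens
    lm_inputs = torch.cat([image_tokens, text_embeds], dim=1)

Training only this projector (with the encoder and language model largely frozen or lightly finetuned) is what keeps such adapters cheap to train relative to end-to-end multimodal pretraining.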
Suggested Citation
Juan Manuel Zambrano Chaves & Shih-Cheng Huang & Yanbo Xu & Hanwen Xu & Naoto Usuyama & Sheng Zhang & Fei Wang & Yujia Xie & Mahmoud Khademi & Ziyi Yang & Hany Awadalla & Julia Gong & Houdong Hu & Jianwei Yang & Chunyuan Li & Jianfeng Gao & Yu Gu & Cliff Wong & Mu Wei & Tristan Naumann & Muhao Chen & Matthew P. Lungren & Akshay Chaudhari & Serena Yeung-Levy & Curtis P. Langlotz & Sheng Wang & Hoifung Poon, 2025.
"A clinically accessible small multimodal radiology model and evaluation metric for chest X-ray findings,"
Nature Communications, Nature, vol. 16(1), pages 1-15, December.
Handle:
RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-58344-x
DOI: 10.1038/s41467-025-58344-x
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-58344-x. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to register here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help add them by using this form.
If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing (email available below). General contact details of provider: http://www.nature.com.
Please note that corrections may take a couple of weeks to filter through the various RePEc services.