Author
Listed:
- Lavender Yao Jiang
(NYU Langone Health
New York University)
- Xujin Chris Liu
(NYU Langone Health
Tandon School of Engineering)
- Nima Pour Nejatian
(NVIDIA)
- Mustafa Nasir-Moin
(NYU Langone Health)
- Duo Wang
(NYU Langone Health)
- Anas Abidin
(NVIDIA)
- Kevin Eaton
(NYU Langone Health)
- Howard Antony Riina
(NYU Langone Health)
- Ilya Laufer
(NYU Langone Health)
- Paawan Punjabi
(NYU Langone Health)
- Madeline Miceli
(NYU Langone Health)
- Nora C. Kim
(NYU Langone Health)
- Cordelia Orillac
(NYU Langone Health)
- Zane Schnurman
(NYU Langone Health)
- Christopher Livia
(NYU Langone Health)
- Hannah Weiss
(NYU Langone Health)
- David Kurland
(NYU Langone Health)
- Sean Neifert
(NYU Langone Health)
- Yosef Dastagirzada
(NYU Langone Health)
- Douglas Kondziolka
(NYU Langone Health)
- Alexander T. M. Cheung
(NYU Langone Health)
- Grace Yang
(NYU Langone Health
New York University)
- Ming Cao
(NYU Langone Health
New York University)
- Mona Flores
(NVIDIA)
- Anthony B. Costa
(NVIDIA)
- Yindalon Aphinyanaphongs
(NYU Langone Health)
- Kyunghyun Cho
(New York University
Prescient Design, Genentech
Canadian Institute for Advanced Research)
- Eric Karl Oermann
(NYU Langone Health
New York University)
Abstract
Physicians make critical time-constrained decisions every day. Clinical predictive models can help physicians and administrators make decisions by forecasting clinical and operational events. Existing structured data-based clinical predictive models have limited use in everyday practice owing to complexity in data processing, as well as model development and deployment [1–3]. Here we show that unstructured clinical notes from the electronic health record can enable the training of clinical language models, which can be used as all-purpose clinical predictive engines with low-resistance development and deployment. Our approach leverages recent advances in natural language processing [4,5] to train a large language model for medical language (NYUTron) and subsequently fine-tune it across a wide range of clinical and operational predictive tasks. We evaluated our approach within our health system for five such tasks: 30-day all-cause readmission prediction, in-hospital mortality prediction, comorbidity index prediction, length of stay prediction, and insurance denial prediction. We show that NYUTron has an area under the curve (AUC) of 78.7–94.9%, with an improvement of 5.36–14.7% in the AUC compared with traditional models. We additionally demonstrate the benefits of pretraining with clinical text, the potential for increasing generalizability to different sites through fine-tuning and the full deployment of our system in a prospective, single-arm trial. These results show the potential for using clinical language models in medicine to read alongside physicians and provide guidance at the point of care.
Suggested Citation
Lavender Yao Jiang & Xujin Chris Liu & Nima Pour Nejatian & Mustafa Nasir-Moin & Duo Wang & Anas Abidin & Kevin Eaton & Howard Antony Riina & Ilya Laufer & Paawan Punjabi & Madeline Miceli & Nora C. K, 2023.
"Health system-scale language models are all-purpose prediction engines,"
Nature, Nature, vol. 619(7969), pages 357-362, July.
Handle:
RePEc:nat:nature:v:619:y:2023:i:7969:d:10.1038_s41586-023-06160-y
DOI: 10.1038/s41586-023-06160-y