Author
Abstract
Introduction US claims data contain medical data on large heterogeneous populations and are excellent sources for medical research. Some claims data do not contain complete death records, limiting their use for mortality or mortality-related studies. A model to predict whether a patient died at the end of the follow-up time (referred to as the end of observation) is needed to enable mortality-related studies. Objective The objective of this study was to develop a patient-level model to predict whether the end of observation was due to death in US claims data. Methods We used a claims dataset with full death records, Optum© De-Identified Clinformatics® Data-Mart-Database—Date of Death mapped to the Observational Medical Outcome Partnership common data model, to develop a model that classifies the end of observations into death or non-death. A regularized logistic regression was trained using 88,514 predictors (recorded within the prior 365 or 30 days) and externally validated by applying the model to three US claims datasets. Results Approximately 25 in 1000 end of observations in Optum are due to death. The Discriminating End of observation into Alive and Dead (DEAD) model obtained an area under the receiver operating characteristic curve of 0.986. When defining death as a predicted risk of > 0.5, only 2% of the end of observations were predicted to be due to death and the model obtained a sensitivity of 62% and a positive predictive value of 74.8%. The external validation showed the model was transportable, with area under the receiver operating characteristic curves ranging between 0.951 and 0.995 across the US claims databases. Conclusions US claims data often lack complete death records. The DEAD model can be used to impute death at various sensitivity, specificity, or positive predictive values depending on the use of the model. The DEAD model can be readily applied to any observational healthcare database mapped to the Observational Medical Outcome Partnership common data model and is available from https://github.com/OHDSI/StudyProtocolSandbox/tree/master/DeadModel .
Suggested Citation
Jenna M. Reps & Peter R. Rijnbeek & Patrick B. Ryan, 2019.
"Identifying the DEAD: Development and Validation of a Patient-Level Model to Predict Death Status in Population-Level Claims Data,"
Drug Safety, Springer, vol. 42(11), pages 1377-1386, November.
Handle:
RePEc:spr:drugsa:v:42:y:2019:i:11:d:10.1007_s40264-019-00827-0
DOI: 10.1007/s40264-019-00827-0
Download full text from publisher
As the access to this document is restricted, you may want to search for a different version of it.
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:spr:drugsa:v:42:y:2019:i:11:d:10.1007_s40264-019-00827-0. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help adding them by using this form .
If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing (email available below). General contact details of provider: http://www.springer.com/economics/journal/40264 .
Please note that corrections may take a couple of weeks to filter through
the various RePEc services.