
Methods to Develop an Electronic Medical Record Phenotype Algorithm to Compare the Risk of Coronary Artery Disease across 3 Chronic Disease Cohorts

Author

Listed:
  • Katherine P Liao
  • Ashwin N Ananthakrishnan
  • Vishesh Kumar
  • Zongqi Xia
  • Andrew Cagan
  • Vivian S Gainer
  • Sergey Goryachev
  • Pei Chen
  • Guergana K Savova
  • Denis Agniel
  • Susanne Churchill
  • Jaeyoung Lee
  • Shawn N Murphy
  • Robert M Plenge
  • Peter Szolovits
  • Isaac Kohane
  • Stanley Y Shaw
  • Elizabeth W Karlson
  • Tianxi Cai

Abstract

Background: Typically, algorithms to classify phenotypes using electronic medical record (EMR) data were developed to perform well in a specific patient population. There is increasing interest in analyses that allow the study of a specific outcome across different diseases. Such a study in the EMR would require an algorithm that can be applied across different patient populations. Our objectives were: (1) to develop an algorithm that would enable the study of coronary artery disease (CAD) across diverse patient populations; and (2) to study the impact of adding narrative data extracted using natural language processing (NLP) to the algorithm. Additionally, we demonstrate how to implement the CAD algorithm to compare risk across 3 chronic diseases in a preliminary study.

Methods and Results: We studied 3 established EMR-based patient cohorts: diabetes mellitus (DM, n = 65,099), inflammatory bowel disease (IBD, n = 10,974), and rheumatoid arthritis (RA, n = 4,453) from two large academic centers. We developed a CAD algorithm using NLP in addition to structured data (e.g., ICD9 codes) in the RA cohort and validated it in the DM and IBD cohorts. The CAD algorithm using NLP in addition to structured data achieved specificity >95% with a positive predictive value (PPV) of 90% in the training (RA) and validation sets (IBD and DM). The addition of NLP data improved the sensitivity for all cohorts, classifying an additional 17% of CAD subjects in IBD and 10% in DM while maintaining a PPV of 90%. The algorithm classified 16,488 DM (26.1%), 457 IBD (4.2%), and 245 RA (5.0%) with CAD. In a cross-sectional analysis, CAD risk was 63% lower in RA and 68% lower in IBD compared to DM (p
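
The specificity, sensitivity, and PPV quoted in the abstract are standard ratios computed by comparing algorithm labels against a gold standard such as chart review. The following is a minimal sketch of those definitions only, not the authors' implementation. The class name, function names, and the counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing algorithm labels with a gold standard."""
    tp: int  # algorithm says CAD, gold standard says CAD
    fp: int  # algorithm says CAD, gold standard says no CAD
    tn: int  # algorithm says no CAD, gold standard says no CAD
    fn: int  # algorithm says no CAD, gold standard says CAD


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true CAD cases that the algorithm classifies as CAD."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-CAD subjects that the algorithm classifies as non-CAD."""
    return c.tn / (c.tn + c.fp)


def ppv(c: ConfusionCounts) -> float:
    """Fraction of subjects classified as CAD who truly have CAD."""
    return c.tp / (c.tp + c.fp)


# Hypothetical validation counts, chosen only for illustration.
example = ConfusionCounts(tp=45, fp=5, tn=140, fn=10)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"specificity = {specificity(example):.3f}")
    print(f"PPV         = {ppv(example):.3f}")
```

With these made-up counts the PPV is 45/50 and the specificity is 140/145, which shows how an algorithm can hold a high PPV and specificity while its sensitivity (here 45/55) leaves room for improvement from added features such as NLP-derived data.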

Suggested Citation

  • Katherine P Liao & Ashwin N Ananthakrishnan & Vishesh Kumar & Zongqi Xia & Andrew Cagan & Vivian S Gainer & Sergey Goryachev & Pei Chen & Guergana K Savova & Denis Agniel & Susanne Churchill & Jaeyoun, 2015. "Methods to Develop an Electronic Medical Record Phenotype Algorithm to Compare the Risk of Coronary Artery Disease across 3 Chronic Disease Cohorts," PLOS ONE, Public Library of Science, vol. 10(8), pages 1-11, August.
  • Handle: RePEc:plo:pone00:0136651
    DOI: 10.1371/journal.pone.0136651

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0136651
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0136651&type=printable
    Download Restriction: no



    Citations



    Cited by:

    1. Chuan Hong & Katherine P. Liao & Tianxi Cai, 2019. "Semi-supervised validation of multiple surrogate outcomes with application to electronic medical records phenotyping," Biometrics, The International Biometric Society, vol. 75(1), pages 78-89, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jessica Gronsbell & Molei Liu & Lu Tian & Tianxi Cai, 2022. "Efficient evaluation of prediction rules in semi-supervised settings under stratified sampling," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(4), pages 1353-1391, September.
    2. Jessica Gronsbell & Jessica Minnier & Sheng Yu & Katherine Liao & Tianxi Cai, 2019. "Automated feature selection of predictors in electronic medical records data," Biometrics, The International Biometric Society, vol. 75(1), pages 268-277, March.
