Author
Listed:
- Dhouha Grissa
- Ditlev Nytoft Rasmussen
- Aleksander Krag
- Søren Brunak
- Lars Juhl Jensen
Abstract
Alcohol-related liver disease (ALD) is the cause of more than half of all liver-related deaths. Sustained excess drinking causes fatty liver and alcohol-related steatohepatitis, which may progress to alcoholic liver fibrosis (ALF) and eventually to alcohol-related liver cirrhosis (ALC). Unfortunately, it is difficult to identify patients with early-stage ALD, as they are largely asymptomatic. Consequently, the majority of ALD patients are only diagnosed once ALD has reached decompensated cirrhosis, a symptomatic phase marked by the development of complications such as bleeding and ascites. The main goal of this study is to discover relevant upstream diagnoses that help to understand the development of ALD, and to highlight meaningful downstream diagnoses that represent its progression to liver failure. Here, we use data from the Danish health registries covering the entire population of Denmark over nineteen years (1996–2014) to examine whether it is possible to identify patients likely to develop ALF or ALC based on their past medical history. To this end, we explore a knowledge discovery approach, using high-dimensional statistical and machine learning techniques to extract and analyze data from the Danish National Patient Registry. Consistent with the late diagnosis of ALD, we find that ALC is the most common form of ALD in the registry data and that ALC patients have a strong over-representation of diagnoses associated with liver dysfunction. By contrast, we identify a small number of patients diagnosed with ALF who appear to be much less sick than those with ALC. We perform a matched case–control study using the group of patients with ALC as cases and matched non-ALD patients as controls. Machine learning models (SVM, RF, LightGBM and NaiveBayes) trained and tested on the set of ALC patients achieve high classification performance (AUC = 0.89). When the same trained models are tested on the small set of ALF patients, their performance unsurprisingly drops substantially (AUC = 0.67 for NaiveBayes). The statistical and machine learning results underscore small groups of upstream and downstream comorbidities that accurately detect ALC patients and show promise in the prediction of ALF. Some of these groups are conditions caused either by alcohol or by malnutrition associated with alcohol overuse. Others are comorbidities related either to trauma and lifestyle or to complications of cirrhosis, such as oesophageal varices. Our findings highlight the potential of this approach to uncover knowledge in registry data related to ALD.
Author summary
Alcoholic liver disease (ALD) is one of the most common chronic liver diseases worldwide. It progresses from fatty liver to alcoholic liver fibrosis and then to cirrhosis. Unfortunately, people with early-stage ALD have almost no symptoms, so most patients are only discovered when it is already too late. We have thus worked on finding an effective way to detect ALD at an early stage by searching for signs of alcohol overuse among patients. To this end, we analyzed big data from the Danish National Patient Registry, which covers the whole population of Denmark. We found that only 499 patients were diagnosed with alcoholic liver fibrosis and that alcoholic liver cirrhosis is the most frequent form of ALD in registry data. We identified typical diagnoses seen in patients before they developed cirrhosis, many of which relate to the liver not functioning properly. However, we found that patients with fibrosis were much harder to identify based on their past medical records, since most of them show very few signs of being sick.
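As a reading aid for the AUC figures quoted in the abstract: AUC can be interpreted as the probability that a classifier assigns a higher score to a randomly chosen case than to a randomly chosen control, with 0.5 corresponding to chance. The following is a minimal sketch of that definition in plain Python. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the authors' code or registry data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy, made-up scores (NOT registry data): 1 = case, 0 = matched control.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 case-control pairs ranked correctly
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; on large cohorts a rank-based implementation would typically be used instead.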
Suggested Citation
Dhouha Grissa & Ditlev Nytoft Rasmussen & Aleksander Krag & Søren Brunak & Lars Juhl Jensen, 2020.
"Alcoholic liver disease: A registry view on comorbidities and disease prediction,"
PLOS Computational Biology, Public Library of Science, vol. 16(9), pages 1-19, September.
Handle:
RePEc:plo:pcbi00:1008244
DOI: 10.1371/journal.pcbi.1008244