Learning causal networks with latent variables from multivariate information in genomic data

Author

Listed:
  • Louis Verny
  • Nadir Sella
  • Séverine Affeldt
  • Param Priya Singh
  • Hervé Isambert

Abstract

Learning causal networks from large-scale genomic data remains challenging in the absence of time series or controlled perturbation experiments. We report an information-theoretic method which learns a large class of causal or non-causal graphical models from purely observational data, while including the effects of unobserved latent variables, commonly found in many genomic datasets. Starting from a complete graph, the method iteratively removes dispensable edges, by uncovering significant information contributions from indirect paths, and assesses edge-specific confidences from randomization of available data. The remaining edges are then oriented based on the signature of causality in observational data. The approach and associated algorithm, miic, outperform earlier methods on a broad range of benchmark networks. Causal network reconstructions are presented at different biological size and time scales, from gene regulation in single cells to whole genome duplication in tumor development as well as long term evolution of vertebrates. Miic is publicly available at https://github.com/miicTeam/MIIC.

Author summary: The reconstruction of causal networks from genomic data is an important but challenging problem. Predicting key regulatory interactions or genomic alterations at the origin of human diseases can guide experimental investigation and ultimately inspire innovative therapy. However, causal relationships are difficult to establish without the possibility to directly perturb the organisms’ genome for ethical or practical reasons. Besides, unmeasured (latent) variables may be hidden in many genomic datasets and lead to spurious causal relationships between observed variables. We propose in this paper an efficient computational approach, miic, that overcomes these limitations and learns causal networks from non-perturbative (observational) data in the presence of latent variables. In addition, we assess the confidence of each predicted interaction and demonstrate the enhanced robustness and accuracy of miic compared to alternative existing methods. This approach can be applied to a wide range of datasets and provides new biological insights on regulatory networks from single cell expression data or genomic alterations during tumor development. Miic is implemented in an R package freely available to the scientific community under a General Public License.
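
The two information-theoretic ideas named in the abstract (an edge becoming dispensable once an indirect path accounts for its information, and a signature in observational data that allows orientation) can be illustrated on toy discrete data. The following is a minimal sketch, not the miic implementation: the function names are hypothetical, the entropies are plain plug-in estimates with no finite-size correction or confidence assessment, and the sign convention I(X;Y;Z) = I(X;Y) - I(X;Y|Z) is an assumption made here for illustration.

```python
from collections import Counter
from math import log2


def entropy(samples, idx):
    """Empirical Shannon entropy (in bits) of the columns listed in idx."""
    counts = Counter(tuple(row[i] for i in idx) for row in samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def mutual_info(samples, x, y, cond=()):
    """Empirical I(X;Y|cond) = H(X,cond) + H(Y,cond) - H(X,Y,cond) - H(cond)."""
    cond = tuple(cond)
    return (entropy(samples, (x,) + cond) + entropy(samples, (y,) + cond)
            - entropy(samples, (x, y) + cond) - entropy(samples, cond))


def three_point_info(samples, x, y, z):
    """I(X;Y;Z) = I(X;Y) - I(X;Y|Z), the sign convention assumed in this sketch."""
    return mutual_info(samples, x, y) - mutual_info(samples, x, y, (z,))


# Columns are (X, Y, Z) in both toy datasets.
# Chain X -> Z -> Y with Z = X and Y = Z: Z carries all information shared by X and Y.
chain = [(0, 0, 0), (1, 1, 1)]
# Collider X -> Z <- Y with Z = X XOR Y, X and Y independent uniform bits.
collider = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

# In the chain, I(X;Y) is 1 bit but I(X;Y|Z) is 0, so a direct X-Y edge is dispensable.
chain_marginal = mutual_info(chain, 0, 1)
chain_conditional = mutual_info(chain, 0, 1, (2,))

# In the collider, X and Y are independent but become dependent given Z,
# so the three-point information is negative.
collider_three_point = three_point_info(collider, 0, 1, 2)
chain_three_point = three_point_info(chain, 0, 1, 2)
```

Under these assumptions, a positive three-point information means that conditioning on Z explains away part of the X-Y dependence, while a negative value means that conditioning on Z creates dependence between X and Y, which is the pattern expected for a collider X -> Z <- Y.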

Suggested Citation

  • Louis Verny & Nadir Sella & Séverine Affeldt & Param Priya Singh & Hervé Isambert, 2017. "Learning causal networks with latent variables from multivariate information in genomic data," PLOS Computational Biology, Public Library of Science, vol. 13(10), pages 1-25, October.
  • Handle: RePEc:plo:pcbi00:1005662
    DOI: 10.1371/journal.pcbi.1005662

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005662
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1005662&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1005662?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Citations


    Cited by:

    1. Rohith Palli & Mukta G Palshikar & Juilee Thakar, 2019. "Executable pathway analysis using ensemble discrete-state modeling for large-scale data," PLOS Computational Biology, Public Library of Science, vol. 15(9), pages 1-21, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. repec:hig:wpaper:98sti2019 is not listed on IDEAS
    2. Petersen, Alexander M. & Rotolo, Daniele & Leydesdorff, Loet, 2016. "A triple helix model of medical innovation: Supply, demand, and technological capabilities in terms of Medical Subject Headings," Research Policy, Elsevier, vol. 45(3), pages 666-681.
    3. Park, Han Woo & Leydesdorff, Loet, 2010. "Longitudinal trends in networks of university-industry-government relations in South Korea: The role of programmatic incentives," Research Policy, Elsevier, vol. 39(5), pages 640-649, June.
    4. Songyot Nakariyakul, 2019. "A hybrid gene selection algorithm based on interaction information for microarray-based cancer classification," PLOS ONE, Public Library of Science, vol. 14(2), pages 1-17, February.
    5. Xiaojun Hu & Xian Li & Ronald Rousseau, 2021. "Mathematical reflections on Triple Helix calculations," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(10), pages 8581-8587, October.
    6. Inga A. Ivanova & Loet Leydesdorff, 2014. "A simulation model of the Triple Helix of university–industry–government relations and the decomposition of the redundancy," Scientometrics, Springer;Akadémiai Kiadó, vol. 99(3), pages 927-948, June.
    7. Loet Leydesdorff & Han Woo Park & Balazs Lengyel, 2014. "A routine for measuring synergy in university–industry–government relations: mutual information as a Triple-Helix and Quadruple-Helix indicator," Scientometrics, Springer;Akadémiai Kiadó, vol. 99(1), pages 27-35, April.
    8. Camilla Tombari & Alessandro Zannini & Rebecca Bertolio & Silvia Pedretti & Matteo Audano & Luca Triboli & Valeria Cancila & Davide Vacca & Manuel Caputo & Sara Donzelli & Ilenia Segatto & Simone Vodr, 2023. "Mutant p53 sustains serine-glycine synthesis and essential amino acids intake promoting breast cancer growth," Nature Communications, Nature, vol. 14(1), pages 1-21, December.
    9. Mariusz Kubkowski & Jan Mielniczuk, 2021. "Asymptotic Distributions of Empirical Interaction Information," Methodology and Computing in Applied Probability, Springer, vol. 23(1), pages 291-315, March.
    10. Loet Leydesdorff, 2011. "“Structuration” by intellectual organization: the configuration of knowledge in relations among structural components in networks of science," Scientometrics, Springer;Akadémiai Kiadó, vol. 88(2), pages 499-520, August.
    11. Lengyel, Balázs & Leydesdorff, Loet, 2015. "The Effects of FDI on Innovation Systems in Hungarian Regions: Where is the Synergy Generated?," MPRA Paper 73945, University Library of Munich, Germany.
    12. Irad Ben-Gal & Marcelo Bacher & Morris Amara & Erez Shmueli, 2023. "A Nonparametric Subspace Analysis Approach with Application to Anomaly Detection Ensembles," INFORMS Journal on Data Science, INFORMS, vol. 2(2), pages 99-115, October.
    13. Han Woo Park, 2014. "Mapping election campaigns through negative entropy: Triple and Quadruple Helix approach to South Korea’s 2012 presidential election," Scientometrics, Springer;Akadémiai Kiadó, vol. 99(1), pages 187-197, April.
    14. Loet Leydesdorff & Igone Porto-Gomez, 2019. "Measuring the expected synergy in Spanish regional and national systems of innovation," The Journal of Technology Transfer, Springer, vol. 44(1), pages 189-209, February.
    15. Strand, Øivind & Leydesdorff, Loet, 2013. "Where is synergy indicated in the Norwegian innovation system? Triple-Helix relations among technology, organization, and geography," Technological Forecasting and Social Change, Elsevier, vol. 80(3), pages 471-484.
    16. Leydesdorff, Loet & Fritsch, Michael, 2006. "Measuring the knowledge base of regional innovation systems in Germany in terms of a Triple Helix dynamics," Research Policy, Elsevier, vol. 35(10), pages 1538-1553, December.
    17. Petras Rupšys, 2019. "Understanding the Evolution of Tree Size Diversity within the Multivariate Nonsymmetrical Diffusion Process and Information Measures," Mathematics, MDPI, vol. 7(8), pages 1-22, August.
    18. Dennis Knepp & Doris Entwisle, 1969. "Testing significance of differences between two chi-squares," Psychometrika, Springer;The Psychometric Society, vol. 34(3), pages 331-333, September.
    19. Inga Ivanova & Oivind Strand & Loet Leydesdorff, 2019. "The Synergy and Cycle Values in Regional Innovation Systems: The Case of Norway," Foresight and STI Governance (Foresight-Russia till No. 3/2015), National Research University Higher School of Economics, vol. 13(1), pages 48-61.
    20. Frank Huettner & Tamer Boyaci & Yalcin Akcay, 2016. "Consumer choice under limited attention when alternatives have different information costs," ESMT Research Working Papers ESMT-16-04_R2, ESMT European School of Management and Technology, revised 28 Feb 2018.
    21. Inga Ivanova, 2022. "The relation between complexity and synergy in the case of China: different ways of predicting GDP growth in a complex and adaptive system," Quality & Quantity: International Journal of Methodology, Springer, vol. 56(1), pages 195-215, February.
