Printed from https://ideas.repec.org/a/nat/natcom/v11y2020i1d10.1038_s41467-020-17112-9.html

Identifying domains of applicability of machine learning models for materials science

Author

Listed:
  • Christopher Sutton

    (Fritz Haber Institute of the Max Planck Society)

  • Mario Boley

    (Monash University)

  • Luca M. Ghiringhelli

    (Fritz Haber Institute of the Max Planck Society)

  • Matthias Rupp

    (Fritz Haber Institute of the Max Planck Society
    Citrine Informatics
    University of Konstanz)

  • Jilles Vreeken

    (CISPA Helmholtz Center for Information Security)

  • Matthias Scheffler

    (Fritz Haber Institute of the Max Planck Society
    IRIS Adlershof Humboldt-Universität)

Abstract

Although machine learning (ML) models promise to substantially accelerate the discovery of novel materials, their performance is often still insufficient to draw reliable conclusions. Improved ML models are therefore actively researched, but their design is currently guided mainly by monitoring the average model test error. This can render different models indistinguishable although their performance differs substantially across materials, or it can make a model appear generally insufficient while it actually works well in specific sub-domains. Here, we present a method, based on subgroup discovery, for detecting domains of applicability (DA) of models within a materials class. The utility of this approach is demonstrated by analyzing three state-of-the-art ML models for predicting the formation energy of transparent conducting oxides. We find that, despite having a mutually indistinguishable and unsatisfactory average error, the models have DAs with distinctive features and notably improved performance.
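
The idea of a domain of applicability can be illustrated with a small sketch. It is not the authors' implementation: the feature names `a` and `b`, the toy error values, the exhaustive search, and the scoring function (coverage multiplied by relative error reduction) are all assumptions chosen for illustration. The sketch searches for a conjunction of simple threshold conditions that selects a large subset of materials on which a model's error is well below its average error.

```python
from itertools import combinations
from statistics import mean


def candidate_propositions(rows, features):
    """Build threshold tests 'feature <= t' and 'feature > t' from observed values."""
    props = []
    for f in features:
        values = sorted({r[f] for r in rows})
        for t in values[:-1]:  # the largest value would select everything or nothing
            props.append((f, "<=", t))
            props.append((f, ">", t))
    return props


def holds(prop, row):
    """Check whether one threshold test is satisfied by one data row."""
    f, op, t = prop
    return row[f] <= t if op == "<=" else row[f] > t


def impact(selected, rows):
    """Assumed objective: coverage times relative reduction of the mean absolute error."""
    if not selected:
        return 0.0
    global_err = mean(r["abs_error"] for r in rows)
    if global_err == 0:
        return 0.0
    sub_err = mean(r["abs_error"] for r in selected)
    coverage = len(selected) / len(rows)
    return coverage * (1 - sub_err / global_err)


def find_domain(rows, features, max_terms=2):
    """Exhaustively score all conjunctions of up to max_terms threshold tests."""
    props = candidate_propositions(rows, features)
    best = ((), 0.0)
    for k in range(1, max_terms + 1):
        for conj in combinations(props, k):
            selected = [r for r in rows if all(holds(p, r) for p in conj)]
            score = impact(selected, rows)
            if score > best[1]:  # strict comparison keeps the shortest selector on ties
                best = (conj, score)
    return best


# Hypothetical toy data: two made-up features and a model's absolute errors.
rows = [
    {"a": 1, "b": 1, "abs_error": 0.01},
    {"a": 2, "b": 1, "abs_error": 0.02},
    {"a": 3, "b": 2, "abs_error": 0.01},
    {"a": 4, "b": 2, "abs_error": 0.02},
    {"a": 5, "b": 1, "abs_error": 0.20},
    {"a": 6, "b": 2, "abs_error": 0.30},
]

best_selector, best_score = find_domain(rows, ["a", "b"])
print(best_selector, round(best_score, 3))
```

On this toy data the average error is dominated by two poorly predicted rows, while the selected subset covers most of the data with a much lower error. An exhaustive search like this only scales to a handful of features and thresholds; a practical subgroup discovery tool would need a more efficient search strategy.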

Suggested Citation

  • Christopher Sutton & Mario Boley & Luca M. Ghiringhelli & Matthias Rupp & Jilles Vreeken & Matthias Scheffler, 2020. "Identifying domains of applicability of machine learning models for materials science," Nature Communications, Nature, vol. 11(1), pages 1-9, December.
  • Handle: RePEc:nat:natcom:v:11:y:2020:i:1:d:10.1038_s41467-020-17112-9
    DOI: 10.1038/s41467-020-17112-9

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-020-17112-9
    File Function: Abstract
    Download Restriction: no

    File URL: https://libkey.io/10.1038/s41467-020-17112-9?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Citations



    Cited by:

    1. Yilei Wu & Chang-Feng Wang & Ming-Gang Ju & Qiangqiang Jia & Qionghua Zhou & Shuaihua Lu & Xinying Gao & Yi Zhang & Jinlan Wang, 2024. "Universal machine learning aided synthesis approach of two-dimensional perovskites in a typical laboratory," Nature Communications, Nature, vol. 15(1), pages 1-10, December.

