
BERNN: Enhancing classification of Liquid Chromatography Mass Spectrometry data with batch effect removal neural networks

Authors

  • Simon J. Pelletier

    (CHU de Québec - Université Laval Research Center)

  • Mickaël Leclercq

    (CHU de Québec - Université Laval Research Center)

  • Florence Roux-Dalvai

    (CHU de Québec - Université Laval Research Center)

  • Matthijs B. Geus

    (Massachusetts General Hospital Department of Neurology
    Leiden University Medical Center)

  • Shannon Leslie

    (Yale Department of Psychiatry
    Janssen Pharmaceuticals)

  • Weiwei Wang

    (Yale School of Medicine)

  • TuKiet T. Lam

    (Yale School of Medicine
    Department of Molecular Biophysics and Biochemistry)

  • Angus C. Nairn

    (Yale Department of Psychiatry)

  • Steven E. Arnold

    (Massachusetts General Hospital Department of Neurology)

  • Becky C. Carlyle

    (Massachusetts General Hospital Department of Neurology
    Oxford University Department of Physiology Anatomy and Genetics
    Kavli Institute for Nanoscience Discovery)

  • Frédéric Precioso

    (Sophia Antipolis)

  • Arnaud Droit

    (CHU de Québec - Université Laval Research Center)

Abstract

Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful method for profiling complex biological samples. However, batch effects typically arise from differences in sample processing protocols, experimental conditions, and data acquisition techniques, significantly impacting the interpretability of results. Correcting batch effects is crucial for the reproducibility of omics research, but current methods are not optimal for the removal of batch effects without compressing the genuine biological variation under study. We propose a suite of Batch Effect Removal Neural Networks (BERNN) to remove batch effects in large LC-MS experiments, with the goal of maximizing sample classification performance between conditions. More importantly, these models must efficiently generalize in batches not seen during training. A comparison of batch effect correction methods across five diverse datasets demonstrated that BERNN models consistently showed the strongest sample classification performance. However, the model producing the greatest classification improvements did not always perform best in terms of batch effect removal. Finally, we show that the overcorrection of batch effects resulted in the loss of some essential biological variability. These findings highlight the importance of balancing batch effect removal while preserving valuable biological diversity in large-scale LC-MS experiments.

Suggested Citation

  • Simon J. Pelletier & Mickaël Leclercq & Florence Roux-Dalvai & Matthijs B. Geus & Shannon Leslie & Weiwei Wang & TuKiet T. Lam & Angus C. Nairn & Steven E. Arnold & Becky C. Carlyle & Frédéric Precioso & Arnaud Droit, 2024. "BERNN: Enhancing classification of Liquid Chromatography Mass Spectrometry data with batch effect removal neural networks," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-48177-5
    DOI: 10.1038/s41467-024-48177-5

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-48177-5
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Henry Webel & Lili Niu & Annelaura Bach Nielsen & Marie Locard-Paulet & Matthias Mann & Lars Juhl Jensen & Simon Rasmussen, 2024. "Imputation of label-free quantitative mass spectrometry-based proteomics data using self-supervised deep learning," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    2. Zhaoxiang Cai & Sofia Apolinário & Ana R. Baião & Clare Pacini & Miguel D. Sousa & Susana Vinga & Roger R. Reddel & Phillip J. Robinson & Mathew J. Garnett & Qing Zhong & Emanuel Gonçalves, 2024. "Synthetic augmentation of cancer cell line multi-omic datasets using unsupervised deep learning," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    3. Hannah Voß & Simon Schlumbohm & Philip Barwikowski & Marcus Wurlitzer & Matthias Dottermusch & Philipp Neumann & Hartmut Schlüter & Julia E. Neumann & Christoph Krisp, 2022. "HarmonizR enables data harmonization across independent proteomic datasets with appropriate handling of missing values," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
