
STICI: Split-Transformer with integrated convolutions for genotype imputation

Authors

Listed:
  • Mohammad Erfan Mowlaei

    (Temple University)

  • Chong Li

    (Temple University)

  • Oveis Jamialahmadi

    (University of Gothenburg)

  • Raquel Dias

    (University of Florida)

  • Junjie Chen

    (Harbin Institute of Technology)

  • Benyamin Jamialahmadi

    (University of Waterloo)

  • Timothy Richard Rebbeck

    (Dana-Farber Cancer Institute; Harvard T. H. Chan School of Public Health)

  • Vincenzo Carnevale

    (Temple University)

  • Sudhir Kumar

    (Temple University)

  • Xinghua Shi

    (Temple University)

Abstract

Despite advances in sequencing technologies, genome-scale datasets often contain missing bases and genomic segments, hindering downstream analyses. Genotype imputation addresses this issue and has been a cornerstone pre-processing step in genetic and genomic studies. Although various methods have been widely adopted for genotype imputation, it remains challenging to impute certain genomic regions and large structural variants. Here, we present a transformer-based framework, named STICI, for accurate genotype imputation. STICI models automatically learn genome-wide patterns of linkage disequilibrium, evidenced by much higher imputation accuracy in regions with highly linked variants. Our imputation results on the human 1000 Genomes Project and non-human genomes show that STICI can achieve high imputation accuracy comparable to the state-of-the-art genotype imputation methods, with the additional capability to impute multi-allelic variants and various types of genetic variants. STICI can be trained for any collection of genomes automatically using self-supervision. Moreover, STICI shows excellent performance without needing any special presuppositions about the underlying patterns in collections of non-human genomes, pointing to adaptability and applications of STICI to impute missing genotypes in any species.
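The self-supervised setup mentioned in the abstract can be illustrated with a toy example: hide known genotypes, predict them, and score the predictions against the hidden truth. The sketch below is not the STICI model or its training code. The `MISSING` sentinel, the function names, and the majority-allele predictor are all hypothetical stand-ins chosen for illustration, and alleles are assumed to be encoded as small integers (0 for reference, 1 and 2 for alternates) so that multi-allelic sites are representable.

```python
import random

MISSING = -1  # hypothetical sentinel marking a masked (hidden) genotype


def mask_genotypes(haplotype, mask_rate, rng):
    """Hide a random subset of alleles; return the masked copy and hidden positions."""
    masked = list(haplotype)
    positions = [i for i in range(len(haplotype)) if rng.random() < mask_rate]
    for i in positions:
        masked[i] = MISSING
    return masked, positions


def majority_allele_baseline(panel, position):
    """Naive stand-in for a learned model: most frequent allele at a site.

    Ties are broken in favor of the smallest allele code.
    """
    counts = {}
    for hap in panel:
        counts[hap[position]] = counts.get(hap[position], 0) + 1
    return max(sorted(counts), key=counts.get)


def imputation_accuracy(truth, positions, panel):
    """Fraction of hidden positions where the baseline recovers the true allele."""
    if not positions:
        return None  # nothing was masked, so accuracy is undefined
    correct = sum(majority_allele_baseline(panel, i) == truth[i] for i in positions)
    return correct / len(positions)


if __name__ == "__main__":
    # Toy reference panel: 3 haplotypes, 4 sites; the last site uses allele code 2.
    panel = [[0, 1, 0, 2], [0, 1, 1, 2], [0, 0, 0, 2]]
    truth = panel[0]
    masked, hidden = mask_genotypes(truth, mask_rate=0.5, rng=random.Random(0))
    print("masked input:", masked)
    print("accuracy on hidden sites:", imputation_accuracy(truth, hidden, panel))
```

Because the training targets come from the data itself, no external labels are needed. A learned model would take the place of `majority_allele_baseline` and could use neighboring sites, which the per-site baseline ignores.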

Suggested Citation

  • Mohammad Erfan Mowlaei & Chong Li & Oveis Jamialahmadi & Raquel Dias & Junjie Chen & Benyamin Jamialahmadi & Timothy Richard Rebbeck & Vincenzo Carnevale & Sudhir Kumar & Xinghua Shi, 2025. "STICI: Split-Transformer with integrated convolutions for genotype imputation," Nature Communications, Nature, vol. 16(1), pages 1-14, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-56273-3
    DOI: 10.1038/s41467-025-56273-3

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-56273-3
    File Function: Abstract
    Download Restriction: no

