Printed from https://ideas.repec.org/a/nat/natcom/v6y2015i1d10.1038_ncomms10162.html

A new tool called DISSECT for analysing large genomic data sets using a Big Data approach

Authors
  • Oriol Canela-Xandri

    (The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush Campus, Edinburgh EH25 9RG, UK)

  • Andy Law

    (The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush Campus, Edinburgh EH25 9RG, UK)

  • Alan Gray

    (EPCC, The University of Edinburgh)

  • John A. Woolliams

    (The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush Campus, Edinburgh EH25 9RG, UK)

  • Albert Tenesa

    (The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush Campus, Edinburgh EH25 9RG, UK;
    MRC HGU at the MRC IGMM, University of Edinburgh)

Abstract

Large-scale genetic and genomic data are increasingly available and the major bottleneck in their analysis is a lack of sufficiently scalable computational tools. To address this problem in the context of complex traits analysis, we present DISSECT. DISSECT is a new and freely available software that is able to exploit the distributed-memory parallel computational architectures of compute clusters, to perform a wide range of genomic and epidemiologic analyses, which currently can only be carried out on reduced sample sizes or under restricted conditions. We demonstrate the usefulness of our new tool by addressing the challenge of predicting phenotypes from genotype data in human populations using mixed-linear model analysis. We analyse simulated traits from 470,000 individuals genotyped for 590,004 SNPs in ∼4 h using the combined computational power of 8,400 processor cores. We find that prediction accuracies in excess of 80% of the theoretical maximum could be achieved with large sample sizes.

Suggested Citation

  • Oriol Canela-Xandri & Andy Law & Alan Gray & John A. Woolliams & Albert Tenesa, 2015. "A new tool called DISSECT for analysing large genomic data sets using a Big Data approach," Nature Communications, Nature, vol. 6(1), pages 1-6, December.
  • Handle: RePEc:nat:natcom:v:6:y:2015:i:1:d:10.1038_ncomms10162
    DOI: 10.1038/ncomms10162

    Download full text from publisher

    File URL: https://www.nature.com/articles/ncomms10162
    File Function: Abstract
    Download Restriction: no


    Citations



    Cited by:

    1. Oriol Canela-Xandri & Konrad Rawlik & John A Woolliams & Albert Tenesa, 2016. "Improved Genetic Profiling of Anthropometric Traits Using a Big Data Approach," PLOS ONE, Public Library of Science, vol. 11(12), pages 1-12, December.
    2. Aman Agrawal & Alec M Chiu & Minh Le & Eran Halperin & Sriram Sankararaman, 2020. "Scalable probabilistic PCA for large-scale genetic variation data," PLOS Genetics, Public Library of Science, vol. 16(5), pages 1-19, May.
