
Extraction and analysis of signatures from the Gene Expression Omnibus by the crowd

Author

Listed:
  • Zichen Wang

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Caroline D. Monteiro

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Kathleen M. Jagodnik

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai
    Fluid Physics and Transport Processes Branch, NASA Glenn Research Center
    Center for Space Medicine, Baylor College of Medicine)

  • Nicolas F. Fernandez

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Gregory W. Gundersen

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Andrew D. Rouillard

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Sherry L. Jenkins

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Axel S. Feldmann

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Kevin S. Hu

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Michael G. McDermott

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Qiaonan Duan

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Neil R. Clark

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Matthew R. Jones

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Yan Kou

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Troy Goff

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

  • Holly Woodland

    (Daylesford, the Fairway)

  • Fabio M. R. Amaral

    (School of Biosciences, University of Nottingham, Sutton Bonington Campus)

  • Gregory L. Szeto

    (Massachusetts Institute of Technology
    David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology
    Massachusetts Institute of Technology
    The Ragon Institute of MGH, MIT, and Harvard, 400 Technology Square)

  • Oliver Fuchs

    (Paediatric Allergology and Pulmonology, Dr von Hauner University Children’s Hospital, Ludwig-Maximilians-University of Munich, Member of the German Centre for Lung Research (DZL))

  • Sophia M. Schüssler-Fiorenza Rose

    (Spinal Cord Injury Service, Veteran Affairs Palo Alto Health Care System
    Stanford School of Medicine)

  • Shvetank Sharma

    (Institute of Liver & Biliary Sciences)

  • Uwe Schwartz

    (University of Regensburg)

  • Xabier Bengoetxea Bausela

    (University of Navarra)

  • Maciej Szymkiewicz

    (Warsaw School of Information Technology under the auspices of the Polish Academy of Sciences)

  • Vasileios Maroulis
  • Anton Salykin

    (Faculty of Medicine, Masaryk University)

  • Carolina M. Barra

    (IMIM-Hospital Del Mar, PRBB Barcelona, Dr Aiguader)

  • Candice D. Kruth
  • Nicholas J. Bongio

    (Shenandoah University)

  • Vaibhav Mathur

    (IBM India Pvt Ltd.)

  • Radmila D. Todoric
  • Udi E. Rubin

    (Department of Biological Sciences)

  • Apostolos Malatras

    (Center for Research in Myology, Sorbonne Universités, UPMC Univ Paris 06, INSERM UMRS975, CNRS FRE3617)

  • Carl T. Fulp
  • John A. Galindo

    (Universidad Nacional de Colombia)

  • Ruta Motiejunaite

    (Center for Interdisciplinary Cardiovascular Sciences, Brigham and Women’s Hospital)

  • Christoph Jüschke

    (Faculty of Medicine and Health Sciences, University of Oldenburg)

  • Philip C. Dishuck
  • Katharina Lahl

    (Technical University of Denmark, National Veterinary Institute)

  • Mohieddin Jafari

    (Protein Chemistry and Proteomics Unit, Biotechnology Research Center, Pasteur Institute of Iran
    School of Biological Sciences, Institute for Researches in Fundamental Sciences)

  • Sara Aibar

    (University of Salamanca)

  • Apostolos Zaravinos

    (Karolinska Institute
    School of Sciences, European University Cyprus)

  • Linda H. Steenhuizen

    (Anna Blamansingel 216)

  • Lindsey R. Allison
  • Pablo Gamallo
  • Fernando de Andres Segura

    (CICAB, Clinical Research Centre, Extremadura University Hospital)

  • Tyler Dae Devlin
  • Vicente Pérez-García

    (Consejo Superior de Investigaciones Científicas, Centro Nacional de Biotecnología)

  • Avi Ma’ayan

    (BD2K-LINCS Data Coordination and Integration Center, Illuminating the Druggable Genome Knowledge Management Center, Icahn School of Medicine at Mount Sinai)

Abstract

Gene expression data are accumulating exponentially in public repositories. Reanalysis and integration of themed collections from these studies may provide new insights, but requires further human curation. Here we report a crowdsourcing project to annotate and reanalyse a large number of gene expression profiles from the Gene Expression Omnibus (GEO). Through a massive open online course on Coursera, over 70 participants from over 25 countries identify and annotate 2,460 single-gene perturbation signatures, 839 disease versus normal signatures, and 906 drug perturbation signatures. All these signatures are unique and are manually validated for quality. Global analysis of these signatures confirms known associations and identifies novel associations between genes, diseases and drugs. The manually curated signatures are used as a training set to develop classifiers for extracting similar signatures from the entire GEO repository. We develop a web portal to serve these signatures for query, download and visualization.

Suggested Citation

  • Zichen Wang & Caroline D. Monteiro & Kathleen M. Jagodnik & Nicolas F. Fernandez & Gregory W. Gundersen & Andrew D. Rouillard & Sherry L. Jenkins & Axel S. Feldmann & Kevin S. Hu & Michael G. McDermott, 2016. "Extraction and analysis of signatures from the Gene Expression Omnibus by the crowd," Nature Communications, Nature, vol. 7(1), pages 1-11, November.
  • Handle: RePEc:nat:natcom:v:7:y:2016:i:1:d:10.1038_ncomms12846
    DOI: 10.1038/ncomms12846

    Download full text from publisher

    File URL: https://www.nature.com/articles/ncomms12846
    File Function: Abstract
    Download Restriction: no


    Citations


    Cited by:

    1. Marcin Pilarczyk & Mehdi Fazel-Najafabadi & Michal Kouril & Behrouz Shamsaei & Juozas Vasiliauskas & Wen Niu & Naim Mahi & Lixia Zhang & Nicholas A. Clark & Yan Ren & Shana White & Rashid Karim & Huan, 2022. "Connecting omics signatures and revealing biological mechanisms with iLINCS," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    2. Mohieddin Jafari & Mehdi Mirzaie & Jie Bao & Farnaz Barneh & Shuyu Zheng & Johanna Eriksson & Caroline A. Heckman & Jing Tang, 2022. "Bipartite network models to design combination therapies in acute myeloid leukaemia," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    3. Nathaniel T. Hawkins & Marc Maldaver & Anna Yannakopoulos & Lindsay A. Guare & Arjun Krishnan, 2022. "Systematic tissue annotations of genomics samples by modeling unstructured metadata," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
