
Modulated Modularity Clustering as an Exploratory Tool for Functional Genomic Inference

Authors

  • Eric A Stone
  • Julien F Ayroles

Abstract

In recent years, the advent of high-throughput assays, coupled with their diminishing cost, has facilitated a systems approach to biology. As a consequence, massive amounts of data are currently being generated, requiring efficient methodology aimed at the reduction of scale. Whole-genome transcriptional profiling is a standard component of systems-level analyses, and to reduce scale and improve inference clustering genes is common. Since clustering is often the first step toward generating hypotheses, cluster quality is critical. Conversely, because the validation of cluster-driven hypotheses is indirect, it is critical that quality clusters not be obtained by subjective means. In this paper, we present a new objective-based clustering method and demonstrate that it yields high-quality results. Our method, modulated modularity clustering (MMC), seeks community structure in graphical data. MMC modulates the connection strengths of edges in a weighted graph to maximize an objective function (called modularity) that quantifies community structure. The result of this maximization is a clustering through which tightly-connected groups of vertices emerge. Our application is to systems genetics, and we quantitatively compare MMC both to the hierarchical clustering method most commonly employed and to three popular spectral clustering approaches. We further validate MMC through analyses of human and Drosophila melanogaster expression data, demonstrating that the clusters we obtain are biologically meaningful. We show MMC to be effective and suitable to applications of large scale. In light of these features, we advocate MMC as a standard tool for exploration and hypothesis generation.

Author Summary

Systems genetic approaches integrate classical methods with transcriptional profiling and other modern assays to make inference at the network level. It is customary to partition the genes entering such an analysis into clusters destined for independent interrogation, but there is a danger of facilitating a hypothesis that is falsely self-fulfilling. Motivated by the dual issues of scale and subjectivity, we present a new clustering method designed to elicit transcriptional modules from gene expression profiles that is both effective and automatic. Modulated modularity clustering (MMC) seeks community structure in graphical data, in this case a graph of genes connected by edges whose weights reflect the degree to which transcriptional profiles correlate. MMC modifies this graph to make communities stand out and returns the clustering that describes this community structure. We begin with a numerical study to show that MMC is able to recover community structure from simulated data. We then demonstrate similar success on biological data by obtaining human and Drosophila gene clusters that, in each case, are intuitive and biologically meaningful. We advocate the use of MMC as an exploratory tool for functional genomic inference. A Web server for MMC is available at http://mmc.gnets.ncsu.edu.
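
Illustrative sketch (not part of the original record): the abstract describes a weighted graph of genes, with edge weights derived from correlations between expression profiles, and an objective function called modularity that scores a candidate clustering. The Python below is a minimal sketch of that objective only, assuming the standard weighted form of Newman's modularity and using the absolute Pearson correlation as the edge weight. It does not reproduce the authors' modulation step or their maximization procedure, and the function names and toy data are hypothetical.

```python
import numpy as np


def correlation_graph(expression):
    """Edge weights as |Pearson correlation| between gene profiles (rows).

    Assumption: the unmodulated absolute correlation stands in for the
    edge weight; MMC itself tunes these weights, which is not shown here.
    """
    weights = np.abs(np.corrcoef(expression))
    np.fill_diagonal(weights, 0.0)  # drop self-loops
    return weights


def modularity(weights, labels):
    """Weighted modularity Q of a partition of an undirected graph."""
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels)
    degree = weights.sum(axis=1)            # weighted degree of each vertex
    total = degree.sum()                    # twice the total edge weight
    same = labels[:, None] == labels[None, :]
    expected = np.outer(degree, degree) / total
    # Sum observed-minus-expected weight over pairs in the same cluster.
    return float(((weights - expected) * same).sum() / total)


# Toy data: two hypothetical modules of three genes each, 40 samples.
rng = np.random.default_rng(0)
signal_a = rng.normal(size=40)
signal_b = rng.normal(size=40)
expression = np.vstack(
    [signal_a + 0.3 * rng.normal(size=40) for _ in range(3)]
    + [signal_b + 0.3 * rng.normal(size=40) for _ in range(3)]
)

weights = correlation_graph(expression)
q_true = modularity(weights, [0, 0, 0, 1, 1, 1])    # partition matching the modules
q_mixed = modularity(weights, [0, 1, 0, 1, 0, 1])   # partition cutting across them
q_single = modularity(weights, [0, 0, 0, 0, 0, 0])  # everything in one cluster
print(q_true, q_mixed, q_single)
```

In this toy setting the partition that matches the simulated modules scores higher than the one that cuts across them, and placing every gene in a single cluster gives a modularity of zero by construction. A full method would also have to search over partitions, which this sketch leaves out.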

Suggested Citation

  • Eric A Stone & Julien F Ayroles, 2009. "Modulated Modularity Clustering as an Exploratory Tool for Functional Genomic Inference," PLOS Genetics, Public Library of Science, vol. 5(5), pages 1-13, May.
  • Handle: RePEc:plo:pgen00:1000479
    DOI: 10.1371/journal.pgen.1000479

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1000479
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1000479&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Yanqing Chen & Jun Zhu & Pek Yee Lum & Xia Yang & Shirly Pinto & Douglas J. MacNeil & Chunsheng Zhang & John Lamb & Stephen Edwards & Solveig K. Sieberts & Amy Leonardson & Lawrence W. Castellini & Su, 2008. "Variations in DNA elucidate molecular networks that cause disease," Nature, Nature, vol. 452(7186), pages 429-435, March.

    Citations


    Cited by:

    1. Sim, Min Kyu & Deng, Shijie & Huo, Xiaoming, 2021. "What can cluster analysis offer in investing? - Measuring structural changes in the investment universe," International Review of Economics & Finance, Elsevier, vol. 71(C), pages 299-315.
    2. Marcela Preininger & Dalia Arafat & Jinhee Kim & Artika P Nath & Youssef Idaghdour & Kenneth L Brigham & Greg Gibson, 2013. "Blood-Informative Transcripts Define Nine Common Axes of Peripheral Blood Gene Expression," PLOS Genetics, Public Library of Science, vol. 9(3), pages 1-13, March.
    3. Peter Langfelder & Rui Luo & Michael C Oldham & Steve Horvath, 2011. "Is My Network Module Preserved and Reproducible?," PLOS Computational Biology, Public Library of Science, vol. 7(1), pages 1-29, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Emma Pierson & the GTEx Consortium & Daphne Koller & Alexis Battle & Sara Mostafavi, 2015. "Sharing and Specificity of Co-expression Networks across 35 Human Tissues," PLOS Computational Biology, Public Library of Science, vol. 11(5), pages 1-19, May.
    2. Kai Wang & Manikandan Narayanan & Hua Zhong & Martin Tompa & Eric E Schadt & Jun Zhu, 2009. "Meta-analysis of Inter-species Liver Co-expression Networks Elucidates Traits Associated with Common Human Diseases," PLOS Computational Biology, Public Library of Science, vol. 5(12), pages 1-16, December.
    3. Won Jun Lee & Sang Cheol Kim & Jung-Ho Yoon & Sang Jun Yoon & Johan Lim & You-Sun Kim & Sung Won Kwon & Jeong Hill Park, 2016. "Meta-Analysis of Tumor Stem-Like Breast Cancer Cells Using Gene Set and Network Analysis," PLOS ONE, Public Library of Science, vol. 11(2), pages 1-20, February.
    4. Valur Emilsson & Elias F. Gudmundsson & Thorarinn Jonmundsson & Brynjolfur G. Jonsson & Michael Twarog & Valborg Gudmundsdottir & Zhiguang Li & Nancy Finkel & Stephen Poor & Xin Liu & Robert Esterberg, 2022. "A proteogenomic signature of age-related macular degeneration in blood," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    5. Benjamin A Logsdon & Jason Mezey, 2010. "Gene Expression Network Reconstruction by Convex Feature Selection when Incorporating Genetic Perturbations," PLOS Computational Biology, Public Library of Science, vol. 6(12), pages 1-13, December.
    6. Jin Hyun Ju & Sushila A Shenoy & Ronald G Crystal & Jason G Mezey, 2017. "An independent component analysis confounding factor correction framework for identifying broad impact expression quantitative trait loci," PLOS Computational Biology, Public Library of Science, vol. 13(5), pages 1-26, May.
    7. Lina Zgaga & Felix Agakov & Evropi Theodoratou & Susan M Farrington & Albert Tenesa & Malcolm G Dunlop & Paul McKeigue & Harry Campbell, 2013. "Model Selection Approach Suggests Causal Association between 25-Hydroxyvitamin D and Colorectal Cancer," PLOS ONE, Public Library of Science, vol. 8(5), pages 1-11, May.
    8. Eric P Xing & Ross E Curtis & Georg Schoenherr & Seunghak Lee & Junming Yin & Kriti Puniyani & Wei Wu & Peter Kinnaird, 2014. "GWAS in a Box: Statistical and Visual Analytics of Structured Associations via GenAMap," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-19, June.
    9. Oliver Stegle & Leopold Parts & Richard Durbin & John Winn, 2010. "A Bayesian Framework to Account for Complex Non-Genetic Factors in Gene Expression Levels Greatly Increases Power in eQTL Studies," PLOS Computational Biology, Public Library of Science, vol. 6(5), pages 1-11, May.
    10. Hui-Min Wang & Ching-Lin Hsiao & Ai-Ru Hsieh & Ying-Chao Lin & Cathy S J Fann, 2012. "Constructing Endophenotypes of Complex Diseases Using Non-Negative Matrix Factorization and Adjusted Rand Index," PLOS ONE, Public Library of Science, vol. 7(7), pages 1-12, July.
    11. Ville-Petteri Mäkinen & Mete Civelek & Qingying Meng & Bin Zhang & Jun Zhu & Candace Levian & Tianxiao Huan & Ayellet V Segrè & Sujoy Ghosh & Juan Vivar & Majid Nikpay & Alexandre F R Stewart & Christ, 2014. "Integrative Genomics Reveals Novel Molecular Pathways and Gene Networks for Coronary Artery Disease," PLOS Genetics, Public Library of Science, vol. 10(7), pages 1-14, July.
    12. Michael J McGeachie & Hsun-Hsien Chang & Scott T Weiss, 2014. "CGBayesNets: Conditional Gaussian Bayesian Network Learning and Inference with Mixed Discrete and Continuous Data," PLOS Computational Biology, Public Library of Science, vol. 10(6), pages 1-7, June.
    13. Josine L Min & Jennifer M Taylor & J Brent Richards & Tim Watts & Fredrik H Pettersson & John Broxholme & Kourosh R Ahmadi & Gabriela L Surdulescu & Ernesto Lowy & Christian Gieger & Chris Newton-Cheh, 2011. "The Use of Genome-Wide eQTL Associations in Lymphoblastoid Cell Lines to Identify Novel Genetic Pathways Involved in Complex Traits," PLOS ONE, Public Library of Science, vol. 6(7), pages 1-14, July.
    14. Wei Zhang & Jun Zhu & Eric E Schadt & Jun S Liu, 2010. "A Bayesian Partition Method for Detecting Pleiotropic and Epistatic eQTL Modules," PLOS Computational Biology, Public Library of Science, vol. 6(1), pages 1-10, January.
