Printed from https://ideas.repec.org/a/eee/stapro/v78y2008i12p1490-1497.html

Clustering gene expression profile data by selective shrinkage

Authors

  • Ishwaran, Hemant
  • Rao, J. Sunil

Abstract

Clustering of gene expression profiles is a widely used approach for finding macroscopic data structure. A complication in such analyses is that not all genes are informative for forming clusters and different clusters might have different transcription regulation. Driven by these considerations, we present a novel two-stage clustering approach. The first stage identifies informative genes by adaptive variable selection using pseudo-samples modeled by a high dimensional multigroup ANOVA model. Variables are selected using a rescaled spike and slab Bayesian hierarchical model having a special selective shrinkage property. The second stage uses output from the first stage for clustering. We demonstrate why selective shrinkage occurs, and by extension, why it is useful for the clustering paradigm. We analyze a human gene atlas expression dataset where the question of interest is to look for tissue-specific transcription regulation and investigate whether tissues can be grouped together due to similar genomic control.
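
The selective shrinkage idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' rescaled spike and slab model: it assumes a textbook two-component normal mixture prior on a single effect, with a standardized statistic z ~ N(beta, 1). The function names, the prior weight `w`, the variances `v_spike` and `v_slab`, and the selection threshold are all hypothetical choices made for illustration. The point is only that small statistics are shrunk almost to zero while large ones are left nearly untouched, which is what lets a first stage separate informative from uninformative genes before clustering.

```python
import math


def normal_pdf(z, var):
    """Density of N(0, var) evaluated at z."""
    return math.exp(-0.5 * z * z / var) / math.sqrt(2.0 * math.pi * var)


def spike_slab_posterior_mean(z, w=0.1, v_spike=0.01, v_slab=100.0):
    """Posterior mean of beta given z ~ N(beta, 1).

    Illustrative prior (an assumption, not the paper's rescaled model):
        beta ~ (1 - w) * N(0, v_spike) + w * N(0, v_slab)
    """
    # Marginally, z ~ N(0, 1 + v) under a component with prior variance v.
    m_spike = (1.0 - w) * normal_pdf(z, 1.0 + v_spike)
    m_slab = w * normal_pdf(z, 1.0 + v_slab)
    p_slab = m_slab / (m_spike + m_slab)  # posterior probability of the slab

    # Within a component, the posterior mean is z scaled by v / (1 + v).
    shrink_spike = v_spike / (1.0 + v_spike)
    shrink_slab = v_slab / (1.0 + v_slab)
    return ((1.0 - p_slab) * shrink_spike + p_slab * shrink_slab) * z


def select_informative(z_scores, threshold=1.0, **prior):
    """Indices whose shrunken effect exceeds the threshold in absolute value."""
    return [
        i
        for i, z in enumerate(z_scores)
        if abs(spike_slab_posterior_mean(z, **prior)) > threshold
    ]


if __name__ == "__main__":
    for z in (0.5, 2.0, 6.0):
        print(z, round(spike_slab_posterior_mean(z), 3))
    print(select_informative([0.5, -6.0, 2.0, 7.0]))
```

In a two-stage pipeline of the kind the abstract describes, the indices returned by a selection step like this would define the features passed to whatever clustering routine is used in the second stage.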

Suggested Citation

  • Ishwaran, Hemant & Rao, J. Sunil, 2008. "Clustering gene expression profile data by selective shrinkage," Statistics & Probability Letters, Elsevier, vol. 78(12), pages 1490-1497, September.
  • Handle: RePEc:eee:stapro:v:78:y:2008:i:12:p:1490-1497

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-7152(08)00002-3
    Download Restriction: Full text for ScienceDirect subscribers only

    References listed on IDEAS

    1. Ishwaran H. & Rao J.S., 2003. "Detecting Differentially Expressed Genes in Microarrays Using Bayesian Model Selection," Journal of the American Statistical Association, American Statistical Association, vol. 98, pages 438-455, January.
    2. Ishwaran, Hemant & Rao, J. Sunil, 2005. "Spike and Slab Gene Selection for Multigroup Microarray Data," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 764-780, September.
    3. Efron B. & Tibshirani R. & Storey J.D. & Tusher V., 2001. "Empirical Bayes Analysis of a Microarray Experiment," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1151-1160, December.
    4. John D. Storey, 2002. "A direct approach to false discovery rates," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 64(3), pages 479-498, August.

    Citations


    Cited by:

    1. Asafu-Adjei Josephine & Tadesse Mahlet G. & Coull Brent & Balasubramanian Raji & Lev Michael & Schwamm Lee & Betensky Rebecca, 2017. "Bayesian Variable Selection Methods for Matched Case-Control Studies," The International Journal of Biostatistics, De Gruyter, vol. 13(1), pages 1-23, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dazard, Jean-Eudes & Rao, J. Sunil, 2012. "Joint adaptive mean–variance regularization and variance stabilization of high dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 56(7), pages 2317-2333.
    2. E. M. Conlon & B. L. Postier & B. A. Methe & K. P. Nevin & D. R. Lovley, 2009. "Hierarchical Bayesian meta-analysis models for cross-platform microarray studies," Journal of Applied Statistics, Taylor & Francis Journals, vol. 36(10), pages 1067-1085.
    3. Wen Shi & Xi Chen & Jennifer Shang, 2019. "An Efficient Morris Method-Based Framework for Simulation Factor Screening," INFORMS Journal on Computing, INFORMS, vol. 31(4), pages 745-770, October.
    4. HyungJun Cho & Jaewoo Kang & Jae Lee, 2009. "Empirical Bayes analysis of unreplicated microarray data," Computational Statistics, Springer, vol. 24(3), pages 393-408, August.
    5. Cipolli III, William & Hanson, Timothy & McLain, Alexander C., 2016. "Bayesian nonparametric multiple testing," Computational Statistics & Data Analysis, Elsevier, vol. 101(C), pages 64-79.
    6. Guo Wenge & Peddada Shyamal, 2008. "Adaptive Choice of the Number of Bootstrap Samples in Large Scale Multiple Testing," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(1), pages 1-21, March.
    7. Alessio Farcomeni, 2006. "More Powerful Control of the False Discovery Rate Under Dependence," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 15(1), pages 43-73, May.
    8. Nik Tuzov & Frederi Viens, 2011. "Mutual fund performance: false discoveries, bias, and power," Annals of Finance, Springer, vol. 7(2), pages 137-169, May.
    9. Leek Jeffrey T & Storey John D., 2011. "The Joint Null Criterion for Multiple Hypothesis Tests," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-22, June.
    10. Kenneth Rice & David Spiegelhalter, 2006. "A Simple Diagnostic Plot Connecting Robust Estimation, Outlier Detection, and False Discovery Rates," Journal of Applied Statistics, Taylor & Francis Journals, vol. 33(10), pages 1131-1147.
    11. Hironori Fujisawa & Takayuki Sakaguchi, 2012. "Optimal significance analysis of microarray data in a class of tests whose null statistic can be constructed," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 21(2), pages 280-300, June.
    12. Andrew Y. Chen, 2022. "Most claimed statistical findings in cross-sectional return predictability are likely true," Papers 2206.15365, arXiv.org, revised Jan 2025.
    13. Friguet, Chloé & Causeur, David, 2011. "Estimation of the proportion of true null hypotheses in high-dimensional data under dependence," Computational Statistics & Data Analysis, Elsevier, vol. 55(9), pages 2665-2676, September.
    14. T. Tony Cai & Wenguang Sun & Weinan Wang, 2019. "Covariate‐assisted ranking and screening for large‐scale two‐sample inference," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 187-234, April.
    15. Laurent Barras & Olivier Scaillet & Russ Wermers, 2010. "False Discoveries in Mutual Fund Performance: Measuring Luck in Estimated Alphas," Journal of Finance, American Finance Association, vol. 65(1), pages 179-216, February.
    16. Patrick Kline & Christopher Walters, 2021. "Reasonable Doubt: Experimental Detection of Job‐Level Employment Discrimination," Econometrica, Econometric Society, vol. 89(2), pages 765-792, March.
    17. Patrick Kline & Evan K Rose & Christopher R Walters, 2022. "Systemic Discrimination Among Large U.S. Employers," The Quarterly Journal of Economics, Oxford University Press, vol. 137(4), pages 1963-2036.
    18. Bickel David R., 2013. "Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 12(4), pages 529-543, August.
    19. Joshua Habiger & Edsel Peña, 2011. "Randomised p-values and nonparametric procedures in multiple testing," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 23(3), pages 583-604.
    20. Habiger, Joshua D. & Peña, Edsel A., 2014. "Compound p-value statistics for multiple testing procedures," Journal of Multivariate Analysis, Elsevier, vol. 126(C), pages 153-166.
