Printed from https://ideas.repec.org/a/bpj/sagmbi/v3y2004i1n36.html

Statistical Significance Threshold Criteria For Analysis of Microarray Gene Expression Data

Author

Listed:
  • Cheng Cheng

    (Department of Biostatistics, St. Jude Children’s Research Hospital)

  • Pounds Stanley B.

    (Department of Biostatistics, St. Jude Children’s Research Hospital)

  • Boyett James M.

    (Department of Biostatistics, St. Jude Children’s Research Hospital)

  • Pei Deqing

    (Department of Biostatistics, St. Jude Children’s Research Hospital)

  • Kuo Mei-Ling

    (Department of Genetics and Tumor Cell Biology, St. Jude Children’s Research Hospital)

  • Roussel Martine F.

    (Department of Genetics and Tumor Cell Biology, St. Jude Children’s Research Hospital)

Abstract

Methodological advances in microarray data analysis based on false discovery rate (FDR) control, such as q-value plots, allow the investigator to examine the FDR from several perspectives. However, when FDR control at the "customary" levels of 0.01, 0.05, or 0.1 does not yield fruitful findings, there is little guidance on how to trade off the significance threshold against the FDR level using sound statistical or biological considerations. Meaningful statistical significance criteria that complement the existing FDR methods for large-scale multiple tests are therefore desirable. This research develops three such criteria: the profile information criterion, the total error proportion, and guide-gene driven selection. The first two are general significance threshold criteria for large-scale multiple tests; the profile information criterion is related to recent theoretical studies of the connection between FDR control and minimax estimation, and the total error proportion is closely related to the asymptotic properties of FDR control in terms of the total error risk. Guide-gene driven selection is an approach to combining statistical significance with existing biological knowledge of the study at hand. Error properties of these criteria are investigated theoretically and by simulation. The proposed methods are illustrated and compared using an example of genomic screening for novel Arf gene targets. Operating characteristics of the q-value and the proposed significance threshold criteria are investigated and compared in a simulation study that employs a model mimicking a gene regulatory pathway. A guideline for using these criteria is provided. Splus/R code is available from the corresponding author upon request.
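
The sensitivity of the findings to the chosen FDR level, which motivates the abstract, can be illustrated with a minimal sketch. It is not the authors' code and does not implement any of the three proposed criteria. It computes Benjamini-Hochberg adjusted p-values (which coincide with q-values when the proportion of true nulls is taken to be 1) and counts the rejections at the three customary levels. The function names `bh_adjusted` and `count_rejections` and the p-values are assumptions made up for illustration.

```python
def bh_adjusted(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # are monotone in the rank of the raw p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_rejections(pvalues, levels=(0.01, 0.05, 0.1)):
    """Number of hypotheses rejected when the FDR is controlled at each level."""
    adjusted = bh_adjusted(pvalues)
    return {level: sum(a <= level for a in adjusted) for level in levels}


# Hypothetical p-values, invented for illustration only (not from the paper).
example_pvalues = [0.0001, 0.0004, 0.0019, 0.0095, 0.0201,
                   0.0278, 0.0298, 0.0344, 0.0459, 0.3240,
                   0.4262, 0.5719, 0.6528, 0.7590, 1.0000]

print(count_rejections(example_pvalues))  # {0.01: 3, 0.05: 4, 0.1: 9}
```

In this toy example the number of rejections more than doubles between the 0.05 and 0.1 levels, which is the kind of trade-off between significance threshold and FDR level for which the abstract says little guidance exists.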

Suggested Citation

  • Cheng Cheng & Pounds Stanley B. & Boyett James M. & Pei Deqing & Kuo Mei-Ling & Roussel Martine F., 2004. "Statistical Significance Threshold Criteria For Analysis of Microarray Gene Expression Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 3(1), pages 1-32, December.
  • Handle: RePEc:bpj:sagmbi:v:3:y:2004:i:1:n:36
    DOI: 10.2202/1544-6115.1064

    Download full text from publisher

    File URL: https://doi.org/10.2202/1544-6115.1064
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.


    Citations

    Citations are extracted by the CitEc Project.

    Cited by:

    1. Cheng, Cheng, 2009. "Internal validation inferences of significant genomic features in genome-wide screening," Computational Statistics & Data Analysis, Elsevier, vol. 53(3), pages 788-800, January.
    2. Hunt, Daniel L. & Cheng, Cheng & Pounds, Stanley, 2009. "The beta-binomial distribution for estimating the number of false rejections in microarray gene expression studies," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1688-1700, March.
    3. de Uña-Alvarez, Jacobo, 2011. "On the Statistical Properties of SGoF Multitesting Method," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-30, April.
    4. Cheng, Cheng, 2016. "Exploratory failure time analysis in large scale genomics," Computational Statistics & Data Analysis, Elsevier, vol. 95(C), pages 192-206.
    5. Lin, Wan-Yu & Lee, Wen-Chung, 2011. "Floating prioritized subset analysis: A powerful method to detect differentially expressed genes," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 903-913, January.
    6. de Uña-Alvarez, Jacobo, 2012. "The Beta-Binomial SGoF method for multiple dependent tests," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(3), pages 1-32, May.
    7. Bickel, David R., 2008. "Correcting the Estimated Level of Differential Expression for Gene Selection Bias: Application to a Microarray Study," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(1), pages 1-27, March.
