Issues of Processing and Multiple Testing of SELDI-TOF MS Proteomic Data

Author

Listed:

Birkner Merrill D.
(Division of Biostatistics, School of Public Health, University of California, Berkeley)
Hubbard Alan E.
(Division of Biostatistics, School of Public Health, University of California, Berkeley)
van der Laan Mark J.
(Division of Biostatistics, School of Public Health, University of California, Berkeley)
Skibola Christine F.
(Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley)
Hegedus Christine M.
(Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley)
Smith Martyn T.
(Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley)

Abstract

A new data filtering method for SELDI-TOF MS proteomic spectra data is described. We examined technical repeats (2 per subject) of intensity versus m/z (mass/charge) of bone marrow cell lysate for two groups of childhood leukemia patients: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). As others have noted, the type of data processing as well as experimental variability can have a disproportionate impact on the list of "interesting" proteins (see Baggerly et al. (2004)). We propose a sequence of processing and multiple testing steps: 1) correction for background drift; 2) filtering using smooth regression with cross-validated bandwidth selection; 3) peak finding; and 4) correction for multiple testing (van der Laan et al. (2005)). The result is a list of proteins (indexed by m/z) whose average expression differs significantly among disease (or treatment, etc.) groups. The procedures are intended to provide a sensible, statistically driven algorithm for producing this list. Provided there are no sources of unmeasured bias (such as confounding of experimental conditions with disease status), proteins found to be statistically significant using this technique have a low probability of being false positives.
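The four steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the running-minimum baseline, the Gaussian-kernel (Nadaraya-Watson) smoother, the local-maximum peak rule, and the Holm step-down procedure are all stand-ins chosen for brevity, and every function name and parameter below is hypothetical. In particular, Holm controls the family-wise error rate and is swapped in here in place of the procedures of van der Laan et al. that the abstract cites.

```python
import numpy as np

def subtract_baseline(y, window):
    """Step 1 (assumed form): remove drift by subtracting a running minimum."""
    half = window // 2
    padded = np.pad(y, half, mode="edge")
    base = np.array([padded[i:i + window].min() for i in range(len(y))])
    return y - base

def nw_smooth(x, y, h, leave_one_out=False):
    """Step 2: Nadaraya-Watson smoother with a Gaussian kernel of bandwidth h."""
    d = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * d ** 2)
    if leave_one_out:
        np.fill_diagonal(w, 0.0)  # predict each point without using itself
    return (w @ y) / w.sum(axis=1)

def cv_bandwidth(x, y, candidates):
    """Step 2: choose the bandwidth with the smallest leave-one-out squared error."""
    errors = [np.mean((y - nw_smooth(x, y, h, leave_one_out=True)) ** 2)
              for h in candidates]
    return candidates[int(np.argmin(errors))]

def find_peaks(y, min_height):
    """Step 3 (assumed rule): interior strict local maxima above min_height."""
    mid = y[1:-1]
    is_peak = (mid > y[:-2]) & (mid > y[2:]) & (mid > min_height)
    return np.nonzero(is_peak)[0] + 1

def holm_reject(pvals, alpha=0.05):
    """Step 4 (stand-in): Holm step-down rejection mask at level alpha."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    reject = np.zeros(m, dtype=bool)
    for rank, idx in enumerate(np.argsort(p)):
        if p[idx] > alpha / (m - rank):
            break  # stop at the first p-value that fails its threshold
        reject[idx] = True
    return reject

if __name__ == "__main__":
    # Synthetic spectrum: two peaks, a linear drift, and Gaussian noise.
    rng = np.random.default_rng(0)
    mz = np.arange(200.0)
    signal = (5 * np.exp(-0.5 * ((mz - 60) / 3) ** 2)
              + 8 * np.exp(-0.5 * ((mz - 140) / 3) ** 2))
    spectrum = signal + 0.01 * mz + rng.normal(0, 0.3, mz.size)

    corrected = subtract_baseline(spectrum, window=51)
    h = cv_bandwidth(mz, corrected, [1.0, 2.0, 4.0, 8.0])
    smoothed = nw_smooth(mz, corrected, h)
    print("bandwidth:", h, "peaks at m/z index:", find_peaks(smoothed, 2.0))
```

In a two-group comparison such as AML versus ALL, one p-value per detected peak (for example from a two-sample test on peak intensities) would be passed to `holm_reject`; the choice of test statistic is left open here because the abstract does not specify it.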

Suggested Citation

Birkner Merrill D. & Hubbard Alan E. & van der Laan Mark J. & Skibola Christine F. & Hegedus Christine M. & Smith Martyn T., 2006. "Issues of Processing and Multiple Testing of SELDI-TOF MS Proteomic Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 5(1), pages 1-24, April.

Handle: RePEc:bpj:sagmbi:v:5:y:2006:i:1:n:11
DOI: 10.2202/1544-6115.1198

References listed on IDEAS

van der Laan Mark J. & Dudoit Sandrine & Pollard Katherine S., 2004. "Augmentation Procedures for Control of the Generalized Family-Wise Error Rate and Tail Probabilities for the Proportion of False Positives," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 3(1), pages 1-27, June.
Mark van der Laan & Sandrine Dudoit & Katherine Pollard, 2004. "Multiple Testing. Part III. Procedures for Control of the Generalized Family-Wise Error Rate and Proportion of False Positives," U.C. Berkeley Division of Biostatistics Working Paper Series 1140, Berkeley Electronic Press.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Günther Fink & Margaret McConnell & Sebastian Vollmer, 2014. "Testing for heterogeneous treatment effects in experimental data: false discovery risks and correction procedures," Journal of Development Effectiveness, Taylor & Francis Journals, vol. 6(1), pages 44-57, January.
- Fink, Günther & McConnell, Margaret & Vollmer, Sebastian, 2011. "Testing for Heterogeneous Treatment Effects in Experimental Data: False Discovery Risks and Correction Procedures," Hannover Economic Papers (HEP) dp-477, Leibniz Universität Hannover, Wirtschaftswissenschaftliche Fakultät.
Irene Castro-Conde & Jacobo Uña-Álvarez, 2015. "Power, FDR and conservativeness of BB-SGoF method," Computational Statistics, Springer, vol. 30(4), pages 1143-1161, December.
Joseph Romano & Azeem Shaikh & Michael Wolf, 2008. "Control of the false discovery rate under dependence using the bootstrap and subsampling," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 17(3), pages 417-442, November.
- Joseph P. Romano & Azeem M. Shaikh & Michael Wolf, 2008. "Control of the False Discovery Rate under Dependence using the Bootstrap and Subsampling," IEW - Working Papers 337, Institute for Empirical Research in Economics - University of Zurich.
Wang, Li & Xu, Xingzhong, 2012. "Step-up procedure controlling generalized family-wise error rate," Statistics & Probability Letters, Elsevier, vol. 82(4), pages 775-782.
Christina C. Bartenschlager & Michael Krapp, 2015. "Theorie und Methoden multipler statistischer Vergleiche," AStA Wirtschafts- und Sozialstatistisches Archiv, Springer;Deutsche Statistische Gesellschaft - German Statistical Society, vol. 9(2), pages 107-129, November.
Gordon, Alexander Y., 2009. "Inequalities between generalized familywise error rates of a multiple testing procedure," Statistics & Probability Letters, Elsevier, vol. 79(19), pages 1996-2004, October.
Guo Wenge & Peddada Shyamal, 2008. "Adaptive Choice of the Number of Bootstrap Samples in Large Scale Multiple Testing," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(1), pages 1-21, March.
Montazeri Zahra & Yanofsky Corey M. & Bickel David R., 2010. "Shrinkage Estimation of Effect Sizes as an Alternative to Hypothesis Testing Followed by Estimation in High-Dimensional Biology: Applications to Differential Gene Expression," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-33, June.
Joseph P. Romano & Azeem M. Shaikh & Michael Wolf, 2010. "Hypothesis Testing in Econometrics," Annual Review of Economics, Annual Reviews, vol. 2(1), pages 75-104, September.
- Joseph P. Romano & Azeem M. Shaikh & Michael Wolf, 2009. "Hypothesis testing in econometrics," IEW - Working Papers 444, Institute for Empirical Research in Economics - University of Zurich.
Somerville, Paul N. & Hemmelmann, Claudia, 2008. "Step-up and step-down procedures controlling the number and proportion of false positives," Computational Statistics & Data Analysis, Elsevier, vol. 52(3), pages 1323-1334, January.
Merrill Birkner & Sandra Sinisi & Mark van der Laan, 2004. "Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data," U.C. Berkeley Division of Biostatistics Working Paper Series 1161, Berkeley Electronic Press.
Frank Emmert-Streib & Galina V Glazko, 2011. "Pathway Analysis of Expression Data: Deciphering Functional Building Blocks of Complex Diseases," PLOS Computational Biology, Public Library of Science, vol. 7(5), pages 1-6, May.
Mathur, Maya B & VanderWeele, Tyler J., 2018. "Statistical methods for evidence synthesis," Thesis Commons kd6ja_v1, Center for Open Science.
Mathur, Maya B & VanderWeele, Tyler, 2018. "Statistical methods for evidence synthesis," Thesis Commons kd6ja, Center for Open Science.
Cerioli, Andrea & Farcomeni, Alessio, 2011. "Error rates for multivariate outlier detection," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 544-553, January.
de Uña-Alvarez Jacobo, 2012. "The Beta-Binomial SGoF method for multiple dependent tests," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(3), pages 1-32, May.
L. Finos & A. Farcomeni, 2011. "k-FWER Control without p -value Adjustment, with Application to Detection of Genetic Determinants of Multiple Sclerosis in Italian Twins," Biometrics, The International Biometric Society, vol. 67(1), pages 174-181, March.
Debashis Ghosh, 2006. "Shrunken p-Values for Assessing Differential Expression with Applications to Genomic Data Analysis," Biometrics, The International Biometric Society, vol. 62(4), pages 1099-1106, December.
Wang, Li, 2022. "New testing procedures with k-FWER control for discrete data," Statistics & Probability Letters, Elsevier, vol. 180(C).
Schumi Jennifer & DiRienzo A. Gregory & DeGruttola Victor, 2008. "Testing for Associations with Missing High-Dimensional Categorical Covariates," The International Journal of Biostatistics, De Gruyter, vol. 4(1), pages 1-19, September.

More about this item

Keywords

proteomics; mass-spectrometry; multiple testing; preprocessing; leukemia; tail probability;
