Printed from https://ideas.repec.org/a/bpj/sagmbi/v8y2009i1n10.html

Normalization Method for Transcriptional Studies of Heterogeneous Samples - Simultaneous Array Normalization and Identification of Equivalent Expression

Author

Listed:
  • Qin Li-Xuan

    (Memorial Sloan-Kettering Cancer Center)

  • Satagopan Jaya M

    (Memorial Sloan-Kettering Cancer Center)

Abstract

Normalization is an important step in the analysis of microarray transcription profiles, because systematic non-biological variation often arises from the multiple steps involved in any transcription profiling experiment. Existing normalization methods often assume that differential expression is either rare or symmetric, but this assumption does not always hold. Alternatively, non-differentially expressed genes may be used for array normalization. However, it is unknown at the outset which genes are non-differentially expressed. In this paper we propose a hierarchical mixture model framework that simultaneously identifies non-differentially expressed genes and normalizes arrays using these genes. The Fisher information matrix corresponding to the array effects is derived, which provides useful intuition for guiding the choice of array normalization method. The operating characteristics of the proposed method are evaluated using simulated data. The simulations, conducted under a wide range of parametric configurations, suggest that the proposed method provides a useful alternative for array normalization. For example, the proposed method has better sensitivity than median normalization when the prevalence of differentially expressed genes is modest and the magnitudes of over-expression and under-expression differ. Further, the proposed method has properties similar to median normalization when the prevalence of differentially expressed genes is very small. The proposed method is illustrated empirically using a liposarcoma study from MSKCC to identify genes differentially expressed between normal fat tissue and liposarcoma tissue samples.
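
The motivation stated in the abstract can be illustrated with a small simulation. The sketch below is not the authors' hierarchical mixture model. It only contrasts median normalization over all genes with a median taken over the non-differentially expressed genes, which are treated here as known (an oracle that is not available in practice, and which the paper's model is designed to estimate). All function names, parameter values, and the two-array log-ratio setup are assumptions made for illustration.

```python
import random
import statistics


def simulate_log_ratios(n_genes=2000, de_fraction=0.3, up_shift=2.0,
                        array_offset=0.5, noise_sd=0.2, seed=0):
    """Simulate per-gene log-ratios between two arrays (hypothetical setup).

    Every gene carries the same non-biological array offset. A fraction of
    genes is additionally over-expressed by `up_shift`, with no
    under-expressed genes, so differential expression is asymmetric.
    Returns (log_ratios, is_de).
    """
    rng = random.Random(seed)
    log_ratios, is_de = [], []
    for _ in range(n_genes):
        de = rng.random() < de_fraction
        biological = up_shift if de else 0.0
        log_ratios.append(array_offset + biological + rng.gauss(0.0, noise_sd))
        is_de.append(de)
    return log_ratios, is_de


def median_offset(log_ratios):
    """Median normalization: estimate the array offset from all genes."""
    return statistics.median(log_ratios)


def oracle_offset(log_ratios, is_de):
    """Estimate the array offset from the non-DE genes only.

    Assumes the non-DE labels are known, which is not the case in practice.
    """
    non_de = [r for r, de in zip(log_ratios, is_de) if not de]
    return statistics.median(non_de)


if __name__ == "__main__":
    true_offset = 0.5
    ratios, labels = simulate_log_ratios(array_offset=true_offset)
    print("true offset:          ", true_offset)
    print("median over all genes:", round(median_offset(ratios), 3))
    print("median over non-DE:   ", round(oracle_offset(ratios, labels), 3))
```

Under these assumed settings, the over-expressed genes pull the all-gene median above the true array offset, while the estimate restricted to non-differentially expressed genes stays close to it. Setting `de_fraction` near zero makes the two estimates nearly coincide, consistent with the abstract's remark about very small prevalence.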

Suggested Citation

  • Qin Li-Xuan & Satagopan Jaya M, 2009. "Normalization Method for Transcriptional Studies of Heterogeneous Samples - Simultaneous Array Normalization and Identification of Equivalent Expression," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-25, February.
  • Handle: RePEc:bpj:sagmbi:v:8:y:2009:i:1:n:10
    DOI: 10.2202/1544-6115.1339

    Download full text from publisher

    File URL: https://doi.org/10.2202/1544-6115.1339
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.


    References listed on IDEAS

    1. Reilly C. & Wang C. & Rutherford M., 2003. "A Method for Normalizing Microarrays Using Genes That Are Not Differentially Expressed," Journal of the American Statistical Association, American Statistical Association, vol. 98, pages 868-878, January.
    2. Danh V. Nguyen & A. Bulak Arpat & Naisyin Wang & Raymond J. Carroll, 2002. "DNA Microarray Experiments: Biological and Technological Aspects," Biometrics, The International Biometric Society, vol. 58(4), pages 701-717, December.
    3. Purdom Elizabeth & Holmes Susan P, 2005. "Error Distribution for Gene Expression Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 4(1), pages 1-35, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Punathumparambath, Bindu & Kulathinal, Sangita & George, Sebastian, 2012. "Asymmetric type II compound Laplace distribution and its application to microarray gene expression," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1396-1404.
    2. Parrish, Rudolph S. & Spencer III, Horace J. & Xu, Ping, 2009. "Distribution modeling and simulation of gene expression data," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1650-1660, March.
    3. Li-Xuan Qin & Steven G. Self, 2006. "The Clustering of Regression Models Method with Applications in Gene Expression Data," Biometrics, The International Biometric Society, vol. 62(2), pages 526-533, June.
    4. Huixia Judy Wang & Leonard A. Stefanski & Zhongyi Zhu, 2012. "Corrected-loss estimation for quantile regression with covariate measurement errors," Biometrika, Biometrika Trust, vol. 99(2), pages 405-421.
    5. Tu, Shiyi & Wang, Min & Sun, Xiaoqian, 2016. "Bayesian analysis of two-piece location–scale models under reference priors with partial information," Computational Statistics & Data Analysis, Elsevier, vol. 96(C), pages 133-144.
    6. He, Yi & Pan, Wei & Lin, Jizhen, 2006. "Cluster analysis using multivariate normal mixture models to detect differential gene expression with microarray data," Computational Statistics & Data Analysis, Elsevier, vol. 51(2), pages 641-658, November.
    7. Raymond J. Carroll, 2003. "Variances Are Not Always Nuisance Parameters," Biometrics, The International Biometric Society, vol. 59(2), pages 211-220, June.
    8. Olcay Arslan, 2010. "An alternative multivariate skew Laplace distribution: properties and estimation," Statistical Papers, Springer, vol. 51(4), pages 865-887, December.
    9. Robert R. Delongchamp & John F. Bowyer & James J. Chen & Ralph L. Kodell, 2004. "Multiple-Testing Strategy for Analyzing cDNA Array Data on Gene Expression," Biometrics, The International Biometric Society, vol. 60(3), pages 774-782, September.
    10. Kaul, Abhishek & Koul, Hira L., 2015. "Weighted ℓ1-penalized corrected quantile regression for high dimensional measurement error models," Journal of Multivariate Analysis, Elsevier, vol. 140(C), pages 72-91.
    11. Salas-Gonzalez, Diego & Kuruoglu, Ercan E. & Ruiz, Diego P., 2009. "A heavy-tailed empirical Bayes method for replicated microarray data," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1535-1546, March.
    12. Mehmet Niyazi Çankaya & Abdullah Yalçınkaya & Ömer Altındaǧ & Olcay Arslan, 2019. "On the robustness of an epsilon skew extension for Burr III distribution on the real line," Computational Statistics, Springer, vol. 34(3), pages 1247-1273, September.
    13. Chenguang Wang & Ao Yuan & Leslie Cope & Jing Qin, 2022. "A semiparametric isotonic regression model for skewed distributions with application to DNA–RNA–protein analysis," Biometrics, The International Biometric Society, vol. 78(4), pages 1464-1474, December.
    14. Krutto Annika & Haugdahl Nøst Therese & Thoresen Magne, 2024. "A heavy-tailed model for analyzing miRNA-seq raw read counts," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 23(1), pages 1-30.
    15. repec:bla:biomet:v:62:y:2006:i:1:p:10-18:1 is not listed on IDEAS
    16. Tomy, Lishamol & Jose, K.K., 2009. "Generalized normal-Laplace AR process," Statistics & Probability Letters, Elsevier, vol. 79(14), pages 1615-1620, July.
    17. Hamid Jemila S & Beyene Joseph, 2009. "A Multivariate Growth Curve Model for Ranking Genes in Replicated Time Course Microarray Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-28, July.
    18. Arslan, Olcay, 2009. "Maximum likelihood parameter estimation for the multivariate skew-slash distribution," Statistics & Probability Letters, Elsevier, vol. 79(20), pages 2158-2165, October.
    19. Tsai, Arthur C. & Liou, Michelle & Simak, Maria & Cheng, Philip E., 2017. "On hyperbolic transformations to normality," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 250-266.
    20. Jose, K.K. & Tomy, Lishamol & Sreekumar, J., 2008. "Autoregressive processes with normal-Laplace marginals," Statistics & Probability Letters, Elsevier, vol. 78(15), pages 2456-2462, October.
    21. Kelmansky Diana M. & Martínez Elena J. & Leiva Víctor, 2013. "A new variance stabilizing transformation for gene expression data analysis," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 12(6), pages 653-666, December.
