Printed from https://ideas.repec.org/a/bpj/sagmbi/v11y2012i4n5.html

A Novel and Fast Normalization Method for High-Density Arrays

Author

Listed:
  • van Iterson Maarten

    (Center for Human and Clinical Genetics, Leiden University Medical Center)

  • Duijkers Floor A.M.

    (Department of Pediatric Oncology/Hematology, Erasmus University Medical Center-Sophia Children's Hospital)

  • Meijerink Jules P.P.

    (Department of Pediatric Oncology/Hematology, Erasmus University Medical Center-Sophia Children's Hospital)

  • Admiraal Pieter

    (Department of Pediatric Oncology/Hematology, Erasmus University Medical Center-Sophia Children's Hospital)

  • van Ommen Gert-Jan B.

    (Center for Human and Clinical Genetics, Leiden University Medical Center)

  • Boer Judith M.
  • van Noesel Max M.
  • Menezes Renee X.

Abstract

Background: Among the most commonly applied microarray normalization methods are intensity-dependent methods such as the lowess and loess algorithms. Their computational complexity makes them slow and thus less suitable for normalizing large datasets. Current implementations try to circumvent this problem by using a random subset of the data for normalization, but the impact of this modification has not previously been assessed. We developed a novel intensity-dependent normalization method for microarrays that is fast and simple and that can include weighting of observations.

Results: Our normalization method is based on the P-spline scatterplot smoother and uses all data points for normalization. We show that using a random subset of the data for normalization should be avoided, as it can produce unstable results. However, in certain cases normalization based on an invariant subset is desirable, for example when groups of samples before and after an intervention are compared. We show in the context of DNA methylation arrays that a constant weighted P-spline normalization yields a more reliable normalization curve than the one obtained by normalizing on the invariant subset only.

Conclusions: Our novel intensity-dependent normalization method is simpler and faster than current loess algorithms and, like loess-based normalization, can be applied to one- and two-colour array data.

Availability: An implementation of the method is currently available as an R package called TurboNorm from www.bioconductor.org.
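To make the idea in the abstract concrete, the following is a minimal Python sketch of intensity-dependent normalization with a weighted P-spline smoother. It is not the TurboNorm implementation. The function names (`bspline_basis`, `pspline_fit`, `normalize_ma`), the M-versus-A formulation for two-colour data, the default number of segments, the penalty order and the smoothing parameter `lam` are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg=20, deg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion)."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    x = np.asarray(x, dtype=float)[:, None]
    # Degree-0 basis: indicator of the knot interval containing each x.
    B = ((x >= knots[None, :-1]) & (x < knots[None, 1:])).astype(float)
    for d in range(1, deg + 1):
        left = (x - knots[None, :-d - 1]) / (d * dx) * B[:, :-1]
        right = (knots[None, d + 1:] - x) / (d * dx) * B[:, 1:]
        B = left + right
    return B  # shape (n, nseg + deg)

def pspline_fit(x, y, weights=None, nseg=20, deg=3, lam=10.0, order=2):
    """Fitted values of a weighted P-spline smooth of y on x (all points used)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if weights is None else np.asarray(weights, dtype=float)
    pad = 1e-6 * (x.max() - x.min())       # keep every x strictly inside the knot range
    B = bspline_basis(x, x.min() - pad, x.max() + pad, nseg, deg)
    D = np.diff(np.eye(B.shape[1]), n=order, axis=0)  # difference penalty matrix
    BtW = B.T * w                           # B' W with W = diag(w)
    coef = np.linalg.solve(BtW @ B + lam * D.T @ D, BtW @ y)
    return B @ coef

def normalize_ma(red, green, weights=None, **kwargs):
    """Subtract the smoothed intensity-dependent trend from the log-ratios.

    Assumed formulation: M = log2(R/G), A = mean log2 intensity.
    """
    M = np.log2(red) - np.log2(green)
    A = 0.5 * (np.log2(red) + np.log2(green))
    return M - pspline_fit(A, M, weights, **kwargs), A
```

Under these assumptions, a constant weighted fit of the kind the abstract describes could be approximated by passing a `weights` array that gives the invariant probes a larger constant weight than the rest, rather than fitting on the invariant subset alone. Because the fit reduces to one small linear system whose size depends on the number of basis functions and not on the number of probes, it stays cheap even when every data point is used.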

Suggested Citation

  • van Iterson Maarten & Duijkers Floor A.M. & Meijerink Jules P.P. & Admiraal Pieter & van Ommen Gert-Jan B. & Boer Judith M. & van Noesel Max M. & Menezes Renee X., 2012. "A Novel and Fast Normalization Method for High-Density Arrays," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(4), pages 1-31, July.
  • Handle: RePEc:bpj:sagmbi:v:11:y:2012:i:4:n:5
    DOI: 10.1515/1544-6115.1753

    Download full text from publisher

    File URL: https://doi.org/10.1515/1544-6115.1753
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.



