Printed from https://ideas.repec.org/a/eee/csdana/v55y2011i4p1540-1551.html

Comparing penalized splines and fractional polynomials for flexible modelling of the effects of continuous predictor variables

Author

Listed:
  • Strasak, Alexander M.
  • Umlauf, Nikolaus
  • Pfeiffer, Ruth M.
  • Lang, Stefan

Abstract

P(enalized)-splines and fractional polynomials (FPs) have emerged as powerful smoothing techniques with increasing popularity in applied research. Both approaches provide considerable flexibility, but only limited comparative evaluations of the performance and properties of the two methods have been conducted to date. Extensive simulations are performed to compare FPs of degree 2 (FP2) and degree 4 (FP4) with two variants of P-splines that use generalized cross-validation (GCV) and restricted maximum likelihood (REML), respectively, for smoothing parameter selection. The simulations evaluate the ability of P-splines and FPs to recover the "true" functional form of the association between continuous, binary and survival outcomes and exposure for linear, quadratic and more complex non-linear functions, using different sample sizes and signal-to-noise ratios. For more curved functions, FP2, the current default setting in implementations for fitting FPs in R, STATA and SAS, showed considerable bias and consistently higher mean squared error (MSE) than the spline-based estimators and FP4, which performed equally well in most simulation settings. FPs, however, are prone to artefacts due to the specific choice of the origin, while P-splines based on GCV sometimes yield wiggly estimates, in particular for small sample sizes. An application to a real dataset illustrates the different features of the two approaches.
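To make the two estimators in the abstract concrete, the following is a minimal sketch for a continuous (Gaussian) outcome only. It is not the authors' simulation code and does not reproduce the R, STATA or SAS implementations they mention. The function names, the grid of smoothing parameters, the number of knot segments and the second-order difference penalty are assumptions chosen for illustration. The FP2 part searches the conventional power set {-2, -1, -0.5, 0, 0.5, 1, 2, 3} by least squares; the P-spline part chooses the smoothing parameter by GCV on a grid. REML-based selection and FP4 are not shown.

```python
import itertools
import numpy as np

# Conventional FP power set; power 0 denotes log(x). Requires x > 0.
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp_term(x, p):
    """Single FP transform: x**p, with p == 0 meaning log(x)."""
    return np.log(x) if p == 0 else x ** float(p)

def fit_fp2(x, y):
    """Least-squares FP2 fit: try every power pair, keep the smallest RSS."""
    best = None
    for p1, p2 in itertools.combinations_with_replacement(FP_POWERS, 2):
        t1 = fp_term(x, p1)
        # Repeated power: the second term is multiplied by log(x).
        t2 = fp_term(x, p2) * np.log(x) if p1 == p2 else fp_term(x, p2)
        X = np.column_stack([np.ones_like(x), t1, t2])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        fitted = X @ beta
        rss = float(np.sum((y - fitted) ** 2))
        if best is None or rss < best["rss"]:
            best = {"powers": (p1, p2), "beta": beta, "fitted": fitted, "rss": rss}
    return best

def bspline_basis(x, n_segments=20, degree=3):
    """Equally spaced B-spline basis on [min(x), max(x)] (Cox-de Boor recursion)."""
    lo, hi = x.min(), x.max()
    h = (hi - lo) / n_segments
    knots = lo + h * np.arange(-degree, n_segments + degree + 1)
    # Degree 0: indicator of each knot interval.
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x[:, None] - knots[None, : -d - 1]) / (d * h)
        right = (knots[None, d + 1 :] - x[:, None]) / (d * h)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B

def fit_pspline_gcv(x, y, lambdas=np.logspace(-4, 4, 41), diff_order=2):
    """P-spline fit with a difference penalty; lambda chosen by GCV on a grid."""
    B = bspline_basis(x)
    D = np.diff(np.eye(B.shape[1]), n=diff_order, axis=0)
    BtB, Bty, P = B.T @ B, B.T @ y, D.T @ D
    n = len(y)
    best = None
    for lam in lambdas:
        A = BtB + lam * P
        beta = np.linalg.solve(A, Bty)
        fitted = B @ beta
        edf = float(np.trace(np.linalg.solve(A, BtB)))  # trace of the hat matrix
        rss = float(np.sum((y - fitted) ** 2))
        gcv = n * rss / (n - edf) ** 2
        if best is None or gcv < best["gcv"]:
            best = {"lam": float(lam), "edf": edf, "gcv": gcv, "fitted": fitted}
    return best
```

Calling `fit_fp2(x, y)` and `fit_pspline_gcv(x, y)` on the same simulated data and comparing each `"fitted"` vector with the known true function gives the kind of MSE comparison the abstract describes, under the simplifications stated above.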

Suggested Citation

  • Strasak, Alexander M. & Umlauf, Nikolaus & Pfeiffer, Ruth M. & Lang, Stefan, 2011. "Comparing penalized splines and fractional polynomials for flexible modelling of the effects of continuous predictor variables," Computational Statistics & Data Analysis, Elsevier, vol. 55(4), pages 1540-1551, April.
  • Handle: RePEc:eee:csdana:v:55:y:2011:i:4:p:1540-1551

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(10)00407-X
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Ruppert,David & Wand,M. P. & Carroll,R. J., 2003. "Semiparametric Regression," Cambridge Books, Cambridge University Press, number 9780521785167, October.
    2. Patrick Royston & Gareth Ambler, 1999. "Multivariable fractional polynomials," Stata Technical Bulletin, StataCorp LP, vol. 8(43).
    3. Sauerbrei, W. & Meier-Hirmer, C. & Benner, A. & Royston, P., 2006. "Multivariable regression model building by using fractional polynomials: Description of SAS, STATA and R programs," Computational Statistics & Data Analysis, Elsevier, vol. 50(12), pages 3464-3485, August.
    4. Brezger, Andreas & Kneib, Thomas & Lang, Stefan, 2005. "BayesX: Analyzing Bayesian Structural Additive Regression Models," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 14(i11).
    5. Simon N. Wood, 2003. "Thin plate regression splines," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 65(1), pages 95-114, February.
    6. Marx, Brian D. & Eilers, Paul H. C., 1998. "Direct generalized additive modeling with penalized likelihood," Computational Statistics & Data Analysis, Elsevier, vol. 28(2), pages 193-209, August.
    7. Jullion, Astrid & Lambert, Philippe, 2007. "Robust specification of the roughness penalty prior distribution in spatially adaptive Bayesian P-splines models," Computational Statistics & Data Analysis, Elsevier, vol. 51(5), pages 2542-2558, February.
    8. S. N. Wood, 2000. "Modelling and smoothing parameter estimation with multiple quadratic penalties," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 62(2), pages 413-428.
    9. Simon N. Wood, 2008. "Fast stable direct fitting and smoothness selection for generalized additive models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(3), pages 495-518, July.
    10. Patrick Royston, 2005. "Multivariable regression models with continuous covariates, with a practical emphasis on fractional polynomials and applications in clinical epidemiology," German Stata Users' Group Meetings 2005 01, Stata Users Group.
    11. Göran Kauermann & Tatyana Krivobokova & Ludwig Fahrmeir, 2009. "Some asymptotic results on generalized penalized spline smoothing," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(2), pages 487-503, April.
    12. Simon N. Wood, 2004. "Stable and Efficient Multiple Smoothing Parameter Estimation for Generalized Additive Models," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 673-686, January.
    13. Belitz, Christiane & Lang, Stefan, 2008. "Simultaneous selection of variables and smoothing parameters in structured additive regression models," Computational Statistics & Data Analysis, Elsevier, vol. 53(1), pages 61-81, September.
    14. Brezger, Andreas & Lang, Stefan, 2006. "Generalized structured additive regression based on Bayesian P-splines," Computational Statistics & Data Analysis, Elsevier, vol. 50(4), pages 967-991, February.
    15. Ruppert,David & Wand,M. P. & Carroll,R. J., 2003. "Semiparametric Regression," Cambridge Books, Cambridge University Press, number 9780521780506, October.
    16. W. Sauerbrei & P. Royston, 1999. "Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 162(1), pages 71-94.
    17. M. P. Wand, 2003. "Smoothing and mixed models," Computational Statistics, Springer, vol. 18(2), pages 223-249, July.

    Citations

    Cited by:

    1. Lee, Wang-Sheng, 2014. "Big and Tall: Is there a Height Premium or Obesity Penalty in the Labor Market?," IZA Discussion Papers 8606, Institute of Labor Economics (IZA).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nadja Klein & Michel Denuit & Stefan Lang & Thomas Kneib, 2013. "Nonlife Ratemaking and Risk Management with Bayesian Additive Models for Location, Scale and Shape," Working Papers 2013-24, Faculty of Economics and Statistics, Universität Innsbruck.
    2. Klein, Nadja & Denuit, Michel & Lang, Stefan & Kneib, Thomas, 2013. "Nonlife Ratemaking and Risk Management with Bayesian Additive Models for Location, Scale and Shape," LIDAM Discussion Papers ISBA 2013045, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    3. Belitz, Christiane & Lang, Stefan, 2008. "Simultaneous selection of variables and smoothing parameters in structured additive regression models," Computational Statistics & Data Analysis, Elsevier, vol. 53(1), pages 61-81, September.
    4. Simon N. Wood, 2011. "Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 73(1), pages 3-36, January.
    5. Longhi, Christian & Musolesi, Antonio & Baumont, Catherine, 2014. "Modeling structural change in the European metropolitan areas during the process of economic integration," Economic Modelling, Elsevier, vol. 37(C), pages 395-407.
    6. Musolesi Antonio & Mazzanti Massimiliano, 2014. "Nonlinearity, heterogeneity and unobserved effects in the carbon dioxide emissions-economic development relation for advanced countries," Studies in Nonlinear Dynamics & Econometrics, De Gruyter, vol. 18(5), pages 521-541, December.
    7. Takuma Yoshida, 2016. "Asymptotics and smoothing parameter selection for penalized spline regression with various loss functions," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 70(4), pages 278-303, November.
    8. Mazzanti, Massimiliano & Musolesi, Antonio, 2013. "Nonlinearity, Heterogeneity and Unobserved Effects in the CO2-income Relation for Advanced Countries," Climate Change and Sustainable Development 162374, Fondazione Eni Enrico Mattei (FEEM).
    9. Nadja Klein & Thomas Kneib & Stefan Lang, 2015. "Bayesian Generalized Additive Models for Location, Scale, and Shape for Zero-Inflated and Overdispersed Count Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(509), pages 405-419, March.
    10. Stefan Lang & Nikolaus Umlauf & Peter Wechselberger & Kenneth Harttgen & Thomas Kneib, 2012. "Multilevel structured additive regression," Working Papers 2012-07, Faculty of Economics and Statistics, Universität Innsbruck.
    11. Sylvie Charlot & Riccardo Crescenzi & Antonio Musolesi, 2014. "Augmented and Unconstrained: revisiting the Regional Knowledge Production Function," SEEDS Working Papers 2414, SEEDS, Sustainability Environmental Economics and Dynamics Studies, revised Aug 2014.
    12. Lee, Wang-Sheng, 2014. "Big and Tall: Is there a Height Premium or Obesity Penalty in the Labor Market?," IZA Discussion Papers 8606, Institute of Labor Economics (IZA).
    13. Rodríguez-Álvarez, María Xosé & Lee, Dae-Jin & Kneib, Thomas & Durbán, María & Eilers, Paul, 2013. "Fast algorithm for smoothing parameter selection in multidimensional generalized P-splines," DES - Working Papers. Statistics and Econometrics. WS ws133026, Universidad Carlos III de Madrid. Departamento de Estadística.
    14. Klein, Nadja & Denuit, Michel & Lang, Stefan & Kneib, Thomas, 2014. "Nonlife ratemaking and risk management with Bayesian generalized additive models for location, scale, and shape," Insurance: Mathematics and Economics, Elsevier, vol. 55(C), pages 225-249.
    15. Brezger, Andreas & Lang, Stefan, 2006. "Generalized structured additive regression based on Bayesian P-splines," Computational Statistics & Data Analysis, Elsevier, vol. 50(4), pages 967-991, February.
    16. Simon N. Wood & Natalya Pya & Benjamin Säfken, 2016. "Smoothing Parameter and Model Selection for General Smooth Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1548-1563, October.
    17. Morteza Amini & Mahdi Roozbeh & Nur Anisah Mohamed, 2024. "Separation of the Linear and Nonlinear Covariates in the Sparse Semi-Parametric Regression Model in the Presence of Outliers," Mathematics, MDPI, vol. 12(2), pages 1-17, January.
    18. Mestekemper, Thomas & Kauermann, Göran & Smith, Michael S., 2013. "A comparison of periodic autoregressive and dynamic factor models in intraday energy demand forecasting," International Journal of Forecasting, Elsevier, vol. 29(1), pages 1-12.
    19. Philip T. Reiss & R. Todd Ogden, 2010. "Functional Generalized Linear Models with Images as Predictors," Biometrics, The International Biometric Society, vol. 66(1), pages 61-69, March.
    20. Roca-Pardinas, Javier & Cadarso-Suarez, Carmen & Tahoces, Pablo G. & Lado, Maria J., 2008. "Assessing continuous bivariate effects among different groups through nonparametric regression models: An application to breast cancer detection," Computational Statistics & Data Analysis, Elsevier, vol. 52(4), pages 1958-1970, January.
