
Out-of-Sample Prediction in Multidimensional P-Spline Models

Author

Listed:
  • Alba Carballo

    (Department of Statistics, Universidad Carlos III de Madrid, 28903 Getafe, Spain
    These authors contributed equally to this work.)

  • María Durbán

    (Department of Statistics, Universidad Carlos III de Madrid, 28903 Getafe, Spain
    These authors contributed equally to this work.)

  • Dae-Jin Lee

    (BCAM—Basque Center for Applied Mathematics, 48009 Bilbao, Spain
    These authors contributed equally to this work.)

Abstract

The prediction of out-of-sample values is an interesting problem in any regression model. In the context of penalized smoothing using a mixed-model reparameterization, a general framework has been proposed for predicting in additive models, but without interaction terms. The aim of this paper is to generalize that work to the multidimensional case, i.e., to models that include interaction terms. Our method fits the data and predicts new observations at the same time, and uses constraints to ensure a consistent fit or to impose further restrictions on the predictions. We also develop this methodology for the so-called smooth-ANOVA models, which allow us to include interaction terms that can be decomposed as a sum of several smooth functions. To illustrate the method, two real data sets are used: one to predict the mortality of the U.S. population on a logarithmic scale, and the other to predict the aboveground biomass of Populus trees as a smooth function of height and diameter. We examine the performance of the interaction and smooth-ANOVA models through simulation studies.
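
The idea of fitting and predicting at the same time can be illustrated with a minimal one-dimensional sketch. It is not the authors' multidimensional or mixed-model formulation: it assumes a plain penalized least-squares P-spline with a cubic B-spline basis on equally spaced knots, a second-order difference penalty, and zero weights on the out-of-sample points. The data, the knot grid, the smoothing parameter `lam`, and the helper `bspline_basis` are all hypothetical choices made for illustration.

```python
import numpy as np

def bspline_basis(x, knots, degree=3):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion).

    Assumes strictly increasing knots, so no denominator is zero.
    """
    x = np.asarray(x, dtype=float)
    n_int = len(knots) - 1
    # Degree 0: indicator of each half-open knot interval.
    B = np.zeros((x.size, n_int))
    for j in range(n_int):
        B[:, j] = (knots[j] <= x) & (x < knots[j + 1])
    # Raise the degree one step at a time.
    for d in range(1, degree + 1):
        B_new = np.zeros((x.size, n_int - d))
        for j in range(n_int - d):
            left = (x - knots[j]) / (knots[j + d] - knots[j])
            right = (knots[j + d + 1] - x) / (knots[j + d + 1] - knots[j + 1])
            B_new[:, j] = left * B[:, j] + right * B[:, j + 1]
        B = B_new
    return B

# Hypothetical data: observations on [0, 1), predictions wanted on [1, 1.5].
rng = np.random.default_rng(0)
x_obs = np.linspace(0.0, 1.0, 60, endpoint=False)
y_obs = np.sin(2 * np.pi * x_obs) + rng.normal(scale=0.1, size=x_obs.size)
x_new = np.linspace(1.0, 1.5, 20)

# One basis covering both the observed and the out-of-sample range.
knots = np.arange(-3, 19) * 0.1            # equally spaced, -0.3 to 1.8
x_all = np.concatenate([x_obs, x_new])
B = bspline_basis(x_all, knots)            # 80 rows, 18 basis functions

# Weight 1 for observed rows, 0 for rows to be predicted; the response
# values attached to zero-weight rows are placeholders and are ignored.
w = np.concatenate([np.ones(x_obs.size), np.zeros(x_new.size)])
y_all = np.concatenate([y_obs, np.zeros(x_new.size)])

# Second-order difference penalty on all coefficients.
D = np.diff(np.eye(B.shape[1]), n=2, axis=0)
lam = 1.0                                  # assumed smoothing parameter

# Penalized weighted least squares: (B'WB + lam D'D) theta = B'W y.
lhs = B.T @ (w[:, None] * B) + lam * D.T @ D
rhs = B.T @ (w * y_all)
theta = np.linalg.solve(lhs, rhs)

fitted = B[: x_obs.size] @ theta           # in-sample fit
predicted = B[x_obs.size :] @ theta        # out-of-sample prediction
```

In this sketch the coefficients of basis functions that do not overlap the observed range are determined by the penalty alone, so with a second-order penalty they continue the last fitted coefficients linearly. The paper's constraints and interaction terms are not represented here.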

Suggested Citation

  • Alba Carballo & María Durbán & Dae-Jin Lee, 2021. "Out-of-Sample Prediction in Multidimensional P-Spline Models," Mathematics, MDPI, vol. 9(15), pages 1-23, July.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:15:p:1761-:d:601590

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/15/1761/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/15/1761/
    Download Restriction: no


