IDEAS home Printed from https://ideas.repec.org/a/eee/csdana/v122y2018icp135-155.html

Goodness-of-fit test for nonparametric regression models: Smoothing spline ANOVA models as example

Author

Listed:
  • Teran Hidalgo, Sebastian J.
  • Wu, Michael C.
  • Engel, Stephanie M.
  • Kosorok, Michael R.

Abstract

Nonparametric regression models do not require specifying the functional form of the relationship between the outcome and the covariates. Despite their popularity, far fewer diagnostic statistics are available for them than for their parametric counterparts. We propose a goodness-of-fit test for nonparametric regression models that have linear smoother form. In particular, we apply this testing framework to smoothing spline ANOVA models. The test can address two sources of lack of fit: whether covariates not currently in the model need to be included, and whether the current model fits the data well. The proposed method first derives estimated residuals from the fitted model. Statistical dependence between the estimated residuals and the covariates is then assessed using the Hilbert-Schmidt independence criterion (HSIC). If dependence exists, the model does not capture all the variability in the outcome associated with the covariates; otherwise, the model fits the data well. The bootstrap is used to obtain p-values. Application of the method is demonstrated with an analysis of neonatal mental development data. We demonstrate correct type I error and assess power through simulations.
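
The procedure described in the abstract (fit a linear smoother, take residuals, measure their dependence on the covariates with the HSIC, and calibrate by bootstrap) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: a Gaussian kernel ridge smoother stands in for a smoothing spline ANOVA fit, the HSIC uses Gaussian kernels with a median-heuristic bandwidth, and the bootstrap is a simple residual bootstrap, which may differ from the scheme used in the paper. All function names and parameters (`gaussian_gram`, `hsic`, `smoother_matrix`, `gof_test`, `ridge`, `n_boot`) are hypothetical.

```python
import numpy as np


def gaussian_gram(Z, bandwidth=None):
    """Gaussian kernel Gram matrix; bandwidth defaults to a median heuristic."""
    Z = np.asarray(Z, dtype=float)
    if Z.ndim == 1:
        Z = Z[:, None]
    sq = np.sum(Z**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    if bandwidth is None:
        positive = d2[d2 > 0]
        med = np.median(positive) if positive.size else 1.0
        bandwidth = np.sqrt(0.5 * med)
    return np.exp(-d2 / (2.0 * bandwidth**2))


def hsic(a, b):
    """Biased empirical HSIC estimate: trace(K H L H) / n^2."""
    K, L = gaussian_gram(a), gaussian_gram(b)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / n**2


def smoother_matrix(X, ridge=1e-2):
    """Hat matrix S of a kernel ridge smoother, so fitted values are S @ y.

    Assumption: this stands in for any linear smoother (e.g. a smoothing
    spline ANOVA fit with fixed smoothing parameters).
    """
    K = gaussian_gram(X)
    n = K.shape[0]
    # K and (K + n*ridge*I) commute, so this equals K @ inv(K + n*ridge*I).
    return np.linalg.solve(K + n * ridge * np.eye(n), K)


def gof_test(X_model, X_test, y, n_boot=200, ridge=1e-2, seed=0):
    """HSIC between residuals and X_test, with a residual-bootstrap p-value.

    X_model: covariates in the fitted model.
    X_test:  covariates checked for dependence with the residuals
             (may include covariates not in the model).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    S = smoother_matrix(X_model, ridge)
    fitted = S @ y
    resid = y - fitted
    stat = hsic(resid, X_test)

    # Assumed null scheme: residuals are resampled independently of the
    # covariates, added back to the fit, and the smoother is reapplied.
    centered = resid - resid.mean()
    boot = np.empty(n_boot)
    for b in range(n_boot):
        y_star = fitted + rng.choice(centered, size=len(y), replace=True)
        boot[b] = hsic(y_star - S @ y_star, X_test)

    p_value = (1 + np.sum(boot >= stat)) / (n_boot + 1)
    return stat, p_value
```

To check whether an omitted covariate should enter the model, one would pass only the modeled covariates as `X_model` and the full covariate set as `X_test`; a small p-value then suggests the residuals still carry signal associated with the covariates.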

Suggested Citation

  • Teran Hidalgo, Sebastian J. & Wu, Michael C. & Engel, Stephanie M. & Kosorok, Michael R., 2018. "Goodness-of-fit test for nonparametric regression models: Smoothing spline ANOVA models as example," Computational Statistics & Data Analysis, Elsevier, vol. 122(C), pages 135-155.
  • Handle: RePEc:eee:csdana:v:122:y:2018:i:c:p:135-155
    DOI: 10.1016/j.csda.2018.01.004

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947318300057
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Einmahl, John H.J. & Van Keilegom, Ingrid, 2008. "Specification tests in nonparametric regression," Journal of Econometrics, Elsevier, vol. 143(1), pages 88-102, March.
    2. Székely, Gábor J. & Rizzo, Maria L., 2013. "The distance correlation t-test of independence in high dimension," Journal of Multivariate Analysis, Elsevier, vol. 117(C), pages 193-213.
    3. Neumeyer, Natalie, 2009. "Testing independence in nonparametric regression," Journal of Multivariate Analysis, Elsevier, vol. 100(7), pages 1551-1566, August.
    4. Einmahl, J.H.J. & van Keilegom, I., 2006. "Tests for Independence in Nonparametric Regression," Discussion Paper 2006-80, Tilburg University, Center for Economic Research.
    5. Fan J. & Huang L-S., 2001. "Goodness-of-Fit Tests for Parametric Regression Models," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 640-652, June.
    6. A. Sen & B. Sen, 2014. "Testing independence and goodness-of-fit in linear models," Biometrika, Biometrika Trust, vol. 101(4), pages 927-942.
    7. Dawei Liu & Xihong Lin & Debashis Ghosh, 2007. "Semiparametric Regression of Multidimensional Genetic Pathway Data: Least-Squares Kernel Machines and Linear Mixed Models," Biometrics, The International Biometric Society, vol. 63(4), pages 1079-1088, December.

    Citations


    Cited by:

    1. Barrientos, Andrés F. & Canale, Antonio, 2021. "A Bayesian goodness-of-fit test for regression," Computational Statistics & Data Analysis, Elsevier, vol. 155(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sankar, Subhra & Bergsma, Wicher & Dassios, Angelos, 2017. "Testing independence of covariates and errors in nonparametric regression," LSE Research Online Documents on Economics 83780, London School of Economics and Political Science, LSE Library.
    2. Fan, Caiyun & Lu, Wenbin & Zhou, Yong, 2021. "Testing error heterogeneity in censored linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 161(C).
    3. Van Keilegom, Ingrid, 2013. "Discussion on: "An updated review of Goodness-of-Fit tests for regression models" (by W. Gonzales-Manteiga and R.M. Crujeiras)," LIDAM Discussion Papers ISBA 2013008, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    4. Florens, Jean-Pierre & Simar, Léopold & Van Keilegom, Ingrid, 2014. "Frontier estimation in nonparametric location-scale models," Journal of Econometrics, Elsevier, vol. 178(P3), pages 456-470.
    5. Simar, Léopold & Vanhems, Anne & Van Keilegom, Ingrid, 2016. "Unobserved heterogeneity and endogeneity in nonparametric frontier estimation," Journal of Econometrics, Elsevier, vol. 190(2), pages 360-373.
    6. Hlávka, Zdenek & Husková, Marie & Meintanis, Simos G., 2011. "Tests for independence in non-parametric heteroscedastic regression models," Journal of Multivariate Analysis, Elsevier, vol. 102(4), pages 816-827, April.
    7. Bodhisattva Sen & Mary Meyer, 2017. "Testing against a linear regression model using ideas from shape-restricted estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(2), pages 423-448, March.
    8. Escanciano, Juan Carlos & Jacho-Chávez, David T., 2012. "√n-uniformly consistent density estimation in nonparametric regression models," Journal of Econometrics, Elsevier, vol. 167(2), pages 305-316.
    9. Braekers, Roel & Van Keilegom, Ingrid, 2009. "Flexible modeling based on copulas in nonparametric median regression," Journal of Multivariate Analysis, Elsevier, vol. 100(6), pages 1270-1281, July.
    10. Neumeyer, Natalie & Noh, Hohsuk & Van Keilegom, Ingrid, 2014. "Heteroscedastic semiparametric transformation models: estimation and testing for validity," LIDAM Discussion Papers ISBA 2014047, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    11. Mastromarco, Camilla & Simar, Léopold, 2018. "Globalization and productivity: A robust nonparametric world frontier analysis," Economic Modelling, Elsevier, vol. 69(C), pages 134-149.
    12. Neumeyer, Natalie, 2009. "Testing independence in nonparametric regression," Journal of Multivariate Analysis, Elsevier, vol. 100(7), pages 1551-1566, August.
    13. Natalie Neumeyer & Ingrid Van Keilegom, 2009. "Change‐Point Tests for the Error Distribution in Non‐parametric Regression," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 36(3), pages 518-541, September.
    14. Jun Zhang & Zhenghui Feng & Peirong Xu, 2015. "Estimating the conditional single-index error distribution with a partial linear mean regression," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 24(1), pages 61-83, March.
    15. Y. Andriyana & I. Gijbels & A. Verhasselt, 2018. "Quantile regression in varying-coefficient models: non-crossing quantile curves and heteroscedasticity," Statistical Papers, Springer, vol. 59(4), pages 1589-1621, December.
    16. Guochang Wang & Wai Keung Li & Ke Zhu, 2018. "New HSIC-based tests for independence between two stationary multivariate time series," Papers 1804.09866, arXiv.org.
    17. Zaili Fang & Inyoung Kim & Jeesun Jung, 2018. "Semiparametric Kernel-Based Regression for Evaluating Interaction Between Pathway Effect and Covariate," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(1), pages 129-152, March.
    18. Dette, Holger & Marchlewski, Mareen, 2007. "A test for the parametric form of the variance function in a partial linear regression model," Technical Reports 2007,26, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
    19. Mason David M. & Eubank Randy, 2012. "Moderate deviations and intermediate efficiency for lack-of-fit tests," Statistics & Risk Modeling, De Gruyter, vol. 29(2), pages 175-187, June.
    20. Rauf Ahmad, M., 2019. "A significance test of the RV coefficient in high dimensions," Computational Statistics & Data Analysis, Elsevier, vol. 131(C), pages 116-130.
