IDEAS home Printed from https://ideas.repec.org/p/sep/wpaper/3_231.html

GRID for model structure discovering in high dimensional regression

Author

  • Francesco Giordano

    (Dipartimento di Scienze Economiche e Statistiche, Università degli Studi di Salerno)

  • Soumendra Nath Lahiri

    (Statistics Department, NC State University)

  • Maria Lucia Parrella

    (Dipartimento di Scienze Economiche e Statistiche, Università degli Studi di Salerno)

Abstract

Given a nonparametric regression model, we assume that the number of covariates $d\rightarrow\infty$ but that only some of these covariates are relevant for the model. Our goal is to identify the relevant covariates and to obtain some information about the structure of the model. We propose a new nonparametric procedure, called GRID, with the following features: (a) it automatically identifies the relevant covariates of the regression model and distinguishes the nonlinear from the linear ones (a covariate is defined as linear or nonlinear depending on the marginal relation between the response variable and that covariate); (b) it automatically identifies the interactions between the covariates (mixed effect terms), without requiring any stepwise selection method. In particular, the procedure can identify mixed terms of any order (two-way, three-way, ...) without increasing the computational complexity of the algorithm; (c) it is completely data-driven, and so is easy to implement for the analysis of real datasets. In particular, it does not depend on the selection of crucial regularization parameters, nor does it require the estimation of the nuisance parameter $\sigma^2$ (self-scaling). The acronym GRID has a twofold meaning: first, it derives from Gradient Relevant Identification Derivatives, meaning that the procedure is based on testing the significance of a partial derivative estimator; second, it refers to a graphical tool that can help represent the identified structure of the regression model. The properties of the GRID procedure are investigated theoretically.
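
The idea of reading model structure off estimated partial derivatives can be illustrated with a minimal Python sketch. This is not the GRID procedure: the abstract does not give its test statistic, thresholds, or bandwidth rule, so the code below only estimates gradients by a local linear fit and summarizes them. The function names, the Gaussian kernel, the fixed bandwidth of 0.5, and the simulated data are all assumptions made for illustration. A covariate whose estimated derivative is near zero everywhere is a candidate irrelevant covariate; one whose derivative is nonzero but roughly constant behaves linearly; one whose derivative varies across evaluation points behaves nonlinearly.

```python
import numpy as np

def local_linear_gradient(X, y, x0, bandwidth):
    """Local linear estimate of the gradient of E[y | X = x] at x0.

    Illustrative helper (not from the paper): Gaussian product kernel,
    one common bandwidth for all covariates.
    """
    Z = X - x0                                    # centre covariates at x0
    w = np.exp(-0.5 * np.sum((Z / bandwidth) ** 2, axis=1))  # kernel weights
    sw = np.sqrt(w)[:, None]
    D = np.hstack([np.ones((len(X), 1)), Z])      # intercept + linear terms
    # Weighted least squares via ordinary least squares on sqrt-weighted data
    beta, *_ = np.linalg.lstsq(D * sw, y * sw[:, 0], rcond=None)
    return beta[1:]                               # slopes = partial derivatives

def screen_covariates(X, y, bandwidth=0.5, n_points=30, seed=0):
    """Summarize estimated gradients over randomly chosen evaluation points.

    Returns (mean absolute derivative, standard deviation of derivative)
    per covariate. No formal significance test is performed here.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n_points, replace=False)
    grads = np.array([local_linear_gradient(X, y, X[i], bandwidth) for i in idx])
    return np.abs(grads).mean(axis=0), grads.std(axis=0)

# Simulated example (assumed data): covariate 0 is linear, covariate 1 is
# nonlinear, covariates 2-4 are irrelevant.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(2000, 5))
y = 2 * X[:, 0] + 3 * X[:, 1] ** 2 + 0.1 * rng.standard_normal(2000)
mean_abs, spread = screen_covariates(X, y)
print(np.round(mean_abs, 2), np.round(spread, 2))
```

In this toy setting the first summary separates relevant from irrelevant covariates and the second separates the linear from the nonlinear one. Turning such summaries into a data-driven decision rule with theoretical guarantees is what the paper itself addresses.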

Suggested Citation

  • Francesco Giordano & Soumendra Nath Lahiri & Maria Lucia Parrella, 2014. "GRID for model structure discovering in high dimensional regression," Working Papers 3_231, Dipartimento di Scienze Economiche e Statistiche, Università degli Studi di Salerno.
  • Handle: RePEc:sep:wpaper:3_231

    Download full text from publisher

    File URL: http://www.dises.unisa.it/RePEc/sep/wpaper/3_231.pdf
    File Function: First version, 2014
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Giordano, Francesco & Parrella, Maria Lucia, 2016. "Bias-corrected inference for multivariate nonparametric regression: Model selection and oracle property," Journal of Multivariate Analysis, Elsevier, vol. 143(C), pages 71-93.
    2. Fabian Scheipl & Thomas Kneib & Ludwig Fahrmeir, 2013. "Penalized likelihood and Bayesian function selection in regression models," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 97(4), pages 349-385, October.
    3. Francesco Giordano & Maria Lucia Parrella, 2014. "Bias-corrected inference for multivariate nonparametric regression: model selection and oracle property," Working Papers 3_232, Dipartimento di Scienze Economiche e Statistiche, Università degli Studi di Salerno.
    4. Du, Pang & Cheng, Guang & Liang, Hua, 2012. "Semiparametric regression models with additive nonparametric components and high dimensional parametric components," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 2006-2017.
    5. Chang, Jinyuan & Chen, Song Xi & Chen, Xiaohong, 2015. "High dimensional generalized empirical likelihood for moment restrictions with dependent data," Journal of Econometrics, Elsevier, vol. 185(1), pages 283-304.
    6. El Ghouch, Anouar & Genton, Marc G. & Bouezmarni, Taoufik, 2012. "Measuring the Discrepancy of a Parametric Model via Local Polynomial Smoothing," LIDAM Discussion Papers ISBA 2012001, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    7. Jun Yan & Jian Huang, 2012. "Model Selection for Cox Models with Time-Varying Coefficients," Biometrics, The International Biometric Society, vol. 68(2), pages 419-428, June.
    8. Bonsoo Koo & Oliver Linton, 2010. "Semiparametric Estimation of Locally Stationary Diffusion Models," STICERD - Econometrics Paper Series 551, Suntory and Toyota International Centres for Economics and Related Disciplines, LSE.
    9. Lee, Yoonseok & Mukherjee, Debasri & Ullah, Aman, 2019. "Nonparametric estimation of the marginal effect in fixed-effect panel data models," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 53-67.
    10. Qi Li & Juan Lin & Jeffrey S. Racine, 2013. "Optimal Bandwidth Selection for Nonparametric Conditional Distribution and Quantile Functions," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 31(1), pages 57-65, January.
    11. Jesus Gonzalo & Jose Olmo, 2014. "Conditional Stochastic Dominance Tests In Dynamic Settings," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 55(3), pages 819-838, August.
    12. Whang, Yoon-Jae & Linton, Oliver, 1999. "The asymptotic distribution of nonparametric estimates of the Lyapunov exponent for stochastic time series," Journal of Econometrics, Elsevier, vol. 91(1), pages 1-42, July.
    13. Park, Byeong U. & Simar, Léopold & Zelenyuk, Valentin, 2017. "Nonparametric estimation of dynamic discrete choice models for time series data," Computational Statistics & Data Analysis, Elsevier, vol. 108(C), pages 97-120.
    14. Su, Liangjun & Lu, Xun, 2013. "Nonparametric dynamic panel data models: Kernel estimation and specification testing," Journal of Econometrics, Elsevier, vol. 176(2), pages 112-133.
    15. Portier, François & Segers, Johan, 2018. "On the weak convergence of the empirical conditional copula under a simplifying assumption," Journal of Multivariate Analysis, Elsevier, vol. 166(C), pages 160-181.
    16. Oliver Linton & Pedro Gozalo, 1996. "Conditional Independence Restrictions: Testing and Estimation," Cowles Foundation Discussion Papers 1140, Cowles Foundation for Research in Economics, Yale University.
    17. Chen, Bin & Song, Zhaogang, 2013. "Testing whether the underlying continuous-time process follows a diffusion: An infinitesimal operator-based approach," Journal of Econometrics, Elsevier, vol. 173(1), pages 83-107.
    18. Zhang, Jia & Shi, Haoming & Tian, Lemeng & Xiao, Fengjun, 2019. "Penalized generalized empirical likelihood in high-dimensional weakly dependent data," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 270-283.
    19. Peter Malec, 2016. "A Semiparametric Intraday GARCH Model," Cambridge Working Papers in Economics 1633, Faculty of Economics, University of Cambridge.
    20. Hoderlein, Stefan & Su, Liangjun & White, Halbert & Yang, Thomas Tao, 2016. "Testing for monotonicity in unobservables under unconfoundedness," Journal of Econometrics, Elsevier, vol. 193(1), pages 183-202.

    More about this item

    Keywords

    Variable selection; model selection; nonparametric model regression

    JEL classification:

    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
    • C88 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Other Computer Software
