
The nonparametric Box–Cox model for high-dimensional regression analysis

Author

Listed:
  • Zhou, He
  • Zou, Hui

Abstract

The mainstream theory for high-dimensional regression assumes that the underlying true model is a low-dimensional linear regression model. On the other hand, a standard technique in regression analysis, even in the traditional low-dimensional setting, is to employ the Box–Cox transformation to reduce anomalies such as non-additivity and heteroscedasticity in linear regression. In this paper, we propose a new high-dimensional regression method based on a nonparametric Box–Cox model with an unspecified monotone transformation function. Model fitting and computation are much more challenging than for the usual penalized regression methods, and a two-step method is proposed for estimating this model in high-dimensional settings. First, we propose a novel technique called composite probit regression (CPR) and use the folded concave penalized CPR to estimate the regression parameters. The strong oracle property of the estimator is established without knowledge of the nonparametric transformation function. Next, the nonparametric function is estimated by univariate monotone regression. The computation is done efficiently using a coordinate-majorization-descent algorithm. Extensive simulation studies show that the proposed method performs well in various settings. Our analysis of the supermarket data demonstrates the superior performance of the proposed method over the standard high-dimensional regression method.
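
The second step mentioned in the abstract is a univariate monotone regression. The abstract does not say which monotone estimator the authors use, so the following is only a minimal sketch of the general idea: a least-squares nondecreasing fit computed with the pool-adjacent-violators algorithm. The function names `pava` and `monotone_regression` are hypothetical and are not taken from the paper.

```python
import numpy as np


def pava(y, w=None):
    """Least-squares nondecreasing fit to the sequence y (pool adjacent violators).

    Illustrative only: this is a generic isotonic fit, not the paper's estimator.
    """
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(w1 * m1 + w2 * m2) / wt, wt, n1 + n2])
    # Expand every block back to one fitted value per observation.
    return np.concatenate([np.full(n, m) for m, _, n in blocks])


def monotone_regression(x, y):
    """Fit a nondecreasing function of x to y; returns sorted x and fitted values."""
    order = np.argsort(x, kind="stable")
    return np.asarray(x)[order], pava(np.asarray(y)[order])
```

For instance, `pava([1, 3, 2, 4])` pools the violating pair (3, 2) into its mean. In a two-step procedure of the kind the abstract describes, the input `x` would presumably be built from the first-step coefficient estimates, but that pairing is an assumption here, not something the abstract states.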

Suggested Citation

  • Zhou, He & Zou, Hui, 2024. "The nonparametric Box–Cox model for high-dimensional regression analysis," Journal of Econometrics, Elsevier, vol. 239(2).
  • Handle: RePEc:eee:econom:v:239:y:2024:i:2:s0304407623000568
    DOI: 10.1016/j.jeconom.2023.01.025


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lin, Huazhen & Peng, Heng, 2013. "Smoothed rank correlation of the linear transformation regression model," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 615-630.
    2. Youngki Shin & Zvezdomir Todorov, 2021. "Exact computation of maximum rank correlation estimator," The Econometrics Journal, Royal Economic Society, vol. 24(3), pages 589-607.
    3. Koen Jochmans, 2013. "Pairwise‐comparison estimation with non‐parametric controls," Econometrics Journal, Royal Economic Society, vol. 16(3), pages 340-372, October.
    4. Bijwaard Govert E. & Ridder Geert & Woutersen Tiemen, 2013. "A Simple GMM Estimator for the Semiparametric Mixed Proportional Hazard Model," Journal of Econometric Methods, De Gruyter, vol. 2(1), pages 1-23, July.
    5. Caiyun Fan & Wenbin Lu & Rui Song & Yong Zhou, 2017. "Concordance-assisted learning for estimating optimal individualized treatment regimes," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(5), pages 1565-1582, November.
    6. Chen, Songnian, 2010. "Root-N-consistent estimation of fixed-effect panel data transformation models with censoring," Journal of Econometrics, Elsevier, vol. 159(1), pages 222-234, November.
    7. Christoph Breunig & Stephan Martin, 2020. "Nonclassical Measurement Error in the Outcome Variable," Papers 2009.12665, arXiv.org, revised May 2021.
    8. Hausman, Jerry A. & Woutersen, Tiemen, 2014. "Estimating a semi-parametric duration model without specifying heterogeneity," Journal of Econometrics, Elsevier, vol. 178(P1), pages 114-131.
    9. Subbotin, Viktor, 2007. "Asymptotic and bootstrap properties of rank regressions," MPRA Paper 9030, University Library of Munich, Germany, revised 20 Mar 2008.
    10. Tan, Xin Lu, 2019. "Optimal estimation of slope vector in high-dimensional linear transformation models," Journal of Multivariate Analysis, Elsevier, vol. 169(C), pages 179-204.
    11. Yu, Tao & Li, Pengfei & Chen, Baojiang & Yuan, Ao & Qin, Jing, 2023. "Maximum pairwise-rank-likelihood-based inference for the semiparametric transformation model," Journal of Econometrics, Elsevier, vol. 235(2), pages 454-469.
    12. Khan, Shakeeb & Tamer, Elie, 2007. "Partial rank estimation of duration models with general forms of censoring," Journal of Econometrics, Elsevier, vol. 136(1), pages 251-280, January.
    13. Subbotin, Viktor, 2008. "Essays on the econometric theory of rank regressions," MPRA Paper 14086, University Library of Munich, Germany.
    14. Shakeeb Khan & Xiaoying Lan & Elie Tamer & Qingsong Yao, 2021. "Estimating High Dimensional Monotone Index Models by Iterative Convex Optimization," Papers 2110.04388, arXiv.org, revised Feb 2023.
    15. Lin, Xiefang & Fang, Fang, 2024. "Variable selection of Kolmogorov-Smirnov maximization with a penalized surrogate loss," Computational Statistics & Data Analysis, Elsevier, vol. 195(C).
    16. Xu, Wenchao & Zhang, Xinyu & Liang, Hua, 2024. "Linearized maximum rank correlation estimation when covariates are functional," Journal of Multivariate Analysis, Elsevier, vol. 202(C).
    17. Chen, Songnian & Zhang, Hanghui, 2020. "n-prediction of generalized heteroscedastic transformation regression models," Journal of Econometrics, Elsevier, vol. 215(2), pages 305-340.
    18. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    19. Zhu Wang, 2022. "MM for penalized estimation," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 54-75, March.
    20. Naimoli, Antonio, 2022. "Modelling the persistence of Covid-19 positivity rate in Italy," Socio-Economic Planning Sciences, Elsevier, vol. 82(PA).
