IDEAS home Printed from https://ideas.repec.org/a/bla/scjsta/v51y2024i3p1259-1287.html

Semiparametric efficient estimation in high‐dimensional partial linear regression models

Author

Listed:
  • Xinyu Fu
  • Mian Huang
  • Weixin Yao

Abstract

We introduce a novel semiparametric efficient estimation procedure for high‐dimensional partial linear regression models. It addresses the efficiency loss that the traditional least‐squares based estimation procedure suffers under unknown error distributions, while enjoying several appealing theoretical properties. The new procedure provides a sparse estimator for the parametric component and achieves the same semiparametric efficiency as the oracle maximum likelihood estimator, as if the error distribution were known. By combining penalized estimation with semiparametric efficiency theory for the ultra‐high‐dimensional partial linear model, the procedure enjoys the oracle variable selection property and offers an efficiency gain for non‐Gaussian random errors, while maintaining the same efficiency as the least squares‐based estimator for Gaussian random errors. Extensive simulation studies and an empirical application demonstrate the effectiveness of the proposed procedure.
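
For orientation, the setting named in the abstract can be written out as below. This is only a sketch in common textbook notation, not taken from the paper: the symbols Y_i, X_i, Z_i, β, g, ε_i, the error density f, and the generic sparsity penalty p_λ are all assumptions introduced here for illustration, and the paper's actual estimator (which must cope with f being unknown) is not reproduced.

```latex
% Assumed notation: Y_i is the response, X_i in R^p the high-dimensional
% covariates with sparse coefficient vector beta, Z_i the covariate that
% enters through an unknown smooth function g, eps_i an error with density f.
\[
  Y_i = X_i^{\top}\beta + g(Z_i) + \varepsilon_i, \qquad i = 1, \dots, n .
\]

% Penalized least-squares benchmark (generic penalty p_lambda, e.g. an
% L1 or folded-concave penalty; the specific choice is an assumption):
\[
  \min_{\beta,\, g}\;
  \frac{1}{2n} \sum_{i=1}^{n} \bigl( Y_i - X_i^{\top}\beta - g(Z_i) \bigr)^{2}
  + \sum_{j=1}^{p} p_{\lambda}\bigl( |\beta_j| \bigr) .
\]

% Oracle penalized likelihood, available only if the density f were known:
\[
  \max_{\beta,\, g}\;
  \frac{1}{n} \sum_{i=1}^{n} \log f\bigl( Y_i - X_i^{\top}\beta - g(Z_i) \bigr)
  - \sum_{j=1}^{p} p_{\lambda}\bigl( |\beta_j| \bigr) .
\]
```

Under this sketch, if f is the N(0, σ²) density then log f(e) = −e²/(2σ²) + const, so the oracle likelihood criterion reduces to the least-squares criterion up to a rescaling of the penalty. That is consistent with the abstract's statement that nothing is lost relative to least squares for Gaussian errors, while a non-quadratic log f leaves room for an efficiency gain.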

Suggested Citation

  • Xinyu Fu & Mian Huang & Weixin Yao, 2024. "Semiparametric efficient estimation in high‐dimensional partial linear regression models," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics; Finnish Statistical Society; Norwegian Statistical Association; Swedish Statistical Association, vol. 51(3), pages 1259-1287, September.
  • Handle: RePEc:bla:scjsta:v:51:y:2024:i:3:p:1259-1287
    DOI: 10.1111/sjos.12716

    Download full text from publisher

    File URL: https://doi.org/10.1111/sjos.12716
    Download Restriction: no


    References listed on IDEAS

    1. Jiahua Chen & Zehua Chen, 2008. "Extended Bayesian information criteria for model selection with large model spaces," Biometrika, Biometrika Trust, vol. 95(3), pages 759-771.
    2. Lan Wang & Bo Peng & Jelena Bradic & Runze Li & Yunan Wu, 2020. "A Tuning-free Robust and Efficient Approach to High-dimensional Regression," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 115(532), pages 1700-1714, December.
    3. Patric Müller & Sara van de Geer, 2015. "The Partial Linear Model in High Dimensions," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics; Finnish Statistical Society; Norwegian Statistical Association; Swedish Statistical Association, vol. 42(2), pages 580-608, June.
    4. Jianqing Fan & Runze Li, 2004. "New Estimation and Model Selection Procedures for Semiparametric Modeling in Longitudinal Data Analysis," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 710-723, January.
    5. Ao Yuan & Jan G. De Gooijer, 2007. "Semiparametric Regression with Kernel Error Model," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics; Finnish Statistical Society; Norwegian Statistical Association; Swedish Statistical Association, vol. 34(4), pages 841-869, December.
    6. Lan Wang & Bo Peng & Jelena Bradic & Runze Li & Yunan Wu, 2020. "Rejoinder to “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression”," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 115(532), pages 1726-1729, December.
    7. Jianqing Fan & Cong Ma & Kaizheng Wang, 2020. "Comment on “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression”," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 115(532), pages 1720-1725, December.
    8. Guang Cheng & Hao Zhang & Zuofeng Shang, 2015. "Sparse and efficient estimation for partial spline models with increasing dimension," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 67(1), pages 93-127, February.
    9. Hua Liang & Runze Li, 2009. "Variable Selection for Partially Linear Models With Measurement Errors," Journal of the American Statistical Association, American Statistical Association, vol. 104(485), pages 234-248.
    10. Anastasios A. Tsiatis & Yanyuan Ma, 2004. "Locally efficient semiparametric estimators for functional measurement error models," Biometrika, Biometrika Trust, vol. 91(4), pages 835-848, December.
    11. Xihong Lin & Raymond J. Carroll, 2006. "Semiparametric estimation in general repeated measures problems," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(1), pages 69-88, February.
    12. Hao Helen Zhang & Guang Cheng & Yufeng Liu, 2011. "Linear or Nonlinear? Automatic Structure Discovery for Partially Linear Models," Journal of the American Statistical Association, American Statistical Association, vol. 106(495), pages 1099-1112.
    13. Xiudi Li & Ali Shojaie, 2020. "Discussion of “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression”," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 115(532), pages 1717-1719, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yuyang Liu & Pengfei Pi & Shan Luo, 2023. "A semi-parametric approach to feature selection in high-dimensional linear regression models," Computational Statistics, Springer, vol. 38(2), pages 979-1000, June.
    2. Canhong Wen & Zhenduo Li & Ruipeng Dong & Yijin Ni & Wenliang Pan, 2023. "Simultaneous Dimension Reduction and Variable Selection for Multinomial Logistic Regression," INFORMS Journal on Computing, INFORMS, vol. 35(5), pages 1044-1060, September.
    3. Mingyang Ren & Sanguo Zhang & Junhui Wang, 2023. "Consistent estimation of the number of communities via regularized network embedding," Biometrics, The International Biometric Society, vol. 79(3), pages 2404-2416, September.
    4. Ke Yu & Shan Luo, 2024. "Rank-based sequential feature selection for high-dimensional accelerated failure time models with main and interaction effects," Computational Statistics & Data Analysis, Elsevier, vol. 197(C).
    5. Jack Jewson & David Rossell, 2022. "General Bayesian loss function selection and the use of improper models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(5), pages 1640-1665, November.
    6. Wei Dong & Chen Xu & Jinhan Xie & Niansheng Tang, 2024. "Tuning-free sparse clustering via alternating hard-thresholding," Journal of Multivariate Analysis, Elsevier, vol. 203(C).
    7. Jingyuan Liu & Lejia Lou & Runze Li, 2018. "Variable selection for partially linear models via partial correlation," Journal of Multivariate Analysis, Elsevier, vol. 167(C), pages 418-434.
    8. Shan Luo & Zehua Chen, 2014. "Sequential Lasso Cum EBIC for Feature Selection With Ultra-High Dimensional Feature Space," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(507), pages 1229-1240, September.
    9. Heng Lian & Pang Du & YuanZhang Li & Hua Liang, 2014. "Partially linear structure identification in generalized additive models with NP-dimensionality," Computational Statistics & Data Analysis, Elsevier, vol. 80(C), pages 197-208.
    10. Xinyi Li & Li Wang & Dan Nettleton, 2019. "Sparse model identification and learning for ultra-high-dimensional additive partially linear models," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 204-228.
    11. Jun Zhang & Zhenghui Feng & Heng Peng, 2018. "Estimation and hypothesis test for partial linear multiplicative models," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 87-103.
    12. Xiuli Wang & Shengli Zhao & Mingqiu Wang, 2017. "Restricted profile estimation for partially linear models with large-dimensional covariates," Statistics & Probability Letters, Elsevier, vol. 128(C), pages 71-76.
    13. Wenquan Cui & Haoyang Cheng & Jiajing Sun, 2018. "An RKHS-based approach to double-penalized regression in high-dimensional partially linear models," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 201-210.
    14. Ruiqi Liu & Ben Boukai & Zuofeng Shang, 2019. "Statistical Inference on Partially Linear Panel Model under Unobserved Linearity," Papers 1911.08830, arXiv.org.
    15. Huazhen Lin & Ling Zhou & Xiaohua Zhou, 2014. "Semiparametric Regression Analysis of Longitudinal Skewed Data," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics; Finnish Statistical Society; Norwegian Statistical Association; Swedish Statistical Association, vol. 41(4), pages 1031-1050, December.
    16. Jun Zhang & Zhenghui Feng & Peirong Xu & Hua Liang, 2017. "Generalized varying coefficient partially linear measurement errors models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(1), pages 97-120, February.
    17. Jia Chen & Degui Li & Hua Liang & Suojin Wang, 2014. "Semiparametric GEE Analysis in Partially Linear Single-Index Models for Longitudinal Data," Discussion Papers 14/26, Department of Economics, University of York.
    18. Yan-Yong Zhao & Yuchun Zhang & Yuan Liu & Noriszura Ismail, 2024. "Distributed debiased estimation of high-dimensional partially linear models with jumps," Computational Statistics & Data Analysis, Elsevier, vol. 191(C).
    19. Peixin Zhao & Liugen Xue, 2010. "Variable selection for semiparametric varying coefficient partially linear errors-in-variables models," Journal of Multivariate Analysis, Elsevier, vol. 101(8), pages 1872-1883, September.
    20. Yawei He & Zehua Chen, 2016. "The EBIC and a sequential procedure for feature selection in interactive linear models with high-dimensional data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 68(1), pages 155-180, February.
