
Semiparametric penalized quadratic inference functions for longitudinal data in ultra-high dimensions

Author

Listed:
  • Green, Brittany
  • Lian, Heng
  • Yu, Yan
  • Zu, Tianhai

Abstract

In many biomedical and health studies, multivariate data arise from repeated measurements on a sample of subjects over time. In order to analyze such longitudinal data, we need to consider the correlations from the same subject, and it is inappropriate to use a simple multivariate model assuming independence structure. Motivated by a large scale longitudinal public health study that requires longitudinal data analysis with correlated multivariate discrete responses from repeated measurements and very high dimensional covariates, we adopt a flexible semiparametric approach for simultaneous variable selection and estimation without the requirement of specifying the full likelihood. Specifically, we propose generalized partially linear single-index models using penalized quadratic inference functions for longitudinal data in ultra-high dimensions. A key feature is that we allow the number of single-index covariates in the nonparametric term to diverge and even to be in ultra-high dimensions. The penalized quadratic inference functions easily incorporate within-subject correlation and pursue efficient estimation, and the single-index models can incorporate nonlinearity and some interactions while avoiding the curse of dimensionality. In this challenging setting, we contribute both an efficient algorithm and new asymptotic theory for our proposed approach for diverging and even ultra-dimensional covariates and a multivariate correlated response in longitudinal data. We apply our method to investigate diabetes status within a continuing longitudinal public health study with very high-dimensional genetic variables and phenotype variables.
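
As a rough orientation, the model class and objective named in the abstract can be written in standard quadratic-inference-function notation. This is a sketch under assumed notation, not the authors' exact formulation: the symbols $\eta$, $\alpha$, $\beta$, $M_k$, $\psi_i$, and the generic penalty $p_\lambda$ are introduced here for illustration, and the abstract does not specify the penalty or the working-correlation basis.

```latex
% Generalized partially linear single-index model for the marginal mean
% (subject i, visit j); g is a known link, \eta an unknown smooth function.
g(\mu_{ij}) = \eta\!\left(x_{ij}^{\top}\alpha\right) + z_{ij}^{\top}\beta,
\qquad \mu_{ij} = E\!\left(y_{ij} \mid x_{ij}, z_{ij}\right),
\qquad \lVert \alpha \rVert_2 = 1 .

% Extended score: the inverse working correlation is approximated by a
% linear combination of known basis matrices, R^{-1} \approx \sum_{k=1}^{K} a_k M_k.
% D_i = \partial \mu_i / \partial \theta^{\top}, A_i = diagonal matrix of marginal variances.
\psi_i(\theta) =
\begin{pmatrix}
D_i^{\top} A_i^{-1/2} M_1 A_i^{-1/2} (y_i - \mu_i) \\
\vdots \\
D_i^{\top} A_i^{-1/2} M_K A_i^{-1/2} (y_i - \mu_i)
\end{pmatrix},
\qquad
\bar{\psi}_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} \psi_i(\theta),
\qquad
C_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} \psi_i(\theta)\,\psi_i(\theta)^{\top} .

% Quadratic inference function and its penalized version.
Q_n(\theta) = n\,\bar{\psi}_n(\theta)^{\top} C_n(\theta)^{-1}\,\bar{\psi}_n(\theta),
\qquad
\hat{\theta} = \arg\min_{\theta}\Big\{ Q_n(\theta) + n \sum_{j} p_{\lambda}\!\left(\lvert \theta_j \rvert\right) \Big\} .
```

Here $\theta$ collects the parameters being estimated, and the penalty sum runs over the components subject to selection. Because $Q_n$ is built only from the marginal mean, the marginal variances, and the basis matrices, no full likelihood has to be specified, which is consistent with the abstract. The SCAD penalty of Fan and Li (2001) is one common choice for $p_\lambda$.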

Suggested Citation

  • Green, Brittany & Lian, Heng & Yu, Yan & Zu, Tianhai, 2023. "Semiparametric penalized quadratic inference functions for longitudinal data in ultra-high dimensions," Journal of Multivariate Analysis, Elsevier, vol. 196(C).
  • Handle: RePEc:eee:jmvana:v:196:y:2023:i:c:s0047259x23000210
    DOI: 10.1016/j.jmva.2023.105175

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X23000210
    Download Restriction: Full text for ScienceDirect subscribers only



    References listed on IDEAS

    1. Hansen, Lars Peter, 1982. "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, Econometric Society, vol. 50(4), pages 1029-1054, July.
    2. Yu Y. & Ruppert D., 2002. "Penalized Spline Estimation for Partially Linear Single-Index Models," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 1042-1054, December.
    3. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    4. Peng Wang & Guei-feng Tsai & Annie Qu, 2012. "Conditional Inference Functions for Mixed-Effects Models With Unspecified Random-Effects Distribution," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(498), pages 725-736, June.
    5. Hansheng Wang & Bo Li & Chenlei Leng, 2009. "Shrinkage tuning parameter selection with a diverging number of parameters," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(3), pages 671-683, June.
    6. Jianhui Zhou & Annie Qu, 2012. "Informative Estimation and Selection of Correlation Structure for Longitudinal Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(498), pages 701-710, June.
    7. Lan Wang & Jianhui Zhou & Annie Qu, 2012. "Penalized Generalized Estimating Equations for High-Dimensional Longitudinal Data Analysis," Biometrics, The International Biometric Society, vol. 68(2), pages 353-360, June.
    8. Lihua Cai & Honglong Wu & Dongfang Li & Ke Zhou & Fuhao Zou, 2015. "Type 2 Diabetes Biomarkers of Human Gut Microbiota Selected via Iterative Sure Independent Screening Method," PLOS ONE, Public Library of Science, vol. 10(10), pages 1-15, October.
    9. Lai, Peng & Li, Gaorong & Lian, Heng, 2013. "Quadratic inference functions for partially linear single-index models with longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 118(C), pages 115-127.
    10. Jianqing Fan & Jinchi Lv, 2008. "Sure independence screening for ultrahigh dimensional feature space," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 849-911, November.
    11. Bai, Yang & Fung, Wing K. & Zhu, Zhong Yi, 2009. "Penalized quadratic inference functions for single-index models with longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 100(1), pages 152-161, January.
    12. Xuming He, 2002. "Estimation in a semiparametric model for longitudinal data with unspecified dependence structure," Biometrika, Biometrika Trust, vol. 89(3), pages 579-590, August.
    13. Annie Qu & Runze Li, 2006. "Quadratic Inference Functions for Varying-Coefficient Models with Longitudinal Data," Biometrics, The International Biometric Society, vol. 62(2), pages 379-391, June.

    Citations



    Cited by:

    1. Geng, Shuli & Zhang, Lixin, 2024. "Decorrelated empirical likelihood for generalized linear models with high-dimensional longitudinal data," Statistics & Probability Letters, Elsevier, vol. 211(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ma, Shujie & Liang, Hua & Tsai, Chih-Ling, 2014. "Partially linear single index models for repeated measurements," Journal of Multivariate Analysis, Elsevier, vol. 130(C), pages 354-375.
    2. Brittany Green & Heng Lian & Yan Yu & Tianhai Zu, 2021. "Ultra high‐dimensional semiparametric longitudinal data analysis," Biometrics, The International Biometric Society, vol. 77(3), pages 903-913, September.
    3. Lai, Peng & Li, Gaorong & Lian, Heng, 2013. "Quadratic inference functions for partially linear single-index models with longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 118(C), pages 115-127.
    4. Rui Li & Chenlei Leng & Jinhong You, 2017. "A Semiparametric Regression Model for Longitudinal Data with Non-stationary Errors," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 44(4), pages 932-950, December.
    5. Lei Wang & Wei Ma, 2021. "Improved empirical likelihood inference and variable selection for generalized linear models with longitudinal nonignorable dropouts," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(3), pages 623-647, June.
    6. Lai, Peng & Wang, Qihua & Lian, Heng, 2012. "Bias-corrected GEE estimation and smooth-threshold GEE variable selection for single-index models with clustered data," Journal of Multivariate Analysis, Elsevier, vol. 105(1), pages 422-432.
    7. Guang Cheng & Hao Zhang & Zuofeng Shang, 2015. "Sparse and efficient estimation for partial spline models with increasing dimension," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 67(1), pages 93-127, February.
    8. Xiaochao Xia & Binyan Jiang & Jialiang Li & Wenyang Zhang, 2016. "Low-dimensional confounder adjustment and high-dimensional penalized estimation for survival analysis," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 22(4), pages 547-569, October.
    9. Jin, Fei & Lee, Lung-fei, 2018. "Irregular N2SLS and LASSO estimation of the matrix exponential spatial specification model," Journal of Econometrics, Elsevier, vol. 206(2), pages 336-358.
    10. Zhang, Shucong & Zhou, Yong, 2018. "Variable screening for ultrahigh dimensional heterogeneous data via conditional quantile correlations," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 1-13.
    11. Dong, Chaohua & Gao, Jiti & Linton, Oliver, 2023. "High dimensional semiparametric moment restriction models," Journal of Econometrics, Elsevier, vol. 232(2), pages 320-345.
    12. Joel L. Horowitz, 2015. "Variable selection and estimation in high-dimensional models," CeMMAP working papers 35/15, Institute for Fiscal Studies.
    13. Hou, Zhaohan & Wang, Lei, 2024. "Heterogeneous quantile regression for longitudinal data with subgroup structures," Computational Statistics & Data Analysis, Elsevier, vol. 194(C).
    14. Zou, Yubo & Zhang, Jiajia & Qin, Guoyou, 2011. "A semiparametric accelerated failure time partial linear model and its application to breast cancer," Computational Statistics & Data Analysis, Elsevier, vol. 55(3), pages 1479-1487, March.
    15. Lian, Heng, 2014. "Semiparametric Bayesian information criterion for model selection in ultra-high dimensional additive models," Journal of Multivariate Analysis, Elsevier, vol. 123(C), pages 304-310.
    16. Peirong Xu & Jun Zhang & Xingfang Huang & Tao Wang, 2016. "Efficient estimation for marginal generalized partially linear single-index models with longitudinal data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 25(3), pages 413-431, September.
    17. Li, Cheng & Jiang, Wenxin, 2016. "On oracle property and asymptotic validity of Bayesian generalized method of moments," Journal of Multivariate Analysis, Elsevier, vol. 145(C), pages 132-147.
    18. Guo-Liang Tian & Mingqiu Wang & Lixin Song, 2014. "Variable selection in the high-dimensional continuous generalized linear model with current status data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 41(3), pages 467-483, March.
    19. Xiaorui Zhu & Yichen Qin & Peng Wang, 2023. "Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models," Papers 2307.07574, arXiv.org, revised Jan 2025.
    20. Tian, Ruiqin & Xue, Liugen & Liu, Chunling, 2014. "Penalized quadratic inference functions for semiparametric varying coefficient partially linear models with longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 132(C), pages 94-110.
