
Inference for non-probability samples under high-dimensional covariate-adjusted superpopulation model

Author

Listed:
  • Yingli Pan

    (Hubei University)

  • Wen Cai

    (Hubei University)

  • Zhan Liu

    (Hubei University)

Abstract

Non-probability samples have become increasingly popular in survey sampling because of their lower costs, shorter time durations and higher efficiency. In the high-dimensional superpopulation modeling approach for non-probability samples, a model for the analysis variable is fitted to a non-probability sample and is then used to project the sample to the full population. In practice, there are situations in which the covariates used in the modeling process are not directly observed but are contaminated by a multiplicative factor determined by the value of an unknown function of an observable confounder. In this paper, we propose to calibrate the covariates by nonparametrically regressing the observed contaminated covariates on the confounder. We employ the SCAD-penalized least squares method to investigate the variable selection and inference problems for non-probability samples based on the calibrated covariates. A SCAD-penalized estimator of the parameter and an estimator of the population mean are obtained. Under some mild assumptions, we establish the “oracle property” of the proposed SCAD-penalized estimator and the consistency of the proposed population mean estimator. Simulation studies are conducted to assess the finite-sample performance of the proposed method. An application to a Boston housing price study demonstrates the utility of the proposed method in practice.
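
The calibration and projection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common covariate-adjusted setup in which the observed covariate equals psi(U) times the true covariate, with E[psi(U)] = 1 and the true covariate independent of the confounder U; it also assumes that the contaminated covariate and the confounder are available for every population unit. The function names, the Gaussian kernel, the bandwidth and the simulated data are all hypothetical. For brevity, ordinary least squares stands in for the SCAD-penalized fit; the SCAD penalty itself (Fan and Li, 2001) is only evaluated, not optimized.

```python
import numpy as np

def nw_smooth(u_eval, u, y, bandwidth):
    """Nadaraya-Watson estimate of E[y | U = u_eval] with a Gaussian kernel."""
    scaled = (u_eval[:, None] - u[None, :]) / bandwidth
    weights = np.exp(-0.5 * scaled**2)
    return (weights @ y) / weights.sum(axis=1)

def calibrate(x_obs, u, bandwidth):
    """Divide out the estimated distortion psi(u) = E[x_obs | u] / E[x_obs]."""
    psi_hat = nw_smooth(u, u, x_obs, bandwidth) / x_obs.mean()
    return x_obs / psi_hat

def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty of Fan and Li (2001), evaluated elementwise."""
    b = np.abs(beta)
    linear = lam * b
    quadratic = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))
    constant = lam**2 * (a + 1) / 2
    return np.where(b <= lam, linear, np.where(b <= a * lam, quadratic, constant))

# Simulated population (all values are illustrative assumptions).
rng = np.random.default_rng(0)
N = 2000
u = rng.uniform(0.0, 1.0, N)                 # observable confounder
x = rng.normal(2.0, 0.5, N)                  # true covariate (unobserved in practice)
y = 1.0 + 1.5 * x + rng.normal(0.0, 0.3, N)  # analysis variable
x_obs = (0.5 + u) * x                        # psi(u) = 0.5 + u has mean 1

# Step 1: calibrate the contaminated covariate.
x_cal = calibrate(x_obs, u, bandwidth=0.1)
mse_obs = np.mean((x_obs - x) ** 2)
mse_cal = np.mean((x_cal - x) ** 2)

# Step 2: draw a non-probability sample whose inclusion depends on x.
incl_prob = 0.5 / (1.0 + np.exp(-2.0 * (x - 2.0)))
in_sample = rng.random(N) < incl_prob

# Step 3: fit the model on the sample (OLS stand-in for the SCAD fit)
# and project it to the full population.
design = np.column_stack([np.ones(N), x_cal])
coef, *_ = np.linalg.lstsq(design[in_sample], y[in_sample], rcond=None)
naive_mean = y[in_sample].mean()
projected_mean = (design @ coef).mean()

print("MSE of contaminated vs calibrated covariate:", mse_obs, mse_cal)
print("true, naive, projected mean:", y.mean(), naive_mean, projected_mean)
```

In this sketch the naive sample mean is pulled upward because units with larger covariate values are more likely to be included, while the model-based projection averages predictions over all population units. A penalized fit would replace the least squares step when the number of covariates is large.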

Suggested Citation

  • Yingli Pan & Wen Cai & Zhan Liu, 2022. "Inference for non-probability samples under high-dimensional covariate-adjusted superpopulation model," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 31(4), pages 955-979, October.
  • Handle: RePEc:spr:stmapp:v:31:y:2022:i:4:d:10.1007_s10260-021-00619-w
    DOI: 10.1007/s10260-021-00619-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10260-021-00619-w
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xie, Chuanlong & Zhu, Lixing, 2019. "A goodness-of-fit test for variable-adjusted models," Computational Statistics & Data Analysis, Elsevier, vol. 138(C), pages 27-48.
    2. Umberto Amato & Anestis Antoniadis & Italia De Feis & Irene Gijbels, 2021. "Penalised robust estimators for sparse and high-dimensional linear models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 1-48, March.
    3. Tizheng Li & Xiaojuan Kang, 2022. "Variable selection of higher-order partially linear spatial autoregressive model with a diverging number of parameters," Statistical Papers, Springer, vol. 63(1), pages 243-285, February.
    4. Tang, Yanlin & Song, Xinyuan & Wang, Huixia Judy & Zhu, Zhongyi, 2013. "Variable selection in high-dimensional quantile varying coefficient models," Journal of Multivariate Analysis, Elsevier, vol. 122(C), pages 115-132.
    5. Dasom Lee & Shu Yang & Lin Dong & Xiaofei Wang & Donglin Zeng & Jianwen Cai, 2023. "Improving trial generalizability using observational studies," Biometrics, The International Biometric Society, vol. 79(2), pages 1213-1225, June.
    6. Jun Zhang, 2021. "Estimation and variable selection for partial linear single-index distortion measurement errors models," Statistical Papers, Springer, vol. 62(2), pages 887-913, April.
    7. Yinjun Chen & Hao Ming & Hu Yang, 2024. "Efficient variable selection for high-dimensional multiplicative models: a novel LPRE-based approach," Statistical Papers, Springer, vol. 65(6), pages 3713-3737, August.
    8. Fang Lu & Jing Yang & Xuewen Lu, 2022. "One-step oracle procedure for semi-parametric spatial autoregressive model and its empirical application to Boston housing price data," Empirical Economics, Springer, vol. 62(6), pages 2645-2671, June.
    9. Hu Yang & Ning Li & Jing Yang, 2020. "A robust and efficient estimation and variable selection method for partially linear models with large-dimensional covariates," Statistical Papers, Springer, vol. 61(5), pages 1911-1937, October.
    10. Lina Liao & Cheolwoo Park & Hosik Choi, 2019. "Penalized expectile regression: an alternative to penalized quantile regression," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 71(2), pages 409-438, April.
    11. Zhensheng Huang & Xing Sun & Riquan Zhang, 2022. "Estimation for partially varying-coefficient single-index models with distorted measurement errors," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 85(2), pages 175-201, February.
    12. Wei Qian & Yuhong Yang, 2013. "Model selection via standard error adjusted adaptive lasso," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 65(2), pages 295-318, April.
    13. Hong, Hyokyoung G. & Zheng, Qi & Li, Yi, 2019. "Forward regression for Cox models with high-dimensional covariates," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 268-290.
    14. Kong, Dehan & Bondell, Howard D. & Wu, Yichao, 2015. "Domain selection for the varying coefficient model via local polynomial regression," Computational Statistics & Data Analysis, Elsevier, vol. 83(C), pages 236-250.
    15. Zhang, Jun & Feng, Zhenghui & Zhou, Bu, 2014. "A revisit to correlation analysis for distortion measurement error data," Journal of Multivariate Analysis, Elsevier, vol. 124(C), pages 116-129.
    16. Xia, Xiaochao & Liu, Zhi & Yang, Hu, 2016. "Regularized estimation for the least absolute relative error models with a diverging number of covariates," Computational Statistics & Data Analysis, Elsevier, vol. 96(C), pages 104-119.
    17. Xuan Liu & Jianbao Chen, 2021. "Variable Selection for the Spatial Autoregressive Model with Autoregressive Disturbances," Mathematics, MDPI, vol. 9(12), pages 1-20, June.
    18. He, Xin & Mao, Xiaojun & Wang, Zhonglei, 2024. "Nonparametric augmented probability weighting with sparsity," Computational Statistics & Data Analysis, Elsevier, vol. 191(C).
    19. Sanying Feng & Liugen Xue, 2013. "Variable selection for partially varying coefficient single-index model," Journal of Applied Statistics, Taylor & Francis Journals, vol. 40(12), pages 2637-2652, December.
    20. Song, Yunquan & Liang, Xijun & Zhu, Yanji & Lin, Lu, 2021. "Robust variable selection with exponential squared loss for the spatial autoregressive model," Computational Statistics & Data Analysis, Elsevier, vol. 155(C).
