
Global debiased DC estimations for biased estimators via pro forma regression

Author

Listed:
  • Lu Lin

    (Shandong University)

  • Feng Li

    (Zhengzhou University)

Abstract

We establish a global unbiased divide-and-conquer estimation (gub-DC) for linear models and a global bias-reduced DC estimation (gbr-DC) for nonlinear models under memory constraints. To introduce the new strategy in the linear model, we first provide new insight into the statistical structure through the closed-form representation of the local biased estimator and then construct a pro forma linear regression with the local estimator as the “response variable” and the parameter of interest as the “intercept.” Based on this regression structure, we construct a global unbiased estimator as the least squares estimator of the intercept. The gub-DC method can be applied to various biased estimators in the linear model, such as the ridge estimator, the principal component estimator and the Stein estimator. Moreover, the method can be extended to nonlinear models to construct a global bias-reduced estimator. The main advantage over classical DC methods is that the proposed procedures can absorb the information hidden in the statistical structure, and the resulting global estimators are strictly unbiased or achieve root-n consistency, without any constraint on the number of batches. Another attractive feature is computational simplicity and efficiency. Detailed simulation studies demonstrate that the new estimators are significantly bias-corrected, that their behavior is comparable with estimation on the entire data set, and that they perform at least as well as the competitors.
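
The idea of combining local biased estimators by least squares can be illustrated with a small sketch. It is not the authors' exact pro forma regression, which the abstract does not spell out. It assumes a ridge estimator on each batch and uses the standard fact that, given the batch design X_j, the local ridge estimator has expectation A_j β with A_j = (X_j'X_j + λI)^{-1} X_j'X_j. Treating each local estimate as a "response" and A_j as a known design block, least squares on the stacked system gives an estimator of β that is unbiased for any number of batches. The names `local_ridge`, `combine_least_squares` and `demo`, and all parameter values, are hypothetical.

```python
import numpy as np

def local_ridge(X, y, lam):
    """Ridge estimate on one batch, plus the matrix A with E[beta_hat | X] = A @ beta."""
    p = X.shape[1]
    S = X.T @ X                       # local Gram matrix
    M = S + lam * np.eye(p)
    beta_hat = np.linalg.solve(M, X.T @ y)
    A = np.linalg.solve(M, S)         # closed-form shrinkage matrix of the ridge estimator
    return beta_hat, A

def combine_least_squares(estimates, blocks):
    """Least squares for beta in the stacked system beta_hat_j = A_j @ beta + error_j."""
    lhs = sum(A.T @ A for A in blocks)
    rhs = sum(A.T @ b for A, b in zip(blocks, estimates))
    return np.linalg.solve(lhs, rhs)

def demo(seed=0, n_batches=20, n_per_batch=50, lam=5.0, noise_sd=1.0):
    """Simulate batches (illustrative values only) and compare two ways of combining."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -2.0, 0.5])
    estimates, blocks = [], []
    for _ in range(n_batches):
        # Only the p-vector beta_hat and the p x p matrix A are kept per batch,
        # so the raw batch can be discarded after this step.
        X = rng.normal(size=(n_per_batch, beta.size))
        y = X @ beta + noise_sd * rng.normal(size=n_per_batch)
        b, A = local_ridge(X, y, lam)
        estimates.append(b)
        blocks.append(A)
    naive = np.mean(estimates, axis=0)            # simple averaging keeps the ridge bias
    combined = combine_least_squares(estimates, blocks)
    return beta, naive, combined

if __name__ == "__main__":
    beta, naive, combined = demo()
    print("true beta      :", beta)
    print("naive average  :", naive)
    print("LS combination :", combined)
```

In this sketch the simple average inherits the shrinkage bias of every local ridge fit, however many batches there are, whereas the least squares combination removes it because the shrinkage matrices A_j are known exactly.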

Suggested Citation

  • Lu Lin & Feng Li, 2023. "Global debiased DC estimations for biased estimators via pro forma regression," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(2), pages 726-758, June.
  • Handle: RePEc:spr:testjl:v:32:y:2023:i:2:d:10.1007_s11749-023-00850-5
    DOI: 10.1007/s11749-023-00850-5

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11749-023-00850-5
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



