A massive data framework for M-estimators with cubic-rate


Author

Listed:

Shi, Chengchun
Lu, Wenbin
Song, Rui

Abstract

The divide and conquer method is a common strategy for handling massive data. In this article, we study the divide and conquer method for cubic-rate estimators under the massive data framework. We develop a general theory for establishing the asymptotic distribution of the aggregated M-estimators using a weighted average with weights depending on the subgroup sample sizes. Under certain conditions on the growth rate of the number of subgroups, the resulting aggregated estimators are shown to have a faster convergence rate and an asymptotically normal distribution, which makes them more tractable in both computation and inference than the original M-estimators based on pooled data. Our theory applies to a wide class of M-estimators with cube root convergence rate, including the location estimator, maximum score estimator, and value search estimator. Simulations and a real data application also validate our theoretical findings. Supplementary materials for this article are available online.
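
As a rough illustration of the aggregation step described in the abstract, the following Python sketch splits a sample into subgroups, computes a cubic-rate location estimator on each, and combines the results with a weighted average. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `location_estimate` and `aggregate`, the window parameter `half_width`, and the choice of weights proportional to subgroup sample sizes are all assumptions made for illustration. The abstract only says that the weights depend on the subgroup sample sizes.

```python
import numpy as np


def location_estimate(x, half_width=1.0):
    """Centre of the window [t - h, t + h] that contains the most points.

    A window of fixed length holding the maximum number of points can
    always be shifted so that it starts at a data point, so only the
    windows [x_i, x_i + 2h] need to be checked.
    """
    x = np.sort(np.asarray(x, dtype=float))
    # For each i, index one past the last point inside [x_i, x_i + 2h].
    right = np.searchsorted(x, x + 2.0 * half_width, side="right")
    counts = right - np.arange(x.size)
    best = int(np.argmax(counts))
    return float(x[best] + half_width)


def aggregate(data, n_groups, half_width=1.0):
    """Divide and conquer: estimate per subgroup, then average.

    Assumption: weights are proportional to subgroup sample sizes.
    """
    groups = np.array_split(np.asarray(data, dtype=float), n_groups)
    estimates = np.array([location_estimate(g, half_width) for g in groups])
    sizes = np.array([g.size for g in groups], dtype=float)
    weights = sizes / sizes.sum()
    return float(np.dot(weights, estimates))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(loc=0.0, scale=1.0, size=20_000)
    print("pooled estimate:    ", location_estimate(sample))
    print("aggregated estimate:", aggregate(sample, n_groups=20))
```

Each subgroup estimate needs only a sort of that subgroup, so the per-group work can run in parallel and only one number per subgroup has to be communicated before averaging.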

Suggested Citation

Shi, Chengchun & Lu, Wenbin & Song, Rui, 2018. "A massive data framework for M-estimators with cubic-rate," LSE Research Online Documents on Economics 102111, London School of Economics and Political Science, LSE Library.

Handle: RePEc:ehl:lserod:102111

References listed on IDEAS

Baqun Zhang & Anastasios A. Tsiatis & Eric B. Laber & Marie Davidian, 2012. "A Robust Method for Estimating Optimal Treatment Regimes," Biometrics, The International Biometric Society, vol. 68(4), pages 1010-1018, December.
Victor Chernozhukov & Denis Chetverikov & Kengo Kato, 2012. "Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors," Papers 1212.6906, arXiv.org, revised Jan 2018.
- Victor Chernozhukov & Denis Chetverikov & Kengo Kato, 2013. "Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors," CeMMAP working papers 76/13, Institute for Fiscal Studies.
- Victor Chernozhukov & Denis Chetverikov & Kengo Kato, 2013. "Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors," CeMMAP working papers CWP76/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Ariel Kleiner & Ameet Talwalkar & Purnamrita Sarkar & Michael I. Jordan, 2014. "A scalable bootstrap for massive data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(4), pages 795-816, September.

Citations

Cited by:

Fengrui Di & Lei Wang, 2022. "Multi-round smoothed composite quantile regression for distributed data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 74(5), pages 869-893, October.
Zhang, Haixiang & Wang, HaiYing, 2021. "Distributed subdata selection for big data via sampling-based approach," Computational Statistics & Data Analysis, Elsevier, vol. 153(C).
Xuejun Ma & Shaochen Wang & Wang Zhou, 2022. "Statistical inference in massive datasets by empirical likelihood," Computational Statistics, Springer, vol. 37(3), pages 1143-1164, July.
Chen, Canyi & Xu, Wangli & Zhu, Liping, 2022. "Distributed estimation in heterogeneous reduced rank regression: With application to order determination in sufficient dimension reduction," Journal of Multivariate Analysis, Elsevier, vol. 190(C).
Shi, Jianwei & Qin, Guoyou & Zhu, Huichen & Zhu, Zhongyi, 2021. "Communication-efficient distributed M-estimation with missing data," Computational Statistics & Data Analysis, Elsevier, vol. 161(C).
Zhao, Yan-Yong & Zhang, Yuchun & Liu, Yuan & Ismail, Noriszura, 2024. "Distributed debiased estimation of high-dimensional partially linear models with jumps," Computational Statistics & Data Analysis, Elsevier, vol. 191(C).
Lulu Zuo & Haixiang Zhang & HaiYing Wang & Liuquan Sun, 2021. "Optimal subsample selection for massive logistic regression with distributed data," Computational Statistics, Springer, vol. 36(4), pages 2535-2562, December.
Le-Yu Chen & Sokbae Lee, 2018. "High Dimensional Classification through $\ell_0$-Penalized Empirical Risk Minimization," Papers 1811.09540, arXiv.org.
Tom Boot & Artūras Juodis, 2023. "Uniform Inference in Linear Error-in-Variables Models: Divide-and-Conquer," Papers 2301.04439, arXiv.org.
Ma, Xuejun & Wang, Shaochen & Zhou, Wang, 2021. "Testing multivariate quantile by empirical likelihood," Journal of Multivariate Analysis, Elsevier, vol. 182(C).

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Xingcai Zhou & Zhaoyang Jing & Chao Huang, 2024. "Distributed Bootstrap Simultaneous Inference for High-Dimensional Quantile Regression," Mathematics, MDPI, vol. 12(5), pages 1-53, February.
Xin Qiu & Donglin Zeng & Yuanjia Wang, 2018. "Estimation and evaluation of linear individualized treatment rules to guarantee performance," Biometrics, The International Biometric Society, vol. 74(2), pages 517-528, June.
Brice Ozenne & Esben Budtz-Jørgensen & Sebastian Elgaard Ebert, 2023. "Controlling the familywise error rate when performing multiple comparisons in a linear latent variable model," Computational Statistics, Springer, vol. 38(1), pages 1-23, March.
Ruoqing Zhu & Ying-Qi Zhao & Guanhua Chen & Shuangge Ma & Hongyu Zhao, 2017. "Greedy outcome weighted tree learning of optimal personalized treatment rules," Biometrics, The International Biometric Society, vol. 73(2), pages 391-400, June.
Hansen, Christian & Liao, Yuan, 2019. "The Factor-Lasso And K-Step Bootstrap Approach For Inference In High-Dimensional Economic Applications," Econometric Theory, Cambridge University Press, vol. 35(3), pages 465-509, June.
- Hansen, Christian & Liao, Yuan, 2016. "The Factor-Lasso and K-Step Bootstrap Approach for Inference in High-Dimensional Economic Applications," MPRA Paper 75313, University Library of Munich, Germany.
- Christian Hansen & Yuan Liao, 2016. "The Factor-Lasso and K-Step Bootstrap Approach for Inference in High-Dimensional Economic Applications," Departmental Working Papers 201610, Rutgers University, Department of Economics.
- Christian Hansen & Yuan Liao, 2016. "The Factor-Lasso and K-Step Bootstrap Approach for Inference in High-Dimensional Economic Applications," Papers 1611.09420, arXiv.org, revised Dec 2016.
Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2013. "Uniform post selection inference for LAD regression and other z-estimation problems," CeMMAP working papers CWP74/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2013. "Uniform post selection inference for LAD regression and other z-estimation problems," CeMMAP working papers 74/13, Institute for Fiscal Studies.
- Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2014. "Uniform post selection inference for LAD regression and other Z-estimation problems," CeMMAP working papers 51/14, Institute for Fiscal Studies.
- Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2013. "Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems," Papers 1304.0282, arXiv.org, revised Oct 2020.
- Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2014. "Uniform post selection inference for LAD regression and other Z-estimation problems," CeMMAP working papers CWP51/14, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Guangbao Guo & Yue Sun & Xuejun Jiang, 2020. "A partitioned quasi-likelihood for distributed statistical inference," Computational Statistics, Springer, vol. 35(4), pages 1577-1596, December.
Matias D. Cattaneo & Richard K. Crump & Weining Wang, 2022. "Beta-Sorted Portfolios," Papers 2208.10974, arXiv.org, revised Nov 2024.
- Matias Cattaneo & Richard K. Crump & Weining Wang, 2024. "Beta-sorted portfolios," CeMMAP working papers 20/24, Institute for Fiscal Studies.
- Matias D. Cattaneo & Richard K. Crump & Weining Wang, 2023. "Beta-Sorted Portfolios," Staff Reports 1068, Federal Reserve Bank of New York.
Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers 49/16, Institute for Fiscal Studies.
- Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers CWP49/16, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Wu Wang & Xuming He & Zhongyi Zhu, 2020. "Statistical inference for multiple change‐point models," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 47(4), pages 1149-1170, December.
Philipp Bach & Victor Chernozhukov & Malte S. Kurz & Martin Spindler & Sven Klaassen, 2021. "DoubleML -- An Object-Oriented Implementation of Double Machine Learning in R," Papers 2103.09603, arXiv.org, revised Jun 2024.
Demian Pouzo, 2014. "Bootstrap Consistency for Quadratic Forms of Sample Averages with Increasing Dimension," Papers 1411.2701, arXiv.org, revised Aug 2015.
Dongwoo Kim & Daniel Wilhelm, 2024. "Powerful t-tests in the presence of nonclassical measurement error," Econometric Reviews, Taylor & Francis Journals, vol. 43(6), pages 345-378, July.
- Dongwoo Kim & Daniel Wilhelm, 2017. "Powerful t-Tests in the presence of nonclassical measurement error," CeMMAP working papers CWP57/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Dongwoo Kim & Daniel Wilhelm, 2023. "Powerful t-tests in the presence of nonclassical measurement error," CeMMAP working papers 22/23, Institute for Fiscal Studies.
- Dongwoo Kim & Daniel Wilhelm, 2023. "Powerful t-tests in the presence of nonclassical measurement error," IFS Working Papers WCWP22/23, Institute for Fiscal Studies.
- Dongwoo Kim & Daniel Wilhelm, 2017. "Powerful t-Tests in the presence of nonclassical measurement error," CeMMAP working papers 57/17, Institute for Fiscal Studies.
- Dongwoo Kim & Daniel Wilhelm, 2021. "Powerful t-tests in the presence of nonclassical measurement error," CeMMAP working papers CWP18/21, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
David M. Ritzwoller & Vasilis Syrgkanis, 2024. "Simultaneous Inference for Local Structural Parameters with Random Forests," Papers 2405.07860, arXiv.org, revised Sep 2024.
Weibin Mo & Yufeng Liu, 2022. "Efficient learning of optimal individualized treatment rules for heteroscedastic or misspecified treatment‐free effect models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(2), pages 440-472, April.
Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Post-Selection and Post-Regularization Inference in Linear Models with Many Controls and Instruments," American Economic Review, American Economic Association, vol. 105(5), pages 486-490, May.
- Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Post-selection and post-regularization inference in linear models with many controls and instruments," CeMMAP working papers 02/15, Institute for Fiscal Studies.
- Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Post-Selection and Post-Regularization Inference in Linear Models with Many Controls and Instruments," Papers 1501.03185, arXiv.org.
- Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Post-selection and post-regularization inference in linear models with many controls and instruments," CeMMAP working papers CWP02/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Victor Chernozhukov & Whitney K Newey & Rahul Singh, 2022. "Debiased machine learning of global and local parameters using regularized Riesz representers [Semiparametric instrumental variable estimation of treatment response models]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 576-601.
- Victor Chernozhukov & Whitney Newey & Rahul Singh, 2018. "De-Biased Machine Learning of Global and Local Parameters Using Regularized Riesz Representers," Papers 1802.08667, arXiv.org, revised Oct 2022.
Jelena Bradic & Victor Chernozhukov & Whitney K. Newey & Yinchu Zhu, 2019. "Minimax Semiparametric Learning With Approximate Sparsity," Papers 1912.12213, arXiv.org, revised Aug 2022.
Denis Chetverikov & Bradley Larsen & Christopher Palmer, 2016. "IV Quantile Regression for Group‐Level Treatments, With an Application to the Distributional Effects of Trade," Econometrica, Econometric Society, vol. 84, pages 809-833, March.
- Denis Chetverikov & Bradley Larsen & Christopher Palmer, 2015. "IV Quantile Regression for Group-level Treatments, with an Application to the Distributional Effects of Trade," NBER Working Papers 21033, National Bureau of Economic Research, Inc.
Yizhe Xu & Tom H. Greene & Adam P. Bress & Brian C. Sauer & Brandon K. Bellows & Yue Zhang & William S. Weintraub & Andrew E. Moran & Jincheng Shen, 2022. "Estimating the optimal individualized treatment rule from a cost‐effectiveness perspective," Biometrics, The International Biometric Society, vol. 78(1), pages 337-351, March.

More about this item

Keywords

cubic rate asymptotics; divide and conquer; M-estimators; massive data

JEL classification:

C1 - Mathematical and Quantitative Methods - Econometric and Statistical Methods and Methodology: General

NEP fields

This paper has been announced in the following NEP Reports:

NEP-ECM-2020-02-03 (Econometrics)
NEP-ORE-2020-02-03 (Operations Research)

