
Communication-efficient model averaging prediction for massive data with asymptotic optimality

Author

Listed:
  • Xiaochao Xia

    (Chongqing University)

  • Sijin He

    (Chongqing University)

  • Naiwen Pang

    (Chongqing University)

Abstract

This paper focuses on model averaging prediction for massive datasets. Specifically, in the framework of Mallows model averaging, we propose two distributed approaches to estimate, respectively, the parameters of each submodel and the weights in the final weighted estimator. The first approach is a one-shot procedure that aggregates the estimated parameters and weights from each local machine via simple averaging. The second approach is an iterative procedure that approximates the global loss by a surrogate loss in parameter estimation. Both proposed distributed estimators are communication-efficient: for parameter estimation, the former requires only one round of communication and the latter requires two rounds of communication between the central and local machines to achieve global statistical efficiency. To estimate the weight vector, two distributed algorithms are presented. Furthermore, we theoretically justify the two approaches by proving convergence rates and asymptotic normality. More importantly, we establish the asymptotic optimality of the distributed estimator of the weight vector in terms of the out-of-sample prediction error criterion. Finally, simulations and a real data analysis are carried out to illustrate the proposed methods.
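
The one-shot idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes linear submodels fitted by ordinary least squares, only two candidate models, and a grid search over the weight in place of a quadratic program. All function and variable names (`fit_submodel`, `local_mallows`, `one_shot`) are hypothetical, and the data are simulated.

```python
import numpy as np

def fit_submodel(X, y, cols):
    """OLS on the columns in `cols`, zero-padded to the full coefficient length."""
    beta = np.zeros(X.shape[1])
    beta[cols] = np.linalg.lstsq(X[:, cols], y, rcond=None)[0]
    return beta

def local_mallows(X, y, models, grid):
    """Submodel coefficients and a Mallows-type weight computed on one machine.

    Assumes exactly two candidate models; `w` is the weight on models[0]
    and `1 - w` the weight on models[1].
    """
    n = X.shape[0]
    betas = [fit_submodel(X, y, cols) for cols in models]
    fits = [X @ b for b in betas]
    k = [len(cols) for cols in models]
    # Error variance estimated from the largest candidate model.
    big = int(np.argmax(k))
    sigma2 = np.sum((y - fits[big]) ** 2) / (n - k[big])
    # Mallows criterion: residual sum of squares plus 2 * sigma^2 * weighted size.
    crit = [
        np.sum((y - w * fits[0] - (1 - w) * fits[1]) ** 2)
        + 2 * sigma2 * (w * k[0] + (1 - w) * k[1])
        for w in grid
    ]
    return np.array(betas), float(grid[int(np.argmin(crit))])

def one_shot(X, y, models, n_machines):
    """Simple average of local coefficients and local weights (one round)."""
    grid = np.linspace(0.0, 1.0, 101)
    blocks = zip(np.array_split(X, n_machines), np.array_split(y, n_machines))
    results = [local_mallows(Xk, yk, models, grid) for Xk, yk in blocks]
    beta_bar = np.mean([b for b, _ in results], axis=0)  # shape (2, p)
    w_bar = float(np.mean([w for _, w in results]))
    return beta_bar, w_bar

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(0)
n, p = 20000, 5
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -0.5, 0.25, 0.0, 0.0])
y = X @ beta_true + rng.normal(size=n)

models = [[0, 1], [0, 1, 2, 3, 4]]  # small and full candidate models
beta_bar, w_bar = one_shot(X, y, models, n_machines=10)

# Model-averaged coefficient vector used for out-of-sample prediction.
beta_hat = w_bar * beta_bar[0] + (1 - w_bar) * beta_bar[1]
```

In this sketch each machine sends only its coefficient vectors and a single weight to the central machine, which is what makes the procedure one-round. The surrogate-loss approach mentioned in the abstract is not shown, since the abstract does not give enough detail to sketch it faithfully.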

Suggested Citation

  • Xiaochao Xia & Sijin He & Naiwen Pang, 2025. "Communication-efficient model averaging prediction for massive data with asymptotic optimality," Statistical Papers, Springer, vol. 66(2), pages 1-45, February.
  • Handle: RePEc:spr:stpapr:v:66:y:2025:i:2:d:10.1007_s00362-025-01664-3
    DOI: 10.1007/s00362-025-01664-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-025-01664-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

