IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v13y2022i1d10.1038_s41467-022-29160-4.html

DLMM as a lossless one-shot algorithm for collaborative multi-site distributed linear mixed models

Author

Listed:
  • Chongliang Luo

    (University of Pennsylvania
    Washington University School of Medicine in St. Louis)

  • Md. Nazmul Islam

    (Optum Labs)

  • Natalie E. Sheils

    (Optum Labs)

  • John Buresh

    (Optum Labs)

  • Jenna Reps

    (Janssen Research and Development LLC)

  • Martijn J. Schuemie

    (Janssen Research and Development LLC)

  • Patrick B. Ryan

    (Janssen Research and Development LLC)

  • Mackenzie Edmondson

    (University of Pennsylvania)

  • Rui Duan

    (University of Pennsylvania
    Harvard T.H. Chan School of Public Health)

  • Jiayi Tong

    (University of Pennsylvania)

  • Arielle Marks-Anglin

    (University of Pennsylvania)

  • Jiang Bian

    (University of Florida)

  • Zhaoyi Chen

    (University of Florida)

  • Talita Duarte-Salles

    (Fundacio Institut Universitari per a la recerca a l’Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol))

  • Sergio Fernández-Bertolín

    (Fundacio Institut Universitari per a la recerca a l’Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol))

  • Thomas Falconer

    (Columbia University)

  • Chungsoo Kim

    (Ajou University Graduate School of Medicine)

  • Rae Woong Park

    (Ajou University Graduate School of Medicine
    Ajou University School of Medicine)

  • Stephen R. Pfohl

    (Stanford Center for Biomedical Informatics Research)

  • Nigam H. Shah

    (Stanford Center for Biomedical Informatics Research)

  • Andrew E. Williams

    (Tufts University School of Medicine)

  • Hua Xu

    (The University of Texas Health Science Center at Houston)

  • Yujia Zhou

    (The University of Texas Health Science Center at Houston)

  • Ebbing Lautenbach

    (University of Pennsylvania)

  • Jalpa A. Doshi

    (University of Pennsylvania
    Leonard Davis Institute of Health Economics)

  • Rachel M. Werner

    (University of Pennsylvania
    Leonard Davis Institute of Health Economics
    Cpl Michael J Crescenz VA Medical Center)

  • David A. Asch

    (University of Pennsylvania
    Leonard Davis Institute of Health Economics)

  • Yong Chen

    (University of Pennsylvania)

Abstract

Linear mixed models are commonly used in healthcare-based association analyses for analyzing multi-site data with heterogeneous site-specific random effects. Due to regulations for protecting patients’ privacy, sensitive individual patient data (IPD) typically cannot be shared across sites. We propose an algorithm for fitting distributed linear mixed models (DLMMs) without sharing IPD across sites. This algorithm achieves results identical to those achieved using pooled IPD from multiple sites (i.e., the same effect size and standard error estimates), hence demonstrating the lossless property. The algorithm requires each site to contribute minimal aggregated data in only one round of communication. We demonstrate the lossless property of the proposed DLMM algorithm by investigating the associations between demographic and clinical characteristics and length of hospital stay in COVID-19 patients using administrative claims from the UnitedHealth Group Clinical Discovery Database. We extend this association study by incorporating 120,609 COVID-19 patients from 11 collaborative data sources worldwide.
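The one-shot, lossless idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a random-intercept model with one intercept per site, holds the variance components `sigma2` and `tau2` fixed instead of estimating them, and assumes column 0 of each design matrix is the intercept. The function names (`site_summary`, `gls_from_summaries`, `gls_pooled`) and the simulated data are hypothetical. Under those assumptions, each site shares only its sample size, `X'X`, and `X'y`, and the fixed-effect estimates and standard errors match those computed from the pooled patient-level rows.

```python
import numpy as np


def site_summary(X, y):
    """Aggregates a site shares in one round; no patient-level rows leave the site."""
    return {"n": X.shape[0], "XtX": X.T @ X, "Xty": X.T @ y}


def gls_from_summaries(summaries, sigma2, tau2):
    """GLS fixed effects for a random-intercept model, from aggregates only.

    Assumption: column 0 of X is the intercept, so X'1 = XtX[:, 0] and 1'y = Xty[0].
    Uses the Sherman-Morrison identity
    (sigma2*I + tau2*11')^{-1} = (I - c*11') / sigma2, c = tau2 / (sigma2 + n*tau2).
    """
    p = summaries[0]["XtX"].shape[0]
    A = np.zeros((p, p))
    b = np.zeros(p)
    for s in summaries:
        c = tau2 / (sigma2 + s["n"] * tau2)
        sx = s["XtX"][:, 0]  # column sums of X
        sy = s["Xty"][0]     # sum of y
        A += (s["XtX"] - c * np.outer(sx, sx)) / sigma2
        b += (s["Xty"] - c * sx * sy) / sigma2
    beta = np.linalg.solve(A, b)
    se = np.sqrt(np.diag(np.linalg.inv(A)))
    return beta, se


def gls_pooled(sites, sigma2, tau2):
    """Reference computation that builds each site's covariance matrix from raw rows."""
    p = sites[0][0].shape[1]
    A = np.zeros((p, p))
    b = np.zeros(p)
    for X, y in sites:
        n = X.shape[0]
        V_inv = np.linalg.inv(sigma2 * np.eye(n) + tau2 * np.ones((n, n)))
        A += X.T @ V_inv @ X
        b += X.T @ V_inv @ y
    beta = np.linalg.solve(A, b)
    se = np.sqrt(np.diag(np.linalg.inv(A)))
    return beta, se


# Simulated data for three hypothetical sites (illustrative values only).
rng = np.random.default_rng(0)
sigma2, tau2 = 1.0, 0.5
beta_true = np.array([2.0, -1.0, 0.5])
sites = []
for n in (30, 50, 80):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    u = rng.normal(scale=np.sqrt(tau2))  # site-specific random intercept
    y = X @ beta_true + u + rng.normal(scale=np.sqrt(sigma2), size=n)
    sites.append((X, y))

beta_agg, se_agg = gls_from_summaries([site_summary(X, y) for X, y in sites], sigma2, tau2)
beta_pool, se_pool = gls_pooled(sites, sigma2, tau2)
print(np.allclose(beta_agg, beta_pool), np.allclose(se_agg, se_pool))
```

In this sketch the agreement is exact up to floating-point error because the aggregates are sufficient for the fixed-effect estimate at given variance components; estimating the variance components as well would additionally require each site's `y'y`.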

Suggested Citation

  • Chongliang Luo & Md. Nazmul Islam & Natalie E. Sheils & John Buresh & Jenna Reps & Martijn J. Schuemie & Patrick B. Ryan & Mackenzie Edmondson & Rui Duan & Jiayi Tong & Arielle Marks-Anglin & Jiang Bi, 2022. "DLMM as a lossless one-shot algorithm for collaborative multi-site distributed linear mixed models," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-29160-4
    DOI: 10.1038/s41467-022-29160-4

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-29160-4
    File Function: Abstract
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Helin Yang & Kwok-Yan Lam & Liang Xiao & Zehui Xiong & Hao Hu & Dusit Niyato & H. Vincent Poor, 2022. "Lead federated neuromorphic learning for wireless edge artificial intelligence," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    2. Miran Kim & Xiaoqian Jiang & Kristin Lauter & Elkhan Ismayilzada & Shayan Shams, 2022. "Secure human action recognition by encrypted neural network inference," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    3. Tao Qi & Fangzhao Wu & Chuhan Wu & Liang He & Yongfeng Huang & Xing Xie, 2023. "Differentially private knowledge transfer for federated learning," Nature Communications, Nature, vol. 14(1), pages 1-9, December.
    4. John M. Abowd & Ian M. Schmutte & William Sexton & Lars Vilhuber, 2019. "Suboptimal Provision of Privacy and Statistical Accuracy When They are Public Goods," Papers 1906.09353, arXiv.org.
    5. Ron S. Jarmin & John M. Abowd & Robert Ashmead & Ryan Cumings-Menon & Nathan Goldschlag & Michael B. Hawes & Sallie Ann Keller & Daniel Kifer & Philip Leclerc & Jerome P. Reiter & Rolando A. Rodrígue, 2023. "An in-depth examination of requirements for disclosure risk assessment," Proceedings of the National Academy of Sciences, Proceedings of the National Academy of Sciences, vol. 120(43), pages 2220558120-, October.
    6. Phill Wheat & Alexander D. Stead & William H. Greene, 2019. "Robust stochastic frontier analysis: a Student’s t-half normal model with application to highway maintenance costs in England," Journal of Productivity Analysis, Springer, vol. 51(1), pages 21-38, February.
    7. Raj Chetty & John N. Friedman, 2019. "A Practical Method to Reduce Privacy Loss When Disclosing Statistics Based on Small Samples," AEA Papers and Proceedings, American Economic Association, vol. 109, pages 414-420, May.
    8. John M. Abowd & Robert Ashmead & Ryan Cumings-Menon & Simson Garfinkel & Micah Heineck & Christine Heiss & Robert Johns & Daniel Kifer & Philip Leclerc & Ashwin Machanavajjhala & Brett Moran & William, 2022. "The 2020 Census Disclosure Avoidance System TopDown Algorithm," Papers 2204.08986, arXiv.org.
    9. Taining Wang & Jinjing Tian & Feng Yao, 2021. "Does high debt ratio influence Chinese firms’ performance? A semiparametric stochastic frontier approach with zero inefficiency," Empirical Economics, Springer, vol. 61(2), pages 587-636, August.
    10. Ori Heffetz & Katrina Ligett, 2014. "Privacy and Data-Based Research," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 75-98, Spring.
    11. Toth Daniell, 2014. "Data Smearing: An Approach to Disclosure Limitation for Tabular Data," Journal of Official Statistics, Sciendo, vol. 30(4), pages 839-857, December.
    12. Soumya Mukherjee & Aratrika Mustafi & Aleksandra Slavković & Lars Vilhuber, 2023. "Assessing Utility of Differential Privacy for RCTs," Papers 2309.14581, arXiv.org.
    13. Chuan Hong & Yang Ning & Shuang Wang & Hao Wu & Raymond J. Carroll & Yong Chen, 2017. "PLEMT: A Novel Pseudolikelihood-Based EM Test for Homogeneity in Generalized Exponential Tilt Mixture Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(520), pages 1393-1404, October.
    14. Katherine B. Coffman & Lucas C. Coffman & Keith M. Marzilli Ericson, 2017. "The Size of the LGBT Population and the Magnitude of Antigay Sentiment Are Substantially Underestimated," Management Science, INFORMS, vol. 63(10), pages 3168-3186, October.
    15. Seunghwa Rho & Peter Schmidt, 2015. "Are all firms inefficient?," Journal of Productivity Analysis, Springer, vol. 43(3), pages 327-349, June.
    16. Lalanne, Clément & Gadat, Sébastien, 2024. "Privately Learning Smooth Distributions on the Hypercube by Projections," TSE Working Papers 24-1505, Toulouse School of Economics (TSE).
    17. Kumbhakar, Subal C. & Parmeter, Christopher F. & Tsionas, Efthymios G., 2013. "A zero inefficiency stochastic frontier model," Journal of Econometrics, Elsevier, vol. 172(1), pages 66-76.
    18. Chang, Jinyuan & Hu, Qiao & Kolaczyk, Eric D. & Yao, Qiwei & Yi, Fengting, 2024. "Edge differentially private estimation in the β-model via jittering and method of moments," LSE Research Online Documents on Economics 122099, London School of Economics and Political Science, LSE Library.
    19. Ana-Maria Staicu & Yingxing Li & Ciprian M. Crainiceanu & David Ruppert, 2014. "Likelihood Ratio Tests for Dependent Data with Applications to Longitudinal and Functional Data Analysis," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 41(4), pages 932-949, December.
    20. Claire McKay Bowen & Fang Liu & Bingyue Su, 2021. "Differentially private data release via statistical election to partition sequentially," METRON, Springer;Sapienza Università di Roma, vol. 79(1), pages 1-31, April.
