
Distribution-on-distribution regression with Wasserstein metric: Multivariate Gaussian case

Author

Listed:
  • Okano, Ryo
  • Imaizumi, Masaaki

Abstract

Distribution data refer to a data set in which each sample is represented as a probability distribution, a subject area that has received increasing interest in the field of statistics. Although several studies have developed distribution-to-distribution regression models for univariate variables, the multivariate scenario remains under-explored due to technical complexities. In this study, we introduce models for regression from one Gaussian distribution to another, using the Wasserstein metric. These models are constructed using the geometry of the Wasserstein space, which enables the transformation of Gaussian distributions into components of a linear matrix space. Owing to their linear regression frameworks, our models are intuitively understandable, and their implementation is simplified because of the optimal transport problem’s analytical solution between Gaussian distributions. We also explore a generalization of our models to encompass non-Gaussian scenarios. We establish the convergence rates of in-sample prediction errors for the empirical risk minimizations in our models. In comparative simulation experiments, our models demonstrate superior performance over a simpler alternative method that transforms Gaussian distributions into matrices. We present an application of our methodology using weather data for illustration purposes.
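The analytical optimal transport solution between Gaussians that the abstract refers to can be illustrated with a minimal sketch. This is not the authors' implementation. It uses the standard closed-form expressions for the 2-Wasserstein distance and the optimal transport map between two Gaussian distributions, and assumes NumPy is available. The function names (`spd_sqrt`, `gaussian_w2_squared`, `gaussian_ot_map`, `log_map`) are hypothetical, and `log_map` shows only one common way to send a Gaussian to a pair (vector, symmetric matrix) in a linear space relative to a reference Gaussian; the paper's exact construction may differ.

```python
import numpy as np


def spd_sqrt(mat):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def gaussian_w2_squared(m1, s1, m2, s2):
    """Squared 2-Wasserstein distance between N(m1, s1) and N(m2, s2).

    Closed form: |m1 - m2|^2 + tr(s1 + s2 - 2 (s1^{1/2} s2 s1^{1/2})^{1/2}).
    """
    root1 = spd_sqrt(s1)
    cross = spd_sqrt(root1 @ s2 @ root1)
    return float(np.sum((m1 - m2) ** 2) + np.trace(s1 + s2 - 2.0 * cross))


def gaussian_ot_map(m1, s1, m2, s2):
    """Return (A, b) such that T(x) = A @ x + b pushes N(m1, s1) to N(m2, s2).

    A = s1^{-1/2} (s1^{1/2} s2 s1^{1/2})^{1/2} s1^{-1/2} is symmetric.
    Assumes s1 is strictly positive definite so that its root is invertible.
    """
    root1 = spd_sqrt(s1)
    inv_root1 = np.linalg.inv(root1)
    a = inv_root1 @ spd_sqrt(root1 @ s2 @ root1) @ inv_root1
    return a, m2 - a @ m1


def log_map(m_ref, s_ref, m, s):
    """Linear-space coordinates of N(m, s) relative to the reference N(m_ref, s_ref).

    Returns (mean shift, A - I), where A is the linear part of the optimal map
    from the reference to N(m, s). Hypothetical helper for illustration only.
    """
    a, _ = gaussian_ot_map(m_ref, s_ref, m, s)
    return m - m_ref, a - np.eye(len(m_ref))


# Example inputs (made up for illustration): standard normal vs. a shifted,
# axis-scaled Gaussian in two dimensions.
m_ref, s_ref = np.zeros(2), np.eye(2)
m, s = np.array([1.0, 2.0]), np.diag([4.0, 9.0])

dist_sq = gaussian_w2_squared(m_ref, s_ref, m, s)
shift, sym_part = log_map(m_ref, s_ref, m, s)
```

Once each distribution is represented by such a pair, ordinary linear regression tools can be applied to the coordinates, which is the intuition behind working in a linear matrix space.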

Suggested Citation

  • Okano, Ryo & Imaizumi, Masaaki, 2024. "Distribution-on-distribution regression with Wasserstein metric: Multivariate Gaussian case," Journal of Multivariate Analysis, Elsevier, vol. 203(C).
  • Handle: RePEc:eee:jmvana:v:203:y:2024:i:c:s0047259x24000411
    DOI: 10.1016/j.jmva.2024.105334

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X24000411
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.jmva.2024.105334?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Alexander Petersen & Hans-Georg Müller, 2019. "Wasserstein covariance for multiple random densities," Biometrika, Biometrika Trust, vol. 106(2), pages 339-351.
    2. Hua Zhou & Lexin Li & Hongtu Zhu, 2013. "Tensor Regression with Applications in Neuroimaging Data Analysis," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(502), pages 540-552, June.
    3. Chao Zhang & Piotr Kokoszka & Alexander Petersen, 2022. "Wasserstein autoregressive models for density time series," Journal of Time Series Analysis, Wiley Blackwell, vol. 43(1), pages 30-52, January.
    4. Elsa Cazelles & Vivien Seguy & Jérémie Bigot & Marco Cuturi & Nicolas Papadakis, 2017. "Log-PCA versus Geodesic PCA of histograms in the Wasserstein space," Working Papers 2017-85, Center for Research in Economics and Statistics.
    5. Laya Ghodrati & Victor M Panaretos, 2022. "Distribution-on-distribution regression via optimal transport maps," Biometrika, Biometrika Trust, vol. 109(4), pages 957-974.
    6. Petersen, Alexander & Zhang, Chao & Kokoszka, Piotr, 2022. "Modeling Probability Density Functions as Data Objects," Econometrics and Statistics, Elsevier, vol. 21(C), pages 159-178.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ghodrati, Laya & Panaretos, Victor M., 2023. "Minimax rate for optimal transport regression between distributions," Statistics & Probability Letters, Elsevier, vol. 194(C).
    2. Florian Gunsilius & Meng Hsuan Hsieh & Myung Jin Lee, 2022. "Tangential Wasserstein Projections," Papers 2207.14727, arXiv.org, revised Aug 2022.
    3. Tadao Hoshino, 2024. "Functional Spatial Autoregressive Models," Papers 2402.14763, arXiv.org, revised Oct 2024.
    4. Zhang, Qi & Li, Bing & Xue, Lingzhou, 2024. "Nonlinear sufficient dimension reduction for distribution-on-distribution regression," Journal of Multivariate Analysis, Elsevier, vol. 202(C).
    5. Petersen, Alexander & Zhang, Chao & Kokoszka, Piotr, 2022. "Modeling Probability Density Functions as Data Objects," Econometrics and Statistics, Elsevier, vol. 21(C), pages 159-178.
    6. Chao Zhang & Piotr Kokoszka & Alexander Petersen, 2022. "Wasserstein autoregressive models for density time series," Journal of Time Series Analysis, Wiley Blackwell, vol. 43(1), pages 30-52, January.
    7. Lin Liu, 2021. "Matrix‐based introduction to multivariate data analysis, by Kohei Adachi, 2nd edition. Singapore: Springer Nature, 2020. pp. 457," Biometrics, The International Biometric Society, vol. 77(4), pages 1498-1500, December.
    8. Cui Guo & Jian Kang & Timothy D. Johnson, 2022. "A spatial Bayesian latent factor model for image‐on‐image regression," Biometrics, The International Biometric Society, vol. 78(1), pages 72-84, March.
    9. Hayato Maki & Sakriani Sakti & Hiroki Tanaka & Satoshi Nakamura, 2018. "Quality prediction of synthesized speech based on tensor structured EEG signals," PLOS ONE, Public Library of Science, vol. 13(6), pages 1-13, June.
    10. Kim, Jonathan & Sandri, Brian J. & Rao, Raghavendra B. & Lock, Eric F., 2023. "Bayesian predictive modeling of multi-source multi-way data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
    11. Chelsey Hill & James Li & Matthew J. Schneider & Martin T. Wells, 2021. "The tensor auto‐regressive model," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 40(4), pages 636-652, July.
    12. Kai Deng & Xin Zhang, 2022. "Tensor envelope mixture model for simultaneous clustering and multiway dimension reduction," Biometrics, The International Biometric Society, vol. 78(3), pages 1067-1079, September.
    13. Lan Liu & Wei Li & Zhihua Su & Dennis Cook & Luca Vizioli & Essa Yacoub, 2022. "Efficient estimation via envelope chain in magnetic resonance imaging‐based studies," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(2), pages 481-501, June.
    14. Will Wei Sun & Junwei Lu & Han Liu & Guang Cheng, 2017. "Provable sparse tensor decomposition," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(3), pages 899-916, June.
    15. Zhu, Changbo & Müller, Hans-Georg, 2024. "Spherical autoregressive models, with application to distributional and compositional time series," Journal of Econometrics, Elsevier, vol. 239(2).
    16. Feiyang Han & Yimin Wei & Pengpeng Xie, 2024. "Regularized and Structured Tensor Total Least Squares Methods with Applications," Journal of Optimization Theory and Applications, Springer, vol. 202(3), pages 1101-1136, September.
    17. Le Brigant, Alice & Puechmorel, Stéphane, 2019. "Quantization and clustering on Riemannian manifolds with an application to air traffic analysis," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 685-703.
    18. Bulté, Matthieu & Sørensen, Helle, 2024. "Medoid splits for efficient random forests in metric spaces," Computational Statistics & Data Analysis, Elsevier, vol. 198(C).
    19. Kenneth W. Latimer & David J. Freedman, 2023. "Low-dimensional encoding of decisions in parietal cortex reflects long-term training history," Nature Communications, Nature, vol. 14(1), pages 1-24, December.
    20. Giuseppe Brandi & T. Di Matteo, 2020. "A new multilayer network construction via Tensor learning," Papers 2004.05367, arXiv.org.
