Multivariate Goodness-of-Fit Tests Based on Wasserstein Distance

Authors

  • Hallin, Marc (ULB)
  • Mordant, Gilles (Université catholique de Louvain, LIDAM/ISBA, Belgium)
  • Segers, Johan (Université catholique de Louvain, LIDAM/ISBA, Belgium)

Abstract

Goodness-of-fit tests based on the empirical Wasserstein distance are proposed for simple and composite null hypotheses involving general multivariate distributions. For group families, the procedure is to be implemented after preliminary reduction of the data via invariance. This property allows for calculation of exact critical values and p-values at finite sample sizes. Applications include testing for location–scale families and testing for families arising from affine transformations, such as elliptical distributions with given standard radial density and unspecified location vector and scatter matrix. A novel test for multivariate normality with unspecified mean vector and covariance matrix arises as a special case. For more general parametric families, we propose a parametric bootstrap procedure to calculate critical values. The lack of asymptotic distribution theory for the empirical Wasserstein distance means that the validity of the parametric bootstrap under the null hypothesis remains a conjecture. Nevertheless, we show that the test is consistent against fixed alternatives. To this end, we prove a uniform law of large numbers for the empirical distribution in Wasserstein distance, where the uniformity is over any class of underlying distributions satisfying a uniform integrability condition but no additional moment assumptions. The calculation of test statistics boils down to solving the well-studied semi-discrete optimal transport problem. Extensive numerical experiments demonstrate the practical feasibility and the excellent performance of the proposed tests for the Wasserstein distance of order p = 1 and p = 2 and for dimensions at least up to d = 5. The simulations also lend support to the conjecture of the asymptotic validity of the parametric bootstrap.
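As a rough illustration of the invariance idea in the abstract, the sketch below treats only the simplest special case: univariate data, order p = 2, and the normal location-scale family. It is not the authors' implementation, and it does not solve the multivariate semi-discrete optimal transport problem; in one dimension the Wasserstein distance reduces to a distance between quantile functions, which is what the code uses. The function names (`w2_squared_to_std_normal`, `mc_p_value`) and the defaults `n_sim` and `seed` are hypothetical choices made for this example. Because the statistic is computed on the standardized sample, its null distribution does not depend on the unknown mean and variance, so a Monte Carlo p-value can be simulated from N(0, 1) at the actual sample size.

```python
import random
from functools import lru_cache
from statistics import NormalDist, fmean, pstdev

_STD_NORMAL = NormalDist()  # standard normal N(0, 1)


@lru_cache(maxsize=None)
def _weights(n):
    """Integrals of the N(0, 1) quantile function over ((i-1)/n, i/n).

    With z_i = Phi^{-1}(i/n), the integral equals phi(z_{i-1}) - phi(z_i),
    where phi is the normal density and phi(-inf) = phi(+inf) = 0.
    """
    pdf_at_cuts = (
        [0.0]
        + [_STD_NORMAL.pdf(_STD_NORMAL.inv_cdf(i / n)) for i in range(1, n)]
        + [0.0]
    )
    return tuple(pdf_at_cuts[i] - pdf_at_cuts[i + 1] for i in range(n))


def w2_squared_to_std_normal(sample):
    """Squared 2-Wasserstein distance between the empirical distribution of
    the standardized sample and N(0, 1) (univariate case only).

    Assumes a non-constant sample, so that the standard deviation is positive.
    """
    n = len(sample)
    mu, sigma = fmean(sample), pstdev(sample)
    z = sorted((x - mu) / sigma for x in sample)  # reduction by invariance
    second_moment = fmean([v * v for v in z])     # equals 1 up to rounding
    cross = sum(v * c for v, c in zip(z, _weights(n)))
    # Integral of (F_n^{-1}(u) - Phi^{-1}(u))^2 over u in (0, 1), expanded.
    return second_moment - 2.0 * cross + 1.0


def mc_p_value(sample, n_sim=500, seed=0):
    """Monte Carlo p-value for the composite null 'normal with unspecified
    mean and variance', simulating the statistic under N(0, 1)."""
    rng = random.Random(seed)
    n = len(sample)
    t_obs = w2_squared_to_std_normal(sample)
    exceed = sum(
        w2_squared_to_std_normal([rng.gauss(0.0, 1.0) for _ in range(n)]) >= t_obs
        for _ in range(n_sim)
    )
    return (1 + exceed) / (n_sim + 1)
```

Calling `mc_p_value(data)` on a list of floats returns a value in (0, 1]; small values speak against normality. The multivariate tests described in the abstract require an optimal transport solver and a multivariate standardization, neither of which is attempted here.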

Suggested Citation

  • Hallin, Marc & Mordant, Gilles & Segers, Johan, 2021. "Multivariate Goodness-of-Fit Tests Based on Wasserstein Distance," LIDAM Reprints ISBA 2021005, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
  • Handle: RePEc:aiz:louvar:2021005
    DOI: https://doi.org/10.1214/21-EJS1816
    Note: In: Electronic Journal of Statistics, Vol. 15, no. 1, p. 1328-1371 (2021)
    Citations

    Cited by:

    1. Chen, Feifei & Jiménez–Gamero, M. Dolores & Meintanis, Simos & Zhu, Lixing, 2022. "A general Monte Carlo method for multivariate goodness–of–fit testing applied to elliptical families," Computational Statistics & Data Analysis, Elsevier, vol. 175(C).
    2. Opher Baron & Dmitry Krass & Arik Senderovich & Eliran Sherzer, 2024. "Supervised ML for Solving the GI/GI/1 Queue," INFORMS Journal on Computing, INFORMS, vol. 36(3), pages 766-786, May.
    3. Solveig Flaig & Gero Junike, 2022. "Scenario Generation for Market Risk Models Using Generative Neural Networks," Risks, MDPI, vol. 10(11), pages 1-28, October.
    4. Solveig Flaig & Gero Junike, 2021. "Scenario generation for market risk models using generative neural networks," Papers 2109.10072, arXiv.org, revised Aug 2023.
    5. Hongjian Shi & Marc Hallin & Mathias Drton & Fang Han, 2020. "Rate-Optimality of Consistent Distribution-Free Tests of Independence Based on Center-Outward Ranks and Signs," Working Papers ECARES 2020-23, ULB -- Universite Libre de Bruxelles.
    6. Bagkavos, Dimitrios & Patil, Prakash N. & Wood, Andrew T.A., 2023. "Nonparametric goodness-of-fit testing for a continuous multivariate parametric model," Journal of Multivariate Analysis, Elsevier, vol. 196(C).
    7. Marc Hallin & H Lui & Thomas Verdebout, 2022. "Nonparametric Measure-transportation-based Methods for Directional Data," Working Papers ECARES 2022-18, ULB -- Universite Libre de Bruxelles.
    8. Fraiman, Ricardo & Moreno, Leonardo & Ransford, Thomas, 2023. "A Cramér–Wold theorem for elliptical distributions," Journal of Multivariate Analysis, Elsevier, vol. 196(C).
