A generalized Wilcoxon–Mann–Whitney type test for multivariate data through pairwise distance

Author

Listed:
  • Liu, Jiamin
  • Ma, Shuangge
  • Xu, Wangli
  • Zhu, Liping

Abstract

The Wilcoxon–Mann–Whitney test is designed to test the homogeneity of two random samples in the univariate case. It is very powerful for detecting location shifts, yet it may lose power completely when the two samples differ in scale. We generalize the classic Wilcoxon–Mann–Whitney test by using the pairwise distances among all observations. The generalized test can be readily used even when the observations are multivariate, and it is also very powerful in the presence of scale differences. In spirit, the generalized test compares the distribution functions of the two random samples. It is mn/(m+n)-consistent under the strong null and local alternatives, and root-mn/(m+n)-consistent under fixed alternatives, where m and n are the respective sizes of the two random samples. The power of the generalized test is asymptotically independent of m/n, the size ratio of the two random samples. This indicates that the generalized test has nontrivial power as long as the sample sizes are not extremely unbalanced. We demonstrate the theoretical properties of our improved rank-based two-sample test through comprehensive numerical studies.
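
The idea of testing homogeneity through pairwise distances can be illustrated with a short sketch. The abstract does not give the authors' statistic, so the code below does not reproduce it. As a stand-in it uses the well-known energy distance (built from within-sample and between-sample Euclidean distances) scaled by mn/(m+n), with a permutation p-value. The function names, the sample sizes, and the simulated data (two bivariate normal samples with the same location but different scale) are assumptions made purely for illustration.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distance between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def energy_statistic(x, y):
    """Energy distance between samples x and y, scaled by mn/(m+n).

    NOTE: this is a stand-in distance-based statistic, not the
    generalized Wilcoxon-Mann-Whitney statistic of the paper.
    """
    m, n = len(x), len(y)
    between = pairwise_dist(x, y).mean()
    within_x = pairwise_dist(x, x).mean()
    within_y = pairwise_dist(y, y).mean()
    return m * n / (m + n) * (2.0 * between - within_x - within_y)


def permutation_pvalue(x, y, n_perm=200, seed=0):
    """Permutation p-value: reshuffle the pooled sample and recompute."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    m = len(x)
    observed = energy_statistic(x, y)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        stat = energy_statistic(pooled[idx[:m]], pooled[idx[m:]])
        exceed += stat >= observed
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


# Hypothetical data: same location, different scale, dimension 2.
rng = np.random.default_rng(42)
x = rng.normal(loc=0.0, scale=1.0, size=(40, 2))
y = rng.normal(loc=0.0, scale=3.0, size=(40, 2))

p = permutation_pvalue(x, y)
print(f"permutation p-value under a pure scale difference: {p:.4f}")
```

Because the two simulated samples share the same center, a purely location-based comparison has little to work with here, whereas a statistic built from pairwise distances reacts to the difference in spread. This is the kind of scenario the abstract describes, shown under the stated assumptions only.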

Suggested Citation

  • Liu, Jiamin & Ma, Shuangge & Xu, Wangli & Zhu, Liping, 2022. "A generalized Wilcoxon–Mann–Whitney type test for multivariate data through pairwise distance," Journal of Multivariate Analysis, Elsevier, vol. 190(C).
  • Handle: RePEc:eee:jmvana:v:190:y:2022:i:c:s0047259x2200001x
    DOI: 10.1016/j.jmva.2022.104946

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X2200001X
    Download Restriction: Full text for ScienceDirect subscribers only

