Two-sample test in high dimensions through random selection

Author

Listed:
  • Qiu, Tao
  • Xu, Wangli
  • Zhu, Liping

Abstract

Testing the equality of two mean vectors for high-dimensional data is a fundamental problem in statistics, and over the past two decades much effort has been devoted to comparing the mean vectors of two populations. Many existing tests rely on naive diagonal or trace estimators of the covariance matrix and so ignore the dependence structure between variables. To exploit that dependence, a new nonparametric test based on random selections is proposed for comparing the population mean vectors of nonnormal high-dimensional multivariate data; it uses the covariance structure more efficiently when the variables are dependent. The asymptotic null distribution of the proposed test is standard normal, regardless of the parent distributions of the random samples and of the relation between the data dimension and the sample sizes. Extensive simulations show that the power of the proposed test compares favorably with that of some existing methods.
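
The abstract does not state the test statistic, so the following is only a minimal sketch of the general idea of testing through randomly selected variables, not the authors' procedure. It assumes a hypothetical scheme: draw random subsets of k coordinates, compute a Hotelling-type statistic on each subset (which uses the within-subset covariance), average the results, and calibrate by permutation. The proposed test instead has a standard normal asymptotic null distribution. The function names, the subset size k, and the permutation calibration are all assumptions made for illustration.

```python
import numpy as np


def hotelling_t2(x, y):
    """Two-sample Hotelling T^2 on low-dimensional data (rows are observations)."""
    n, m = x.shape[0], y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled sample covariance; np.atleast_2d covers the single-column case.
    pooled = ((n - 1) * np.cov(x, rowvar=False)
              + (m - 1) * np.cov(y, rowvar=False)) / (n + m - 2)
    pooled = np.atleast_2d(pooled)
    return (n * m) / (n + m) * diff @ np.linalg.solve(pooled, diff)


def random_subset_statistic(x, y, subsets):
    """Average the T^2 statistic over the given coordinate subsets."""
    return float(np.mean([hotelling_t2(x[:, s], y[:, s]) for s in subsets]))


def random_subset_test(x, y, k=5, n_subsets=50, n_perm=200, seed=0):
    """Illustrative random-selection test with a permutation p-value.

    k must be small relative to n + m - 2 so the pooled covariance of each
    subset is invertible (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    p = x.shape[1]
    # The same random subsets are reused for every permutation.
    subsets = [rng.choice(p, size=k, replace=False) for _ in range(n_subsets)]
    observed = random_subset_statistic(x, y, subsets)

    stacked = np.vstack([x, y])
    n = x.shape[0]
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(stacked.shape[0])
        xs, ys = stacked[perm[:n]], stacked[perm[n:]]
        if random_subset_statistic(xs, ys, subsets) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

With samples of size 20 each and dimension 50, calling random_subset_test(x, y) returns the averaged statistic and a permutation p-value. Because each subset has only k coordinates, every covariance matrix that is inverted stays well-conditioned even though the full dimension exceeds the sample sizes.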

Suggested Citation

  • Qiu, Tao & Xu, Wangli & Zhu, Liping, 2021. "Two-sample test in high dimensions through random selection," Computational Statistics & Data Analysis, Elsevier, vol. 160(C).
  • Handle: RePEc:eee:csdana:v:160:y:2021:i:c:s0167947321000529
    DOI: 10.1016/j.csda.2021.107218

    References listed on IDEAS

    1. Thulin, Måns, 2014. "A high-dimensional two-sample test for the mean using random subspaces," Computational Statistics & Data Analysis, Elsevier, vol. 74(C), pages 26-38.
    2. Chen, Song Xi & Zhang, Li-Xin & Zhong, Ping-Shou, 2010. "Tests for High-Dimensional Covariance Matrices," Journal of the American Statistical Association, American Statistical Association, vol. 105(490), pages 810-819.
    3. Lan Wang & Bo Peng & Runze Li, 2015. "A High-Dimensional Nonparametric Multivariate Test for Mean Vector," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1658-1669, December.
    4. Zhang, Jie & Pan, Meng, 2016. "A high-dimension two-sample test for the mean using cluster subspaces," Computational Statistics & Data Analysis, Elsevier, vol. 97(C), pages 87-97.
    5. Marco Marozzi & Amitava Mukherjee & Jan Kalina, 2020. "Interpoint distance tests for high-dimensional comparison studies," Journal of Applied Statistics, Taylor & Francis Journals, vol. 47(4), pages 653-665, March.
    6. Chen, Song Xi & Qin, Yingli, 2010. "A Two Sample Test for High Dimensional Data with Applications to Gene-set Testing," MPRA Paper 59642, University Library of Munich, Germany.
    7. Srivastava, Muni S. & Du, Meng, 2008. "A test for the mean vector with fewer observations than the dimension," Journal of Multivariate Analysis, Elsevier, vol. 99(3), pages 386-402, March.

    Citations

    Cited by:

    1. Michal Krejčí & Michaela Staňková, 2022. "The Position of Netflix in the Czech Republic Before and During the COVID-19 Pandemic," European Journal of Business Science and Technology, Mendel University in Brno, Faculty of Business and Economics, vol. 8(1), pages 72-83.
    2. Harrar, Solomon W. & Kong, Xiaoli, 2022. "Recent developments in high-dimensional inference for multivariate data: Parametric, semiparametric and nonparametric approaches," Journal of Multivariate Analysis, Elsevier, vol. 188(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Harrar, Solomon W. & Kong, Xiaoli, 2022. "Recent developments in high-dimensional inference for multivariate data: Parametric, semiparametric and nonparametric approaches," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    2. Wang, Wei & Lin, Nan & Tang, Xiang, 2019. "Robust two-sample test of high-dimensional mean vectors under dependence," Journal of Multivariate Analysis, Elsevier, vol. 169(C), pages 312-329.
    3. Zhang, Jin-Ting & Zhou, Bu & Guo, Jia, 2022. "Linear hypothesis testing in high-dimensional heteroscedastic one-way MANOVA: A normal reference L2-norm based test," Journal of Multivariate Analysis, Elsevier, vol. 187(C).
    4. Huang, Yuan & Li, Changcheng & Li, Runze & Yang, Songshan, 2022. "An overview of tests on high-dimensional means," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    5. Li, Yang & Wang, Zhaojun & Zou, Changliang, 2016. "A simpler spatial-sign-based two-sample test for high-dimensional data," Journal of Multivariate Analysis, Elsevier, vol. 149(C), pages 192-198.
    6. Tzviel Frostig & Yoav Benjamini, 2022. "Testing the equality of multivariate means when p > n by combining the Hotelling and Simes tests," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(2), pages 390-415, June.
    7. Saha, Enakshi & Sarkar, Soham & Ghosh, Anil K., 2017. "Some high-dimensional one-sample tests based on functions of interpoint distances," Journal of Multivariate Analysis, Elsevier, vol. 161(C), pages 83-95.
    8. Feng, Long & Sun, Fasheng, 2015. "A note on high-dimensional two-sample test," Statistics & Probability Letters, Elsevier, vol. 105(C), pages 29-36.
    9. Zhang, Jie & Pan, Meng, 2016. "A high-dimension two-sample test for the mean using cluster subspaces," Computational Statistics & Data Analysis, Elsevier, vol. 97(C), pages 87-97.
    10. Yin, Yanqing, 2021. "Test for high-dimensional mean vector under missing observations," Journal of Multivariate Analysis, Elsevier, vol. 186(C).
    11. Feng, Long & Zhang, Xiaoxu & Liu, Binghui, 2020. "A high-dimensional spatial rank test for two-sample location problems," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    12. Li, Jun, 2023. "Finite sample t-tests for high-dimensional means," Journal of Multivariate Analysis, Elsevier, vol. 196(C).
    13. Zhang, Jin-Ting & Guo, Jia & Zhou, Bu, 2017. "Linear hypothesis testing in high-dimensional one-way MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 155(C), pages 200-216.
    14. Pini, Alessia & Stamm, Aymeric & Vantini, Simone, 2018. "Hotelling’s T² in separable Hilbert spaces," Journal of Multivariate Analysis, Elsevier, vol. 167(C), pages 284-305.
    15. Yuanyuan Jiang & Xingzhong Xu, 2022. "A Two-Sample Test of High Dimensional Means Based on Posterior Bayes Factor," Mathematics, MDPI, vol. 10(10), pages 1-23, May.
    16. Davy Paindaveine & Thomas Verdebout, 2013. "Universal Asymptotics for High-Dimensional Sign Tests," Working Papers ECARES ECARES 2013-40, ULB -- Universite Libre de Bruxelles.
    17. Wang, Rui & Xu, Xingzhong, 2018. "On two-sample mean tests under spiked covariances," Journal of Multivariate Analysis, Elsevier, vol. 167(C), pages 225-249.
    18. Mingxiang Cao & Yuanjing He, 2022. "A high-dimensional test on linear hypothesis of means under a low-dimensional factor model," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 85(5), pages 557-572, July.
    19. Zhao, Junguang & Xu, Xingzhong, 2016. "A generalized likelihood ratio test for normal mean when p is greater than n," Computational Statistics & Data Analysis, Elsevier, vol. 99(C), pages 91-104.
    20. Jin-Ting Zhang & Bu Zhou & Jia Guo, 2022. "Testing high-dimensional mean vector with applications," Statistical Papers, Springer, vol. 63(4), pages 1105-1137, August.
