
Model-free two-sample test for network-valued data

Author

Listed:
  • Lovato, Ilenia
  • Pini, Alessia
  • Stamm, Aymeric
  • Vantini, Simone

Abstract

In the framework of Object Oriented Data Analysis, a permutation approach to the two-sample testing problem for network-valued data is proposed. The framework proceeds in four steps: (i) matrix representation of the networks, (ii) computation of the matrix of pairwise (inter-point) distances, (iii) computation of test statistics based on inter-point distances, and (iv) embedding of the test statistics within a permutation test. The proposed testing procedures are proven to be exact for every finite sample size and consistent. Two new test statistics based on inter-point distances (IP-Student and IP-Fisher) are defined, and a method to combine them into a further inferential tool (IP-StudentFisher) is introduced. A simulation study shows that, compared with other statistics, tests based on the proposed statistics achieve a statistical power that is either the best or a close second-best across a variety of alternative hypotheses. A second simulation study is presented that aims at better understanding which features are captured by specific combinations of matrix representations and distances. Finally, a case study on mobility networks in the city of Milan is carried out. The proposed framework is fully implemented in the R package nevada (NEtwork-VAlued Data Analysis).
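
The four steps listed in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that do not come from the paper: networks are represented by their adjacency matrices, the distance is the Frobenius distance, and the statistic is a simple between-minus-within contrast of mean inter-point distances. That contrast is a stand-in chosen for illustration; it is not IP-Student or IP-Fisher, whose definitions are not given on this page. All function names are hypothetical and do not mirror the API of the nevada package. The p-value uses the (count + 1) / (n_perm + 1) form, so it is never zero when permutations are drawn at random.

```python
import numpy as np


def random_graph(n_nodes, p, rng):
    """Step (i): adjacency-matrix representation of an undirected random graph.

    Illustrative data generator (independent edges with probability p).
    """
    upper = np.triu(rng.random((n_nodes, n_nodes)) < p, k=1)
    return (upper | upper.T).astype(float)


def pairwise_frobenius(mats):
    """Step (ii): matrix of pairwise Frobenius distances between matrices."""
    flat = np.asarray(mats, dtype=float).reshape(len(mats), -1)
    diff = flat[:, None, :] - flat[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def between_minus_within(dist, labels):
    """Step (iii): stand-in statistic based on inter-point distances.

    Mean between-group distance minus the average of the two mean
    within-group distances. Assumes at least two networks per group.
    """
    a = np.flatnonzero(labels == 0)
    b = np.flatnonzero(labels == 1)
    between = dist[np.ix_(a, b)].mean()
    within_a = dist[np.ix_(a, a)][np.triu_indices(len(a), k=1)].mean()
    within_b = dist[np.ix_(b, b)][np.triu_indices(len(b), k=1)].mean()
    return between - 0.5 * (within_a + within_b)


def permutation_test(mats_x, mats_y, n_perm=999, seed=0):
    """Step (iv): embed the statistic in a permutation test on group labels."""
    rng = np.random.default_rng(seed)
    dist = pairwise_frobenius(list(mats_x) + list(mats_y))
    labels = np.array([0] * len(mats_x) + [1] * len(mats_y))
    observed = between_minus_within(dist, labels)
    count = 0
    for _ in range(n_perm):
        # The distance matrix is computed once; only the labels are shuffled.
        permuted = rng.permutation(labels)
        if between_minus_within(dist, permuted) >= observed:
            count += 1
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    sample_x = [random_graph(10, 0.1, rng) for _ in range(10)]
    sample_y = [random_graph(10, 0.6, rng) for _ in range(10)]
    stat, p_value = permutation_test(sample_x, sample_y)
    print(f"statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

Because the distance matrix is computed only once, swapping in a different matrix representation or distance changes only steps (i) and (ii); the permutation loop is unaffected.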

Suggested Citation

  • Lovato, Ilenia & Pini, Alessia & Stamm, Aymeric & Vantini, Simone, 2020. "Model-free two-sample test for network-valued data," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
  • Handle: RePEc:eee:csdana:v:144:y:2020:i:c:s0167947319302518
    DOI: 10.1016/j.csda.2019.106896

    References listed on IDEAS

    1. Biswas, Munmun & Ghosh, Anil K., 2014. "A nonparametric two-sample test applicable to high dimensional data," Journal of Multivariate Analysis, Elsevier, vol. 123(C), pages 160-171.
    2. Menafoglio, Alessandra & Secchi, Piercesare, 2017. "Statistical analysis of complex and spatially dependent data: A review of Object Oriented Spatial Statistics," European Journal of Operational Research, Elsevier, vol. 258(2), pages 401-410.
    3. Paul R. Rosenbaum, 2005. "An exact distribution‐free test comparing two multivariate distributions based on adjacency," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(4), pages 515-530, September.
    4. Davide Pigoli & John A. D. Aston & Ian L. Dryden & Piercesare Secchi, 2014. "Distances and inference for covariance operators," Biometrika, Biometrika Trust, vol. 101(2), pages 409-422.
    5. N. Locantore & J. Marron & D. Simpson & N. Tripoli & J. Zhang & K. Cohen & Graciela Boente & Ricardo Fraiman & Babette Brumback & Christophe Croux & Jianqing Fan & Alois Kneip & John Marden & Daniel P, 1999. "Robust principal component analysis for functional data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 8(1), pages 1-73, June.
    6. Phipson Belinda & Smyth Gordon K, 2010. "Permutation P-values Should Never Be Zero: Calculating Exact P-values When Permutations Are Randomly Drawn," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-16, October.
    7. Daniele Durante & David B. Dunson & Joshua T. Vogelstein, 2017. "Rejoinder: Nonparametric Bayes Modeling of Populations of Networks," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(520), pages 1547-1552, October.
    8. Daniele Durante & David B. Dunson & Joshua T. Vogelstein, 2017. "Nonparametric Bayes Modeling of Populations of Networks," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(520), pages 1516-1530, October.
    9. Zhenyu Liu & Reza Modarres, 2011. "A triangle test for equality of distribution functions in high dimensions," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 23(3), pages 605-615.
    10. Tom M W Nye & Xiaoxian Tang & Grady Weyenberg & Ruriko Yoshida, 2017. "Principal component analysis and the locus of the Fréchet mean in the space of phylogenetic trees," Biometrika, Biometrika Trust, vol. 104(4), pages 901-922.
    11. Ann E. Krause & Kenneth A. Frank & Doran M. Mason & Robert E. Ulanowicz & William W. Taylor, 2003. "Compartments revealed in food-web structure," Nature, Nature, vol. 426(6964), pages 282-285, November.
    12. Comellas, Francesc & Diaz-Lopez, Jordi, 2008. "Spectral reconstruction of complex networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 387(25), pages 6436-6442.
    13. Brombin, Chiara & Salmaso, Luigi, 2009. "Multi-aspect permutation tests in shape analysis with small sample size," Computational Statistics & Data Analysis, Elsevier, vol. 53(12), pages 3921-3931, October.
    14. Gabor J. Szekely & Maria L. Rizzo, 2005. "Hierarchical Clustering via Joint Between-Within Distances: Extending Ward's Minimum Variance Method," Journal of Classification, Springer;The Classification Society, vol. 22(2), pages 151-183, September.
    15. Hao Chen & Jerome H. Friedman, 2017. "A New Graph-Based Two-Sample Test for Multivariate and Object Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(517), pages 397-409, January.
    16. Szekely, Gábor J. & Rizzo, Maria L., 2005. "A new test for multivariate normality," Journal of Multivariate Analysis, Elsevier, vol. 93(1), pages 58-80, March.
    17. Peter Hall, 2002. "Permutation tests for equality of distributions in high-dimensional settings," Biometrika, Biometrika Trust, vol. 89(2), pages 359-374, June.

    Cited by:

    1. Ilenia Lovato & Alessia Pini & Aymeric Stamm & Maxime Taquet & Simone Vantini, 2021. "Multiscale null hypothesis testing for network‐valued data: Analysis of brain networks of patients with autism," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(2), pages 372-397, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shin-ichi Tsukada, 2019. "High dimensional two-sample test based on the inter-point distance," Computational Statistics, Springer, vol. 34(2), pages 599-615, June.
    2. Mondal, Pronoy K. & Biswas, Munmun & Ghosh, Anil K., 2015. "On high dimensional two-sample tests based on nearest neighbors," Journal of Multivariate Analysis, Elsevier, vol. 141(C), pages 168-178.
    3. Reza Modarres, 2020. "Graphical Comparison of High‐Dimensional Distributions," International Statistical Review, International Statistical Institute, vol. 88(3), pages 698-714, December.
    4. Paul, Biplab & De, Shyamal K. & Ghosh, Anil K., 2022. "Some clustering-based exact distribution-free k-sample tests applicable to high dimension, low sample size data," Journal of Multivariate Analysis, Elsevier, vol. 190(C).
    5. Qiu, Tao & Zhang, Qintong & Fang, Yuanyuan & Xu, Wangli, 2024. "Testing homogeneity in high dimensional data through random projections," Journal of Multivariate Analysis, Elsevier, vol. 200(C).
    6. Stefano Bonnini & Getnet Melak Assegie & Kamila Trzcinska, 2024. "Review about the Permutation Approach in Hypothesis Testing," Mathematics, MDPI, vol. 12(17), pages 1-29, August.
    7. Pini, Alessia & Stamm, Aymeric & Vantini, Simone, 2018. "Hotelling’s $T^2$ in separable Hilbert spaces," Journal of Multivariate Analysis, Elsevier, vol. 167(C), pages 284-305.
    8. Reza Modarres, 2024. "Hotelling $T^2$ test in high dimensions with application to Wilks outlier method," Statistical Papers, Springer, vol. 65(8), pages 5203-5218, October.
    9. Biswas, Munmun & Ghosh, Anil K., 2014. "A nonparametric two-sample test applicable to high dimensional data," Journal of Multivariate Analysis, Elsevier, vol. 123(C), pages 160-171.
    10. Reza Modarres & Yu Song, 2020. "Multivariate power series interpoint distances," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 29(4), pages 955-982, December.
    11. Ilenia Lovato & Alessia Pini & Aymeric Stamm & Maxime Taquet & Simone Vantini, 2021. "Multiscale null hypothesis testing for network‐valued data: Analysis of brain networks of patients with autism," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(2), pages 372-397, March.
    12. Chung, Jaewon & Bridgeford, Eric & Arroyo, Jesus & Pedigo, Benjamin D. & Saad-Eldin, Ali & Gopalakrishnan, Vivek & Xiang, Liang & Priebe, Carey E. & Vogelstein, Joshua T., 2020. "Statistical Connectomics," OSF Preprints ek4n3, Center for Open Science.
    13. Quessy, Jean-François, 2021. "A Szekely–Rizzo inequality for testing general copula homogeneity hypotheses," Journal of Multivariate Analysis, Elsevier, vol. 186(C).
    14. Nicolas Städler & Sach Mukherjee, 2017. "Two-sample testing in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(1), pages 225-246, January.
    15. Rizzo, Maria L. & Haman, John T., 2016. "Expected distances and goodness-of-fit for the asymmetric Laplace distribution," Statistics & Probability Letters, Elsevier, vol. 117(C), pages 158-164.
    16. Boente, Graciela & Rodriguez, Daniela & Sued, Mariela, 2019. "The spatial sign covariance operator: Asymptotic results and applications," Journal of Multivariate Analysis, Elsevier, vol. 170(C), pages 115-128.
    17. Jiang, Qing & Hušková, Marie & Meintanis, Simos G. & Zhu, Lixing, 2019. "Asymptotics, finite-sample comparisons and applications for two-sample tests with functional data," Journal of Multivariate Analysis, Elsevier, vol. 170(C), pages 202-220.
    18. Linardi, Fernando & Diks, Cees & van der Leij, Marco & Lazier, Iuri, 2020. "Dynamic interbank network analysis using latent space models," Journal of Economic Dynamics and Control, Elsevier, vol. 112(C).
    19. Jun Li, 2018. "Asymptotic normality of interpoint distances for high-dimensional data with applications to the two-sample problem," Biometrika, Biometrika Trust, vol. 105(3), pages 529-546.
    20. Zhi Peng Ong & Aixiang Andy Chen & Tianming Zhu & Jin-Ting Zhang, 2023. "Testing Equality of Several Distributions at High Dimensions: A Maximum-Mean-Discrepancy-Based Approach," Mathematics, MDPI, vol. 11(20), pages 1-21, October.
