
Permutation testing for thick data when the number of variables is much greater than the sample size: recent developments and some recommendations

Authors

  • Patrick B. Langthaler (University of Salzburg; Paracelsus Medical University)
  • Riccardo Ceccato (University of Padova)
  • Luigi Salmaso (University of Padova)
  • Rosa Arboretti (University of Padova)
  • Arne C. Bathke (Paracelsus Medical University; University of Salzburg)

Abstract

In many scientific disciplines, datasets contain many more variables than observational units (so-called thick data). A common hypothesis of interest in this setting is the global null hypothesis of no difference in multivariate distribution between experimental or observational groups. Several permutation-based nonparametric tests have been proposed for this hypothesis. In this paper, we investigate differences in performance between methods for testing thick data. In particular, we focus on an extension of the nonparametric combination procedure (NPC) proposed by Pesarin and Salmaso, a rank-based approach by Ellis, Burchett, Harrar and Bathke, and a distance-based approach by Mielke. We also explore the effect of different combining procedures on the NPC. Finally, we illustrate the use of these methods on a real-life dataset.
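
To make the idea of a permutation-based global test concrete, here is a minimal sketch of a basic two-sample nonparametric combination test in Python. It is not the extension studied in the paper, nor the rank-based or distance-based approaches. The function names, the choice of the absolute difference in group means as the per-variable statistic, and the use of Fisher's combining function are all assumptions made for illustration. The key point it shows is that whole observations (rows) are permuted, so the dependence between variables is preserved, and that the same permutations are reused to obtain both the partial p-values and the combined global p-value.

```python
import math
import random
from bisect import bisect_left


def partial_stats(rows, n1):
    """Absolute difference in group means, one value per variable.

    The first n1 rows are treated as group 1, the rest as group 2.
    (Illustrative choice of statistic, not taken from the paper.)
    """
    n2 = len(rows) - n1
    n_vars = len(rows[0])
    out = []
    for j in range(n_vars):
        m1 = sum(r[j] for r in rows[:n1]) / n1
        m2 = sum(r[j] for r in rows[n1:]) / n2
        out.append(abs(m1 - m2))
    return out


def npc_fisher_test(x, y, n_perm=499, seed=0):
    """Hypothetical helper: global two-sample test via NPC with Fisher combining.

    x, y: lists of observations, each observation a list of variable values.
    Returns (global p-value, list of partial p-values for the observed data).
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n1 = len(x)
    n_vars = len(pooled[0])

    # Row 0 holds the observed statistics; the remaining rows hold the
    # statistics after randomly relabelling whole observations.
    stats = [partial_stats(pooled, n1)]
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        stats.append(partial_stats(shuffled, n1))

    total = n_perm + 1
    combined = [0.0] * total
    observed_partial = []
    for j in range(n_vars):
        column = sorted(row[j] for row in stats)
        for b in range(total):
            # Share of rows whose statistic is at least as large as row b's.
            # Each row counts itself, so the p-value is never zero.
            p_bj = (total - bisect_left(column, stats[b][j])) / total
            # Fisher's combining function: -2 * sum of log partial p-values.
            combined[b] += -2.0 * math.log(p_bj)
            if b == 0:
                observed_partial.append(p_bj)

    # Large combined values speak against the global null hypothesis.
    global_p = sum(c >= combined[0] for c in combined) / total
    return global_p, observed_partial
```

A call such as npc_fisher_test(group_a, group_b, n_perm=999) returns the global p-value together with the per-variable partial p-values. Other combining functions can be swapped in by replacing the -2.0 * math.log(p_bj) accumulation, for example by tracking the minimum partial p-value per row and ranking rows by how small it is.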

Suggested Citation

  • Patrick B. Langthaler & Riccardo Ceccato & Luigi Salmaso & Rosa Arboretti & Arne C. Bathke, 2023. "Permutation testing for thick data when the number of variables is much greater than the sample size: recent developments and some recommendations," Computational Statistics, Springer, vol. 38(1), pages 101-132, March.
  • Handle: RePEc:spr:compst:v:38:y:2023:i:1:d:10.1007_s00180-022-01218-3
    DOI: 10.1007/s00180-022-01218-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-022-01218-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    References listed on IDEAS

    1. François Baccelli & Armand M. Makowski, 1989. "Multidimensional Stochastic Ordering and Associated Random Variables," Operations Research, INFORMS, vol. 37(3), pages 478-487, June.
    2. Bathke, Arne C. & Harrar, Solomon W. & Madden, Laurence V., 2008. "How to compare small multivariate samples using nonparametric tests," Computational Statistics & Data Analysis, Elsevier, vol. 52(11), pages 4951-4965, July.
    3. N A Heard & P Rubin-Delanchy, 2018. "Choosing between methods of combining p-values," Biometrika, Biometrika Trust, vol. 105(1), pages 239-246.
    4. Burchett, Woodrow W. & Ellis, Amanda R. & Harrar, Solomon W. & Bathke, Arne C., 2017. "Nonparametric Inference for Multivariate Data: The R Package npmv," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 76(i04).
