
Fast surrogates of U-statistics

Authors

  • Lin, N.
  • Xi, R.

Abstract

U-statistics have long been known as a class of nonparametric estimators with good theoretical properties such as unbiasedness and asymptotic normality. However, their use in modern statistical analysis is limited by their high computational complexity, especially now that massive data sets are increasingly common. In this paper, using the "divide-and-conquer" technique, we develop two surrogates of U-statistics, aggregated U-statistics and average aggregated U-statistics, both of which are shown to be asymptotically equivalent to U-statistics and computationally much more efficient. When the raw data set is divided into K subsets, the two proposed estimators reduce the computational complexity from O(N^m) to O(K(N/K)^m), which yields a significant reduction in computing time as long as K = o(N) and m >= 2. The merit of the two proposed statistics is demonstrated by both simulation studies and real data examples.
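
The divide-and-conquer idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the strided split into blocks, and the block-size weighting used to combine the per-block values are all assumptions made for illustration, and the "average aggregated" variant is not shown. The kernel is assumed to be symmetric. Since K(N/K)^m = N^m / K^(m-1), splitting into K blocks cuts the number of kernel evaluations by roughly a factor of K^(m-1).

```python
import statistics
from itertools import combinations


def u_statistic(data, kernel, m):
    """Complete U-statistic: mean of `kernel` over all size-m subsets of `data`.

    Enumerates C(N, m) subsets, so the cost grows like O(N^m).
    Assumes `kernel` is symmetric in its arguments.
    """
    return statistics.fmean(kernel(*subset) for subset in combinations(data, m))


def aggregated_u_statistic(data, kernel, m, k):
    """Divide-and-conquer surrogate: one U-statistic per block, then combine.

    Assumption (not taken from the abstract): blocks are formed by striding
    through the data and combined with weights proportional to block size.
    """
    blocks = [data[i::k] for i in range(k)]
    if any(len(block) < m for block in blocks):
        raise ValueError("every block needs at least m observations")
    total = len(data)
    return sum(len(block) * u_statistic(block, kernel, m) for block in blocks) / total


def variance_kernel(x, y):
    """Degree-2 kernel whose U-statistic is the unbiased sample variance."""
    return 0.5 * (x - y) ** 2


# Usage: compare the full statistic with its surrogate on synthetic data.
sample = [float((7 * i) % 23) for i in range(200)]
full = u_statistic(sample, variance_kernel, 2)
surrogate = aggregated_u_statistic(sample, variance_kernel, 2, k=10)
print(f"full={full:.4f} surrogate={surrogate:.4f}")
```

With N = 200, m = 2 and K = 10, the full statistic evaluates the kernel on 19,900 pairs, while the surrogate evaluates it on 10 blocks of 190 pairs each, 1,900 in total.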

Suggested Citation

  • Lin, N. & Xi, R., 2010. "Fast surrogates of U-statistics," Computational Statistics & Data Analysis, Elsevier, vol. 54(1), pages 16-24, January.
  • Handle: RePEc:eee:csdana:v:54:y:2010:i:1:p:16-24

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(09)00280-1
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Haataja, Riina & Larocque, Denis & Nevalainen, Jaakko & Oja, Hannu, 2009. "A weighted multivariate signed-rank test for cluster-correlated data," Journal of Multivariate Analysis, Elsevier, vol. 100(6), pages 1107-1119, July.
    2. Shen, Gang, 2008. "Asymptotics of Oja Median Estimate," Statistics & Probability Letters, Elsevier, vol. 78(14), pages 2137-2141, October.
    3. Marc Hallin & Thomas S. Ferguson & Christian Genest, 2000. "Kendall's tau for serial dependence," ULB Institutional Repository 2013/2093, ULB -- Universite Libre de Bruxelles.
    4. Oja, Hannu, 1983. "Descriptive statistics for multivariate distributions," Statistics & Probability Letters, Elsevier, vol. 1(6), pages 327-332, October.

    Citations

    Cited by:

    1. Dimitris N Politis, 2024. "Scalable subsampling: computation, aggregation and inference," Biometrika, Biometrika Trust, vol. 111(1), pages 347-354.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shen, Gang, 2009. "Asymptotics of a Theil-type estimate in multiple linear regression," Statistics & Probability Letters, Elsevier, vol. 79(8), pages 1053-1064, April.
    2. Eliana Christou, 2020. "Robust dimension reduction using sliced inverse median regression," Statistical Papers, Springer, vol. 61(5), pages 1799-1818, October.
    3. G. Zioutas & C. Chatzinakos & T. D. Nguyen & L. Pitsoulis, 2017. "Optimization techniques for multivariate least trimmed absolute deviation estimation," Journal of Combinatorial Optimization, Springer, vol. 34(3), pages 781-797, October.
    4. Hwang, Jinsoo & Jorn, Hongsuk & Kim, Jeankyung, 2004. "On the performance of bivariate robust location estimators under contamination," Computational Statistics & Data Analysis, Elsevier, vol. 44(4), pages 587-601, January.
    5. Masato Okamoto, 2009. "Decomposition of gini and multivariate gini indices," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 7(2), pages 153-177, June.
    6. Averous, Jean & Meste, Michel, 1997. "Median Balls: An Extension of the Interquantile Intervals to Multivariate Distributions," Journal of Multivariate Analysis, Elsevier, vol. 63(2), pages 222-241, November.
    7. Fantazzini, Dean, 2011. "Analysis of multidimensional probability distributions with copula functions," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 22(2), pages 98-134.
    8. Eisenberg, Bennett, 2015. "The multivariate Gini ratio," Statistics & Probability Letters, Elsevier, vol. 96(C), pages 292-298.
    9. Bauwens, Luc & Veredas, David, 2004. "The stochastic conditional duration model: a latent variable model for the analysis of financial durations," Journal of Econometrics, Elsevier, vol. 119(2), pages 381-412, April.
    10. Kwiecien, Robert & Gather, Ursula, 2007. "Jensen's inequality for the Tukey median," Technical Reports 2007,07, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
    11. Rainer Dyckerhoff & Christophe Ley & Davy Paindaveine, 2014. "Depth-Based Runs Tests for bivariate Central Symmetry," Working Papers ECARES ECARES 2014-03, ULB -- Universite Libre de Bruxelles.
    12. repec:spo:wpmain:info:hdl:2441/3qnaslliat80pbqa8t90240unj is not listed on IDEAS
    13. Victor Chernozhukov & Alfred Galichon & Marc Hallin & Marc Henry, 2014. "Monge-Kantorovich Depth, Quantiles, Ranks, and Signs," Papers 1412.8434, arXiv.org, revised Sep 2015.
    14. Belzunce, Félix & Ruiz, José M. & Suárez-Llorens, Alfonso, 2008. "On multivariate dispersion orderings based on the standard construction," Statistics & Probability Letters, Elsevier, vol. 78(3), pages 271-281, February.
    15. Nadja Klein & Thomas Kneib, 2020. "Directional bivariate quantiles: a robust approach based on the cumulative distribution function," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 104(2), pages 225-260, June.
    16. Nadar, M. & Hettmansperger, T. P. & Oja, H., 2003. "The asymptotic covariance matrix of the Oja median," Statistics & Probability Letters, Elsevier, vol. 64(4), pages 431-442, October.
    17. Zuo, Yijun, 2013. "Multidimensional medians and uniqueness," Computational Statistics & Data Analysis, Elsevier, vol. 66(C), pages 82-88.
    18. Sakineh Dehghan & Mohammad Reza Faridrohani, 2019. "Affine invariant depth-based tests for the multivariate one-sample location problem," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(3), pages 671-693, September.
    19. Mangold, Benedikt, 2014. "Plausible prior estimation," FAU Discussion Papers in Economics 09/2014, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    20. Ollila, Esa & Oja, Hannu & Croux, Christophe, 2003. "The affine equivariant sign covariance matrix: asymptotic behavior and efficiencies," Journal of Multivariate Analysis, Elsevier, vol. 87(2), pages 328-355, November.
    21. Manuela Moretti & Roberto Guercio, 2024. "Probabilistic Analysis of Extreme Water Demand Peak Factors for Sustainable Resource Management," Sustainability, MDPI, vol. 16(24), pages 1-13, December.
