
Statistical error bounds for weighted mean and median, with application to robust aggregation of cryptocurrency data

Author

Listed:
  • Michaël Allouche

    (Kaiko [Paris])

  • Mnacho Echenim

    (LIG - Laboratoire d'Informatique de Grenoble, CAPP - Calculs algorithmes programmes et preuves - CNRS - Centre National de la Recherche Scientifique - UGA - Université Grenoble Alpes - Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology)

  • Emmanuel Gobet

    (CMAP - Centre de Mathématiques Appliquées de l'Ecole polytechnique - X - École polytechnique - IP Paris - Institut Polytechnique de Paris - CNRS - Centre National de la Recherche Scientifique)

  • Anne-Claire Maurice

    (Kaiko [Paris])

Abstract

We study price aggregation methodologies applied to cryptocurrency prices with quotations fragmented across different platforms. An intrinsic difficulty is that price returns and volumes are heavy-tailed, with many outliers, which makes averaging and aggregation challenging. While conventional methods rely on Volume-Weighted Average Prices (VWAPs) or Volume-Weighted Median prices (VWMs), we develop a new Robust Weighted Median (RWM) estimator that is robust to both price and volume outliers. Our study is based on new probabilistic concentration inequalities for weighted means and weighted quantiles under different tail assumptions (heavy tails, sub-gamma tails, sub-Gaussian tails). These inequalities show that fluctuations of VWAP and VWM are statistically important given the heavy-tailed properties of volumes and/or prices. We show that our RWM estimator overcomes this problem and also satisfies all the desirable properties of a price aggregator. We illustrate the behavior of RWM on synthetic data (within a parametric model close to real data): our estimator achieves a statistical accuracy twice as good as that of its competitors, and also allows realized volatilities to be recovered very accurately. Tests on real data are also performed and confirm the good behavior of the estimator on various use cases.
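As a rough illustration of the two conventional aggregators named in the abstract, the sketch below computes a VWAP and a volume-weighted median on a handful of made-up trades. It does not implement the paper's RWM estimator, whose definition is not given on this page. The function names `vwap` and `weighted_median`, the convention of returning the lowest price at which cumulative volume reaches half of the total, and all sample numbers are assumptions made for this example.

```python
from typing import Sequence


def vwap(prices: Sequence[float], volumes: Sequence[float]) -> float:
    """Volume-weighted average price: sum(p * v) / sum(v)."""
    total = sum(volumes)
    if total <= 0:
        raise ValueError("total volume must be positive")
    return sum(p * v for p, v in zip(prices, volumes)) / total


def weighted_median(prices: Sequence[float], volumes: Sequence[float]) -> float:
    """Volume-weighted median (assumed convention): the lowest price at
    which the cumulative volume reaches at least half of the total."""
    total = sum(volumes)
    if total <= 0:
        raise ValueError("total volume must be positive")
    cumulative = 0.0
    for price, volume in sorted(zip(prices, volumes)):
        cumulative += volume
        if cumulative >= total / 2:
            return price
    raise ValueError("no trades supplied")


# Hypothetical trades: four ordinary prints and one outlying price.
prices = [100.0, 100.5, 101.0, 99.5, 150.0]
small_outlier = [10, 12, 8, 9, 1]    # outlying price, small volume
large_outlier = [10, 12, 8, 9, 100]  # outlying price, very large volume

print(vwap(prices, small_outlier))             # pulled upward by the outlier
print(weighted_median(prices, small_outlier))  # stays with the bulk of trades
print(weighted_median(prices, large_outlier))  # follows the outlying trade
```

On these made-up numbers the weighted median resists the outlying price when its volume is small, but follows it once the same trade also carries an outlying volume. That is the kind of joint price and volume sensitivity the abstract says RWM is designed to address.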

Suggested Citation

  • Michaël Allouche & Mnacho Echenim & Emmanuel Gobet & Anne-Claire Maurice, 2023. "Statistical error bounds for weighted mean and median, with application to robust aggregation of cryptocurrency data," Working Papers hal-04017151, HAL.
  • Handle: RePEc:hal:wpaper:hal-04017151
    Note: View the original document on HAL open archive server: https://hal.science/hal-04017151

    Download full text from publisher

    File URL: https://hal.science/hal-04017151/document
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gijbels, Irène & Sznajder, Dominik, 2013. "Testing tail monotonicity by constrained copula estimation," Insurance: Mathematics and Economics, Elsevier, vol. 52(2), pages 338-351.
    2. Ziqiang Xing & Denghua Yan & Cheng Zhang & Gang Wang & Dongdong Zhang, 2015. "Spatial Characterization and Bivariate Frequency Analysis of Precipitation and Runoff in the Upper Huai River Basin, China," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 29(9), pages 3291-3304, July.
    3. Mohamad Haytham Klaho & Hamid R. Safavi & Mohammad H. Golmohammadi & Maamoun Alkntar, 2022. "Comparison between bivariate and trivariate flood frequency analysis using the Archimedean copula functions, a case study of the Karun River in Iran," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 112(2), pages 1589-1610, June.
    4. César Garcia-Gomez & Ana Pérez & Mercedes Prieto-Alaiz, 2022. "The evolution of poverty in the EU-28: a further look based on multivariate tail dependence," Working Papers 605, ECINEQ, Society for the Study of Economic Inequality.
    5. Zhou, You & Lin, Lichao & Huang, Ziling, 2024. "Diversification value of green Bonds: Fresh evidence from China," The North American Journal of Economics and Finance, Elsevier, vol. 74(C).
    6. Tjøstheim, Dag & Hufthammer, Karl Ove, 2013. "Local Gaussian correlation: A new measure of dependence," Journal of Econometrics, Elsevier, vol. 172(1), pages 33-48.
    7. Raza, Hamid & Wu, Weiou, 2018. "Quantile dependence between the stock, bond and foreign exchange markets – Evidence from the UK," The Quarterly Review of Economics and Finance, Elsevier, vol. 69(C), pages 286-296.
    8. Víctor Adame-García & Fernando Fernández-Rodríguez & Simón Sosvilla-Rivero, 2017. "Resolution of optimization problems and construction of efficient portfolios: An application to the Euro Stoxx 50 index," IREA Working Papers 201702, University of Barcelona, Research Institute of Applied Economics, revised Feb 2017.
    9. Lamneithem Hangshing & Parmendra P. Dabral, 2018. "Multivariate Frequency Analysis of Meteorological Drought Using Copula," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 32(5), pages 1741-1758, March.
    10. Mendes, Beatriz Vaz de Melo & Arslan, Olcay, 2006. "Multivariate Skew Distributions Based on the GT-Copula," Brazilian Review of Econometrics, Sociedade Brasileira de Econometria - SBE, vol. 26(2), November.
    11. Cerrato, Mario & Crosby, John & Kim, Minjoo & Zhao, Yang, 2015. "US Monetary and Fiscal Policies - Conflict or Cooperation?," SIRE Discussion Papers 2015-78, Scottish Institute for Research in Economics (SIRE).
    12. Yuri Salazar & Wing Ng, 2015. "Nonparametric estimation of general multivariate tail dependence and applications to financial time series," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 24(1), pages 121-158, March.
    13. DiTraglia, Francis J. & Gerlach, Jeffrey R., 2013. "Portfolio selection: An extreme value approach," Journal of Banking & Finance, Elsevier, vol. 37(2), pages 305-323.
    14. Dominique Guégan & Matteo Iacopini, 2018. "Nonparameteric forecasting of multivariate probability density functions," Documents de travail du Centre d'Economie de la Sorbonne 18012, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    15. Pavel Krupskii & Harry Joe, 2015. "Tail-weighted measures of dependence," Journal of Applied Statistics, Taylor & Francis Journals, vol. 42(3), pages 614-629, March.
    16. Helena Ferreira & Marta Ferreira, 2021. "Tail dependence and smoothness of time series," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(1), pages 198-210, March.
    17. Mike So & Alex Tse, 2009. "Dynamic Modeling of Tail Risk: Applications to China, Hong Kong and Other Asian Markets," Asia-Pacific Financial Markets, Springer;Japanese Association of Financial Economics and Engineering, vol. 16(3), pages 183-210, September.
    18. Fischer, Matthias J. & Dörflinger, Marco, 2006. "A note on a non-parametric tail dependence estimator," Discussion Papers 76/2006, Friedrich-Alexander University Erlangen-Nuremberg, Chair of Statistics and Econometrics.
    19. Cerrato, Mario & Crosby, John & Kim, Minjoo & Zhao, Yang, 2014. "Modeling Dependence Structure and Forecasting Portfolio Value-at-Risk with Dynamic Copulas," SIRE Discussion Papers 2015-25, Scottish Institute for Research in Economics (SIRE).
    20. Sun, Mingyang & Cremer, Jochen & Strbac, Goran, 2018. "A novel data-driven scenario generation framework for transmission expansion planning with high renewable energy penetration," Applied Energy, Elsevier, vol. 228(C), pages 546-555.

    More about this item

    Keywords

    robust aggregation; weighted mean and quantile estimation; heavy tails; concentration inequalities; outliers;

