
Concentration bounds for the empirical angular measure with statistical learning applications

Authors
  • Clémençon, Stéphan
  • Jalalzai, Hamid
  • Lhaut, Stéphane

    (Université catholique de Louvain, LIDAM/ISBA, Belgium)

  • Sabourin, Anne
  • Segers, Johan

    (Université catholique de Louvain, LIDAM/ISBA, Belgium)

Abstract

The angular measure on the unit sphere characterizes the first-order dependence structure of the components of a random vector in extreme regions and is defined in terms of standardized margins. Its statistical recovery is an important step in learning problems involving observations far away from the center. In the common situation that the components of the vector have different distributions, the rank transformation offers a convenient and robust way of standardizing data in order to build an empirical version of the angular measure based on the most extreme observations. However, the study of the sampling distribution of the resulting empirical angular measure is challenging. The purpose of the paper is to establish finite-sample bounds for the maximal deviations between the empirical and true angular measures, uniformly over classes of Borel sets of controlled combinatorial complexity. The bounds are valid with high probability and, up to logarithmic factors, scale as the inverse of the square root of the effective sample size. The bounds are applied to provide performance guarantees for two statistical learning procedures tailored to extreme regions of the input space and built upon the empirical angular measure: binary classification in extreme regions through empirical risk minimization and unsupervised anomaly detection through minimum-volume sets of the sphere.
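The rank-based construction described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not stated on this page: the rank transform is taken as n / (n + 1 - rank), the radius is measured in the sup-norm, and the k-th threshold is n / k. The function names `rank_standardize` and `empirical_angular_measure` are hypothetical and are not taken from the paper.

```python
import random


def rank_standardize(sample):
    """Rank-transform each coordinate to an approximately unit-Pareto scale.

    Assumed transform (not taken from the source): v = n / (n + 1 - rank),
    where rank is the 1-based rank of the value within its own coordinate.
    """
    n = len(sample)
    d = len(sample[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        # Indices of the observations sorted by their j-th coordinate.
        order = sorted(range(n), key=lambda i: sample[i][j])
        for rank0, i in enumerate(order):
            out[i][j] = n / (n + 1 - (rank0 + 1))
    return out


def empirical_angular_measure(sample, k, in_set):
    """Estimate the angular measure of the set described by `in_set`.

    Counts the observations whose standardized sup-norm radius is at least
    n / k and whose angle (point divided by its radius) satisfies `in_set`,
    then divides the count by k, the effective sample size.
    """
    n = len(sample)
    threshold = n / k
    count = 0
    for v in rank_standardize(sample):
        radius = max(v)  # sup-norm; all standardized values are positive
        if radius >= threshold:
            angle = [c / radius for c in v]
            if in_set(angle):
                count += 1
    return count / k


# Usage: mass given to angles where the first coordinate dominates.
rng = random.Random(0)
data = [[rng.expovariate(1.0), rng.gauss(0.0, 1.0)] for _ in range(1000)]
mass = empirical_angular_measure(data, k=50, in_set=lambda w: w[0] >= w[1])
```

Because only ranks enter the computation, the two coordinates in the usage example can follow different distributions without any marginal modelling, which is the robustness property the abstract attributes to the rank transformation.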

Suggested Citation

  • Clémençon, Stéphan & Jalalzai, Hamid & Lhaut, Stéphane & Sabourin, Anne & Segers, Johan, 2023. "Concentration bounds for the empirical angular measure with statistical learning applications," LIDAM Reprints ISBA 2023020, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
  • Handle: RePEc:aiz:louvar:2023020
    DOI: https://doi.org/10.3150/22-BEJ1562
    Note: In: Bernoulli, 2023, vol. 29(4), p. 2797-2827

