
Discrimination between Gaussian process models: active learning and static constructions

Author

Listed:
  • Elham Yousefi

    (Johannes Kepler University)

  • Luc Pronzato

    (Laboratoire I3S - UMR 7271)

  • Markus Hainy

    (Johannes Kepler University)

  • Werner G. Müller

    (Johannes Kepler University)

  • Henry P. Wynn

    (London School of Economics)

Abstract

The paper covers the design and analysis of experiments to discriminate between two Gaussian process models with different covariance kernels, such as those widely used in computer experiments, kriging, sensor location and machine learning. Two frameworks are considered. First, we study sequential constructions, where successive design (observation) points are selected, either as additional points to an existing design or from the beginning of observation. The selection relies on the maximisation of the difference between the symmetric Kullback-Leibler divergences for the two models, which depends on the observations, or on the mean squared error of both models, which does not. Then, we consider static criteria, such as the familiar log-likelihood ratios and the Fréchet distance between the covariance functions of the two models. Other distance-based criteria, simpler to compute than the previous ones, are also introduced; for these, within the framework of approximate design, a necessary condition for the optimality of a design measure is provided. The paper also studies the mathematical links between the different criteria and provides numerical illustrations.
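As a rough illustration of the kind of quantities the abstract mentions, the sketch below evaluates a symmetric Kullback-Leibler divergence and a squared Fréchet distance between two zero-mean Gaussian vectors whose covariance matrices come from two different kernels evaluated at the same design points. It is not the paper's construction: the kernel choices (Matérn 3/2 and squared exponential), the length scale, the small nugget, the example designs and all function names are assumptions made only for this example. The symmetric divergence is taken here as the sum of the two directed divergences, and the Fréchet distance uses the closed form for Gaussian distributions given by Dowson and Landau (1982).

```python
import numpy as np


def pairwise_dist(x):
    """Euclidean distances between all pairs of design points."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def matern32(x, length_scale=0.3, nugget=1e-8):
    """Matern 3/2 covariance matrix (assumed kernel, unit variance)."""
    r = np.sqrt(3.0) * pairwise_dist(x) / length_scale
    return (1.0 + r) * np.exp(-r) + nugget * np.eye(len(r))


def squared_exponential(x, length_scale=0.3, nugget=1e-8):
    """Squared-exponential covariance matrix (assumed kernel, unit variance)."""
    r = pairwise_dist(x) / length_scale
    return np.exp(-0.5 * r ** 2) + nugget * np.eye(len(r))


def symmetric_kl(k0, k1):
    """KL(N(0,k0) || N(0,k1)) + KL(N(0,k1) || N(0,k0)).

    The log-determinant terms cancel in the sum, leaving only traces.
    """
    n = k0.shape[0]
    t01 = np.trace(np.linalg.solve(k1, k0))
    t10 = np.trace(np.linalg.solve(k0, k1))
    return 0.5 * (t01 + t10) - n


def frechet_distance_sq(k0, k1):
    """Squared Frechet distance between N(0,k0) and N(0,k1):
    tr(k0) + tr(k1) - 2 tr((k0 k1)^{1/2})."""
    w, v = np.linalg.eigh(k0)
    root = (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T  # symmetric square root of k0
    # k0 k1 has the same eigenvalues as root @ k1 @ root, which is symmetric.
    eig = np.linalg.eigvalsh(root @ k1 @ root)
    cross = np.sqrt(np.clip(eig, 0.0, None)).sum()
    return np.trace(k0) + np.trace(k1) - 2.0 * cross


# Two hypothetical 5-point designs on [0, 1]: evenly spread versus clustered.
spread = np.linspace(0.0, 1.0, 5)
clustered = np.array([0.0, 0.05, 0.1, 0.15, 0.2])

for name, design in [("spread", spread), ("clustered", clustered)]:
    k0, k1 = matern32(design), squared_exponential(design)
    print(name, symmetric_kl(k0, k1), frechet_distance_sq(k0, k1))
```

Comparing the printed values across candidate designs gives a crude sense of how a static, observation-free criterion could rank designs. The sequential criteria described in the abstract depend on predictive distributions and, in one variant, on the observations themselves, which this sketch does not attempt to reproduce.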

Suggested Citation

  • Elham Yousefi & Luc Pronzato & Markus Hainy & Werner G. Müller & Henry P. Wynn, 2023. "Discrimination between Gaussian process models: active learning and static constructions," Statistical Papers, Springer, vol. 64(4), pages 1275-1304, August.
  • Handle: RePEc:spr:stpapr:v:64:y:2023:i:4:d:10.1007_s00362-023-01436-x
    DOI: 10.1007/s00362-023-01436-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-023-01436-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rivas-López, M.J. & Yu, R.C. & López-Fidalgo, J. & Ruiz, G., 2017. "Optimal experimental design on the loading frequency for a probabilistic fatigue model for plain and fibre-reinforced concrete," Computational Statistics & Data Analysis, Elsevier, vol. 113(C), pages 363-374.
    2. S. G. J. Senarathne & C. C. Drovandi & J. M. McGree, 2020. "Bayesian sequential design for Copula models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(2), pages 454-478, June.
    3. Dette, Holger & Titoff, Stefanie, 2008. "Optimal discrimination designs," Technical Reports 2008,06, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
    4. Knott, Martin & Smith, Cyril, 2006. "Choosing joint distributions so that the variance of the sum is small," Journal of Multivariate Analysis, Elsevier, vol. 97(8), pages 1757-1765, September.
    5. Rippl, Thomas & Munk, Axel & Sturm, Anja, 2016. "Limit laws of the empirical Wasserstein distance: Gaussian distributions," Journal of Multivariate Analysis, Elsevier, vol. 151(C), pages 90-109.
    6. Santiago Campos-Barreiro & Jesús López-Fidalgo, 2015. "D-optimal experimental designs for a growth model applied to a Holstein-Friesian dairy farm," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 24(3), pages 491-505, September.
    7. Woods, David C. & McGree, James M. & Lewis, Susan M., 2017. "Model selection via Bayesian information capacity designs for generalised linear models," Computational Statistics & Data Analysis, Elsevier, vol. 113(C), pages 226-238.
    8. Kira Alhorn & Holger Dette & Kirsten Schorning, 2021. "Optimal Designs for Model Averaging in non-nested Models," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 83(2), pages 745-778, August.
    9. Víctor Casero-Alonso & Andrey Pepelyshev & Weng K. Wong, 2018. "A web-based tool for designing experimental studies to detect hormesis and estimate the threshold dose," Statistical Papers, Springer, vol. 59(4), pages 1307-1324, December.
    10. Zhongzhi Lawrence He, 2018. "Comparing Asset Pricing Models: Distance-based Metrics and Bayesian Interpretations," Papers 1803.01389, arXiv.org.
    11. Jun Yu & HaiYing Wang, 2022. "Subdata selection algorithm for linear model discrimination," Statistical Papers, Springer, vol. 63(6), pages 1883-1906, December.
    12. David Mogalle & Philipp Seufert & Jan Schwientek & Michael Bortz & Karl-Heinz Küfer, 2024. "Computing T-optimal designs via nested semi-infinite programming and twofold adaptive discretization," Computational Statistics, Springer, vol. 39(5), pages 2451-2478, July.
    13. Laura Deldossi & Silvia Angela Osmetti & Chiara Tommasi, 2019. "Optimal design to discriminate between rival copula models for a bivariate binary response," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(1), pages 147-165, March.
    14. Mordant, Gilles & Segers, Johan, 2022. "Measuring dependence between random vectors via optimal transport," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    15. Whiteley, Nick, 2021. "Dimension-free Wasserstein contraction of nonlinear filters," Stochastic Processes and their Applications, Elsevier, vol. 135(C), pages 31-50.
    16. Tommasi, C. & López-Fidalgo, J., 2010. "Bayesian optimum designs for discriminating between models with any distribution," Computational Statistics & Data Analysis, Elsevier, vol. 54(1), pages 143-150, January.
    17. Duarte, Belmiro P.M. & Wong, Weng Kee & Atkinson, Anthony C., 2015. "A Semi-Infinite Programming based algorithm for determining T-optimum designs for model discrimination," Journal of Multivariate Analysis, Elsevier, vol. 135(C), pages 11-24.
    18. Ledoit, Olivier & Wolf, Michael, 2021. "Shrinkage estimation of large covariance matrices: Keep it simple, statistician?," Journal of Multivariate Analysis, Elsevier, vol. 186(C).
    19. Zhong, Peng & Huser, Raphaël & Opitz, Thomas, 2024. "Exact Simulation of Max-Infinitely Divisible Processes," Econometrics and Statistics, Elsevier, vol. 30(C), pages 96-109.
    20. Zhang, Kefei & Yang, Xiaolin & Xu, Liang & Thé, Jesse & Tan, Zhongchao & Yu, Hesheng, 2024. "Enhancing coal-gangue object detection using GAN-based data augmentation strategy with dual attention mechanism," Energy, Elsevier, vol. 287(C).

    More about this item

    Keywords

    Model discrimination; Gaussian random field; Kriging;

    JEL classification:

    • C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General

