
Semi-automatic selection of summary statistics for ABC model choice

Author

Listed:
  • Prangle Dennis

    (Department of Mathematics and Statistics, Lancaster University, UK)

  • Fearnhead Paul

    (Department of Mathematics and Statistics, Lancaster University, UK)

  • Cox Murray P.

    (Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand; Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand)

  • Biggs Patrick J.

    (Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand)

  • French Nigel P.

    (Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand; Infectious Disease Research Centre, Institute of Veterinary, Animal and Biomedical Sciences, Massey University, Palmerston North, New Zealand)

Abstract

A central statistical goal is to choose between alternative explanatory models of data. In many modern applications, such as population genetics, standard methods based on evaluating the likelihood functions of the models cannot be applied, because these functions are numerically intractable. Approximate Bayesian computation (ABC) is a commonly used alternative in such situations. ABC simulates data x for many parameter values under each model and compares each simulated data set to the observed data xobs. More weight is placed on models under which S(x) is close to S(xobs), where S maps data to a vector of summary statistics. Previous work has shown that the choice of S is crucial to the efficiency and accuracy of ABC. This paper provides a method to select good summary statistics for model choice. It uses a preliminary step: simulating many x values from all models and fitting regressions to these simulations with the model as the response. The resulting model weight estimators are used as S in an ABC analysis. Theoretical results are given to justify this as approximating low-dimensional sufficient statistics. A substantive application is presented: choosing between competing coalescent models of demographic growth for Campylobacter jejuni in New Zealand using multi-locus sequence typing data.
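
The two-stage procedure outlined in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation and does not use their coalescent models. It assumes two hypothetical count-data models (Poisson versus geometric), hand-picked regression features, a plain gradient-descent logistic regression, and a simple rejection-ABC step. All function names, priors, and sample sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_OBS, N_PILOT, N_ABC, N_KEEP = 30, 2000, 5000, 200  # illustrative sizes

def simulate(model, rng):
    """Draw a parameter from a toy prior, then one data set x from the model."""
    if model == 0:  # assumed model 0: Poisson, mean ~ Exponential(1)
        return rng.poisson(rng.exponential(1.0), N_OBS)
    # assumed model 1: geometric on {0, 1, ...}, success prob ~ Uniform(0.05, 1)
    return rng.geometric(rng.uniform(0.05, 1.0), N_OBS) - 1

def features(x):
    """Hand-picked features of the data used as regression covariates."""
    return np.array([1.0, np.log1p(np.mean(x)), np.log1p(np.var(x)),
                     np.mean(x == 0)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def fit_logistic(F, y, step=0.1, n_iter=2000):
    """Logistic regression of the model indicator on features (gradient descent)."""
    beta = np.zeros(F.shape[1])
    for _ in range(n_iter):
        beta -= step * F.T @ (sigmoid(F @ beta) - y) / len(y)
    return beta

def summary(x, beta):
    """S(x): fitted probability of model 1, used as the summary statistic."""
    return sigmoid(features(x) @ beta)

# Preliminary step: simulate from all models, regress the model on features.
pilot_models = rng.integers(0, 2, N_PILOT)
F = np.array([features(simulate(m, rng)) for m in pilot_models])
y = pilot_models.astype(float)
beta = fit_logistic(F, y)

# Synthetic "observed" data for the demonstration (generated from model 1).
x_obs = rng.geometric(0.3, N_OBS) - 1
s_obs = summary(x_obs, beta)

# Rejection ABC: keep the simulations whose S(x) is closest to S(x_obs).
abc_models = rng.integers(0, 2, N_ABC)  # uniform prior over the two models
s_sim = np.array([summary(simulate(m, rng), beta) for m in abc_models])
keep = np.argsort(np.abs(s_sim - s_obs))[:N_KEEP]

# Estimated posterior model probabilities: model frequencies among accepted runs.
post = np.bincount(abc_models[keep], minlength=2) / N_KEEP
print("estimated posterior model probabilities:", post)
```

With two models the regression yields a one-dimensional S, so the ABC distance is just an absolute difference. With more than two models, a multinomial regression would give one fitted weight per model, and a vector distance would be needed.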

Suggested Citation

  • Prangle Dennis & Fearnhead Paul & Cox Murray P. & Biggs Patrick J. & French Nigel P., 2014. "Semi-automatic selection of summary statistics for ABC model choice," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 13(1), pages 67-82, February.
  • Handle: RePEc:bpj:sagmbi:v:13:y:2014:i:1:p:67-82:n:5
    DOI: 10.1515/sagmb-2013-0012

    Download full text from publisher

    File URL: https://doi.org/10.1515/sagmb-2013-0012
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.


    References listed on IDEAS

    1. C. C. Drovandi & A. N. Pettitt, 2011. "Estimation of Parameters for Macroparasite Population Evolution Using Approximate Bayesian Computation," Biometrics, The International Biometric Society, vol. 67(1), pages 225-233, March.
    2. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    3. Paul Fearnhead & Dennis Prangle, 2012. "Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 74(3), pages 419-474, June.
    4. Joyce Paul & Marjoram Paul, 2008. "Approximately Sufficient Statistics and Bayesian Computation," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(1), pages 1-18, August.
    5. repec:dau:papers:123456789/6334 is not listed on IDEAS
    6. Nunes Matthew A & Balding David J, 2010. "On Optimal Selection of Summary Statistics for Approximate Bayesian Computation," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-16, September.
    7. Blum, Michael G. B., 2010. "Approximate Bayesian Computation: A Nonparametric Perspective," Journal of the American Statistical Association, American Statistical Association, vol. 105(491), pages 1178-1187.

    Cited by:

    1. Pierre-Olivier Goffard & Patrick Laub, 2021. "Approximate Bayesian Computations to fit and compare insurance loss models," Post-Print hal-02891046, HAL.
    2. Pierre-Olivier Goffard & Patrick Laub, 2021. "Approximate Bayesian Computations to fit and compare insurance loss models," Working Papers hal-02891046, HAL.
    3. Goffard, Pierre-Olivier & Laub, Patrick J., 2021. "Approximate Bayesian Computations to fit and compare insurance loss models," Insurance: Mathematics and Economics, Elsevier, vol. 100(C), pages 350-371.
    4. Michael Stocks & Mathieu Siol & Martin Lascoux & Stéphane De Mita, 2014. "Amount of Information Needed for Model Choice in Approximate Bayesian Computation," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-13, June.
    5. Lee, Xing Ju & Hainy, Markus & McKeone, James P. & Drovandi, Christopher C. & Pettitt, Anthony N., 2018. "ABC model selection for spatial extremes models applied to South Australian maximum temperature data," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 128-144.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mikael Sunnåker & Alberto Giovanni Busetto & Elina Numminen & Jukka Corander & Matthieu Foll & Christophe Dessimoz, 2013. "Approximate Bayesian Computation," PLOS Computational Biology, Public Library of Science, vol. 9(1), pages 1-10, January.
    2. Creel, Michael & Kristensen, Dennis, 2016. "On selection of statistics for approximate Bayesian computing (or the method of simulated moments)," Computational Statistics & Data Analysis, Elsevier, vol. 100(C), pages 99-114.
    3. Nakagome Shigeki & Fukumizu Kenji & Mano Shuhei, 2013. "Kernel approximate Bayesian computation in population genetic inferences," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 12(6), pages 667-678, December.
    4. Soubeyrand, Samuel & Haon-Lasportes, Emilie, 2015. "Weak convergence of posteriors conditional on maximum pseudo-likelihood estimates and implications in ABC," Statistics & Probability Letters, Elsevier, vol. 107(C), pages 84-92.
    5. Silk Daniel & Filippi Sarah & Stumpf Michael P. H., 2013. "Optimizing threshold-schedules for sequential approximate Bayesian computation: applications to molecular systems," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 12(5), pages 603-618, October.
    6. D.T. Frazier & G.M. Martin & C.P. Robert & J. Rousseau, 2016. "Asymptotic Properties of Approximate Bayesian Computation," Monash Econometrics and Business Statistics Working Papers 18/16, Monash University, Department of Econometrics and Business Statistics.
    7. Li, J. & Nott, D.J. & Fan, Y. & Sisson, S.A., 2017. "Extending approximate Bayesian computation methods to high dimensions via a Gaussian copula model," Computational Statistics & Data Analysis, Elsevier, vol. 106(C), pages 77-89.
    8. Baey, Charlotte & Smith, Henrik G. & Rundlöf, Maj & Olsson, Ola & Clough, Yann & Sahlin, Ullrika, 2023. "Calibration of a bumble bee foraging model using Approximate Bayesian Computation," Ecological Modelling, Elsevier, vol. 477(C).
    9. Jonathan U Harrison & Ruth E Baker, 2020. "An automatic adaptive method to combine summary statistics in approximate Bayesian computation," PLOS ONE, Public Library of Science, vol. 15(8), pages 1-21, August.
    10. Frazier, David T. & Maneesoonthorn, Worapree & Martin, Gael M. & McCabe, Brendan P.M., 2019. "Approximate Bayesian forecasting," International Journal of Forecasting, Elsevier, vol. 35(2), pages 521-539.
    11. Buzbas, Erkan O. & Rosenberg, Noah A., 2015. "AABC: Approximate approximate Bayesian computation for inference in population-genetic models," Theoretical Population Biology, Elsevier, vol. 99(C), pages 31-42.
    12. Soubeyrand Samuel & Carpentier Florence & Guiton François & Klein Etienne K., 2013. "Approximate Bayesian computation with functional statistics," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 12(1), pages 17-37, March.
    13. Wilkinson Richard David, 2013. "Approximate Bayesian computation (ABC) gives exact results under the assumption of model error," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 12(2), pages 129-141, May.
    14. Lee, Xing Ju & Hainy, Markus & McKeone, James P. & Drovandi, Christopher C. & Pettitt, Anthony N., 2018. "ABC model selection for spatial extremes models applied to South Australian maximum temperature data," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 128-144.
    15. Michael Stocks & Mathieu Siol & Martin Lascoux & Stéphane De Mita, 2014. "Amount of Information Needed for Model Choice in Approximate Bayesian Computation," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-13, June.
    16. Xing Ju Lee & Christopher C. Drovandi & Anthony N. Pettitt, 2015. "Model choice problems using approximate Bayesian computation with applications to pathogen transmission data sets," Biometrics, The International Biometric Society, vol. 71(1), pages 198-207, March.
    17. Gael M. Martin & David T. Frazier & Christian P. Robert, 2020. "Computing Bayes: Bayesian Computation from 1763 to the 21st Century," Monash Econometrics and Business Statistics Working Papers 14/20, Monash University, Department of Econometrics and Business Statistics.
    18. Pierre-Olivier Goffard & Patrick Laub, 2021. "Approximate Bayesian Computations to fit and compare insurance loss models," Working Papers hal-02891046, HAL.
    19. Florian Maire & Nial Friel & Pierre ALQUIER, 2017. "Informed Sub-Sampling MCMC: Approximate Bayesian Inference for Large Datasets," Working Papers 2017-40, Center for Research in Economics and Statistics.
    20. Henri Pesonen & Umberto Simola & Alvaro Köhn‐Luque & Henri Vuollekoski & Xiaoran Lai & Arnoldo Frigessi & Samuel Kaski & David T. Frazier & Worapree Maneesoonthorn & Gael M. Martin & Jukka Corander, 2023. "ABC of the future," International Statistical Review, International Statistical Institute, vol. 91(2), pages 243-268, August.
