
Bayesian History Matching of Complex Infectious Disease Models Using Emulation: A Tutorial and a Case Study on HIV in Uganda

Author

Listed:
  • Ioannis Andrianakis
  • Ian R Vernon
  • Nicky McCreesh
  • Trevelyan J McKinley
  • Jeremy E Oakley
  • Rebecca N Nsubuga
  • Michael Goldstein
  • Richard G White

Abstract

Advances in scientific computing have allowed the development of complex models that are being routinely applied to problems in disease epidemiology, public health and decision making. The utility of these models depends in part on how well they can reproduce empirical data. However, fitting such models to real world data is greatly hindered both by large numbers of input and output parameters, and by long run times, such that many modelling studies lack a formal calibration methodology. We present a novel method that has the potential to improve the calibration of complex infectious disease models (hereafter called simulators). We present this in the form of a tutorial and a case study where we history match a dynamic, event-driven, individual-based stochastic HIV simulator, using extensive demographic, behavioural and epidemiological data available from Uganda. The tutorial describes history matching and emulation. History matching is an iterative procedure that reduces the simulator's input space by identifying and discarding areas that are unlikely to provide a good match to the empirical data. History matching relies on the computational efficiency of a Bayesian representation of the simulator, known as an emulator. Emulators mimic the simulator's behaviour, but are often several orders of magnitude faster to evaluate. In the case study, we use a 22 input simulator, fitting its 18 outputs simultaneously. After 9 iterations of history matching, a non-implausible region of the simulator input space was identified that was times smaller than the original input space. Simulator evaluations made within this region were found to have a 65% probability of fitting all 18 outputs. History matching and emulation are useful additions to the toolbox of infectious disease modellers. 
Further research is required to explicitly address the stochastic nature of the simulator as well as to account for correlations between outputs.

Author Summary: An increasing number of scientific disciplines, and biology in particular, rely on complex computational models. The utility of these models depends on how well they are fitted to empirical data. Fitting is achieved by searching for suitable values for the models' input parameters, in a process known as calibration. Modern computer models typically have a large number of input and output parameters, and long running times, a consequence of their increasing computational complexity. Both factors hinder the calibration process. In this work, we propose a method that can help the calibration of models with long running times and several inputs and outputs. We apply this method to an individual-based, dynamic and stochastic HIV model, using HIV data from Uganda. The final system has a 65% probability of selecting an input parameter set that fits all 18 model outputs.
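
The iterative procedure described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the one-input toy `simulator`, the crude nearest-neighbour `build_emulator` (a real application would typically use a Bayesian emulator such as a Gaussian process), the `slope_bound` uncertainty scale, and the implausibility cutoff of 3, which is a common convention in the history matching literature. At each wave, inputs whose emulated output lies too far from the observation, relative to the combined uncertainty, are discarded.

```python
import math

def simulator(x):
    """Hypothetical stand-in for an expensive simulator."""
    return x * x

def build_emulator(design, slope_bound=4.0):
    """Crude emulator (assumption): nearest-run mean, with variance
    growing with the distance to the nearest simulator run."""
    runs = [(x, simulator(x)) for x in design]
    def emulate(x):
        x0, y0 = min(runs, key=lambda run: abs(run[0] - x))
        return y0, (slope_bound * (x - x0)) ** 2
    return emulate

def implausibility(x, emulate, z, var_obs, var_disc):
    """Distance between observation z and the emulator mean, scaled by
    emulator, observation and model-discrepancy variances."""
    mean, var_em = emulate(x)
    return abs(z - mean) / math.sqrt(var_em + var_obs + var_disc)

def history_match(candidates, z, var_obs, var_disc,
                  waves=3, n_design=5, cutoff=3.0):
    """Discard implausible inputs over several waves, refitting the
    emulator on the surviving region each time."""
    kept = list(candidates)
    for _ in range(waves):
        step = max(1, len(kept) // n_design)
        design = kept[::step]  # evenly spread runs over what is left
        emulate = build_emulator(design)
        kept = [x for x in kept
                if implausibility(x, emulate, z, var_obs, var_disc) <= cutoff]
    return kept

# Candidate inputs on [0, 2]; the observation z = 0.25 corresponds to x = 0.5.
candidates = [i / 1000 for i in range(2001)]
non_implausible = history_match(candidates, z=0.25,
                                var_obs=0.01 ** 2, var_disc=0.0)
print(len(candidates), "->", len(non_implausible), "candidate inputs remain")
```

Note that inputs far from any simulator run are retained when the emulator is too uncertain to rule them out; later waves, with runs concentrated in the surviving region, are what allow them to be discarded.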

Suggested Citation

  • Ioannis Andrianakis & Ian R Vernon & Nicky McCreesh & Trevelyan J McKinley & Jeremy E Oakley & Rebecca N Nsubuga & Michael Goldstein & Richard G White, 2015. "Bayesian History Matching of Complex Infectious Disease Models Using Emulation: A Tutorial and a Case Study on HIV in Uganda," PLOS Computational Biology, Public Library of Science, vol. 11(1), pages 1-18, January.
  • Handle: RePEc:plo:pcbi00:1003968
    DOI: 10.1371/journal.pcbi.1003968

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003968
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1003968&type=printable
    Download Restriction: no


    Citations

    Cited by:

    1. Chaitanya Kaligotla & Jonathan Ozik & Nicholson Collier & Charles M. Macal & Kelly Boyd & Jennifer Makelarski & Elbert S. Huang & Stacy T. Lindau, 2020. "Model Exploration of an Information-Based Healthcare Intervention Using Parallelization and Active Learning," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 23(4), pages 1-1.
    2. Evan Baker & Peter Challenor & Matt Eames, 2021. "Future proofing a building design using history matching inspired level‐set techniques," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(2), pages 335-350, March.
    3. Christopher N Davis & T Deirdre Hollingsworth & Quentin Caudron & Michael A Irvine, 2020. "The use of mixture density networks in the emulation of complex epidemiological individual-based models," PLOS Computational Biology, Public Library of Science, vol. 16(3), pages 1-16, March.
    4. Sean L Wu & Héctor M Sánchez C. & John M Henry & Daniel T Citron & Qian Zhang & Kelly Compton & Biyonka Liang & Amit Verma & Derek A T Cummings & Arnaud Le Menach & Thomas W Scott & Anne L Wilson & St, 2020. "Vector bionomics and vectorial capacity as emergent properties of mosquito behaviors and ecology," PLOS Computational Biology, Public Library of Science, vol. 16(4), pages 1-32, April.
    5. Si Chen & Daniel Friedrich & Zhibin Yu & James Yu, 2019. "District Heating Network Demand Prediction Using a Physics-Based Energy Model with a Bayesian Approach for Parameter Calibration," Energies, MDPI, vol. 12(18), pages 1-19, September.
    6. Nicky McCreesh & Ioannis Andrianakis & Rebecca N Nsubuga & Mark Strong & Ian Vernon & Trevelyan J McKinley & Jeremy E Oakley & Michael Goldstein & Richard Hayes & Richard G White, 2018. "Choice of time horizon critical in estimating costs and effects of changes to HIV programmes," PLOS ONE, Public Library of Science, vol. 13(5), pages 1-10, May.
    7. Jackson Samuel E. & Vernon Ian & Liu Junli & Lindsey Keith, 2020. "Understanding hormonal crosstalk in Arabidopsis root development via emulation and history matching," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 19(2), pages 1-33, April.
    8. Josie McCulloch & Jiaqi Ge & Jonathan A. Ward & Alison Heppenstall & J. Gareth Polhill & Nick Malleson, 2022. "Calibrating Agent-Based Models Using Uncertainty Quantification Methods," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 25(2), pages 1-1.
    9. Gyanendra Pokharel & Rob Deardon, 2022. "Emulation‐based inference for spatial infectious disease transmission models incorporating event time uncertainty," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(1), pages 455-479, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. McKinley, Trevelyan J. & Ross, Joshua V. & Deardon, Rob & Cook, Alex R., 2014. "Simulation-based Bayesian inference for epidemic models," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 434-447.
    2. I. Andrianakis & I. Vernon & N. McCreesh & T. J. McKinley & J. E. Oakley & R. N. Nsubuga & M. Goldstein & R. G. White, 2017. "History matching of a complex epidemiological model of human immunodeficiency virus transmission by using variance emulation," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 66(4), pages 717-740, August.
    3. Mevin Hooten & Christopher Wikle & Michael Schwob, 2020. "Statistical Implementations of Agent‐Based Demographic Models," International Statistical Review, International Statistical Institute, vol. 88(2), pages 441-461, August.
    4. Nott, David J. & Marshall, Lucy & Fielding, Mark & Liong, Shie-Yui, 2014. "Mixtures of experts for understanding model discrepancy in dynamic computer models," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 491-505.
    5. Jackson Samuel E. & Vernon Ian & Liu Junli & Lindsey Keith, 2020. "Understanding hormonal crosstalk in Arabidopsis root development via emulation and history matching," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 19(2), pages 1-33, April.
    6. Garbuno-Inigo, A. & DiazDelaO, F.A. & Zuev, K.M., 2016. "Gaussian process hyper-parameter estimation using Parallel Asymptotically Independent Markov Sampling," Computational Statistics & Data Analysis, Elsevier, vol. 103(C), pages 367-383.
    7. Manfren, Massimiliano & Aste, Niccolò & Moshksar, Reza, 2013. "Calibration and uncertainty analysis for computer models – A meta-model based approach for integrated building energy simulation," Applied Energy, Elsevier, vol. 103(C), pages 627-641.
    8. Daniel W. Gladish & Daniel E. Pagendam & Luk J. M. Peeters & Petra M. Kuhnert & Jai Vaze, 2018. "Emulation Engines: Choice and Quantification of Uncertainty for Complex Hydrological Models," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(1), pages 39-62, March.
    9. K. Sham Bhat & David S. Mebane & Priyadarshi Mahapatra & Curtis B. Storlie, 2017. "Upscaling Uncertainty with Dynamic Discrepancy for a Multi-Scale Carbon Capture System," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(520), pages 1453-1467, October.
    10. Antony M. Overstall & David C. Woods, 2013. "A Strategy for Bayesian Inference for Computationally Expensive Models with Application to the Estimation of Stem Cell Properties," Biometrics, The International Biometric Society, vol. 69(2), pages 458-468, June.
    11. Gyanendra Pokharel & Rob Deardon, 2022. "Emulation‐based inference for spatial infectious disease transmission models incorporating event time uncertainty," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(1), pages 455-479, March.
    12. Curtis B. Storlie & William A. Lane & Emily M. Ryan & James R. Gattiker & David M. Higdon, 2015. "Calibration of Computational Models With Categorical Parameters and Correlated Outputs via Bayesian Smoothing Spline ANOVA," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(509), pages 68-82, March.
    13. Jakub Bijak & Jason D. Hilton & Eric Silverman & Viet Dung Cao, 2013. "Reforging the Wedding Ring," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 29(27), pages 729-766.
    14. Petropoulos, G. & Wooster, M.J. & Carlson, T.N. & Kennedy, M.C. & Scholze, M., 2009. "A global Bayesian sensitivity analysis of the 1d SimSphere soil–vegetation–atmospheric transfer (SVAT) model using Gaussian model emulation," Ecological Modelling, Elsevier, vol. 220(19), pages 2427-2440.
    15. Hemez, François M. & Atamturktur, Sezer, 2011. "The dangers of sparse sampling for the quantification of margin and uncertainty," Reliability Engineering and System Safety, Elsevier, vol. 96(9), pages 1220-1231.
    16. Drignei, Dorin, 2011. "A general statistical model for computer experiments with time series output," Reliability Engineering and System Safety, Elsevier, vol. 96(4), pages 460-467.
    17. Hwang, Youngdeok & Kim, Hang J. & Chang, Won & Yeo, Kyongmin & Kim, Yongku, 2019. "Bayesian pollution source identification via an inverse physics model," Computational Statistics & Data Analysis, Elsevier, vol. 134(C), pages 76-92.
    18. Yuan, Jun & Ng, Szu Hui, 2013. "A sequential approach for stochastic computer model calibration and prediction," Reliability Engineering and System Safety, Elsevier, vol. 111(C), pages 273-286.
    19. Henri Pesonen & Umberto Simola & Alvaro Köhn‐Luque & Henri Vuollekoski & Xiaoran Lai & Arnoldo Frigessi & Samuel Kaski & David T. Frazier & Worapree Maneesoonthorn & Gael M. Martin & Jukka Corander, 2023. "ABC of the future," International Statistical Review, International Statistical Institute, vol. 91(2), pages 243-268, August.
    20. Marc Kennedy & Clive Anderson & Anthony O'Hagan & Mark Lomas & Ian Woodward & John Paul Gosling & Andreas Heinemeyer, 2008. "Quantifying uncertainty in the biospheric carbon flux for England and Wales," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 171(1), pages 109-135, January.
