Printed from https://ideas.repec.org/a/wly/riskan/v40y2020i9p1693-1705.html

Application of Whole‐Genome Sequences and Machine Learning in Source Attribution of Salmonella Typhimurium

Authors

  • Nanna Munck
  • Patrick Murigu Kamau Njage
  • Pimlapas Leekitcharoenphon
  • Eva Litrup
  • Tine Hald

Abstract

Prevention of the emergence and spread of foodborne diseases is an important prerequisite for the improvement of public health. Source attribution models link sporadic human cases of a specific illness to food sources and animal reservoirs. With next‐generation sequencing technology, it is possible to develop novel source attribution models. We investigated the potential of machine learning to predict, based on whole‐genome sequencing, the animal reservoir from which a bacterial strain isolated from a human salmonellosis case originated. Machine learning methods recognize patterns in large and complex data sets and use this knowledge to build models. The model learns patterns associated with genetic variations in bacteria isolated from the different animal reservoirs. We selected different machine learning algorithms to predict sources of human salmonellosis cases and trained the models with Danish Salmonella Typhimurium isolates sampled from broilers (n = 34), cattle (n = 2), ducks (n = 11), layers (n = 4), and pigs (n = 159). Using cgMLST as input features, the models yielded an average accuracy of 0.783 (95% CI: 0.77–0.80) in the source prediction for the random forest and 0.933 (95% CI: 0.92–0.94) for the logit boost algorithm. The logit boost algorithm was the most accurate (valid accuracy: 92%, CI: 0.8706–0.9579) and predicted the origin of 81% of the domestic sporadic human salmonellosis cases. The most important source was Danish produced pigs (53%) followed by imported pigs (16%), imported broilers (6%), imported ducks (2%), Danish produced layers (2%), Danish produced cattle and imported cattle (
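
The abstract reports two kinds of numbers: a classification accuracy with a confidence interval, and the share of human cases attributed to each source. The Python sketch below illustrates how such quantities could be computed from a classifier's output. It is not the authors' method. The Wilson score interval, the 0.5 probability threshold, the function names, and the example probabilities are all assumptions made for illustration; the abstract does not state which interval method or assignment rule was used.

```python
import math
from collections import Counter


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    Assumption: the abstract does not say how its intervals were computed;
    Wilson is used here only as a common, well-behaved choice.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def attribute_cases(case_probs, threshold=0.5):
    """Return the share of cases assigned to each source.

    case_probs: one dict per human case, mapping source name to the
    predicted probability for that source.
    threshold: hypothetical cut-off; a case whose best probability falls
    below it is counted as "unattributed".
    """
    counts = Counter()
    for probs in case_probs:
        source, best = max(probs.items(), key=lambda kv: kv[1])
        counts[source if best >= threshold else "unattributed"] += 1
    total = sum(counts.values())
    return {source: n / total for source, n in counts.items()}


# Invented example: 92 of 100 validation isolates classified correctly.
low, high = wilson_interval(92, 100)
print(f"accuracy 0.92, approx. 95% CI: {low:.3f} to {high:.3f}")

# Invented predicted probabilities for four human cases.
cases = [
    {"pigs": 0.90, "broilers": 0.10},
    {"pigs": 0.40, "broilers": 0.35, "ducks": 0.25},
    {"broilers": 0.80, "pigs": 0.20},
    {"pigs": 0.70, "ducks": 0.30},
]
shares = attribute_cases(cases)
print(shares)
```

Counting low-confidence cases as unattributed is one way a model could end up assigning a source to only part of the human cases, as with the 81% reported in the abstract. Whether the study used a rule of this kind is not stated there.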

Suggested Citation

  • Nanna Munck & Patrick Murigu Kamau Njage & Pimlapas Leekitcharoenphon & Eva Litrup & Tine Hald, 2020. "Application of Whole‐Genome Sequences and Machine Learning in Source Attribution of Salmonella Typhimurium," Risk Analysis, John Wiley & Sons, vol. 40(9), pages 1693-1705, September.
  • Handle: RePEc:wly:riskan:v:40:y:2020:i:9:p:1693-1705
    DOI: 10.1111/risa.13510

    Download full text from publisher

    File URL: https://doi.org/10.1111/risa.13510
    Download Restriction: no

    File URL: https://libkey.io/10.1111/risa.13510?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Kursa, Miron B. & Rudnicki, Witold R., 2010. "Feature Selection with the Boruta Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 36(i11).
    2. Kuhn, Max, 2008. "Building Predictive Models in R Using the caret Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 28(i05).
    3. Petra Mullner & Geoff Jones & Alasdair Noble & Simon E. F. Spencer & Steve Hathaway & Nigel Peter French, 2009. "Source Attribution of Food‐Borne Zoonoses in New Zealand: A Modified Hald Model," Risk Analysis, John Wiley & Sons, vol. 29(7), pages 970-984, July.
    4. Michael McClelland & Kenneth E. Sanderson & John Spieth & Sandra W. Clifton & Phil Latreille & Laura Courtney & Steffen Porwollik & Johar Ali & Mike Dante & Feiyu Du & Shunfang Hou & Dan Layman & Shaw, 2001. "Complete genome sequence of Salmonella enterica serovar Typhimurium LT2," Nature, Nature, vol. 413(6858), pages 852-856, October.
    5. Tine Hald & David Vose & Henrik C. Wegener & Timour Koupeev, 2004. "A Bayesian Approach to Quantify the Contribution of Animal‐Food Sources to Human Salmonellosis," Risk Analysis, John Wiley & Sons, vol. 24(1), pages 255-269, February.
    6. Leonardo V. de Knegt & Sara M. Pires & Charlotta Löfström & Gitte Sørensen & Karl Pedersen & Mia Torpdahl & Eva M. Nielsen & Tine Hald, 2016. "Application of Molecular Typing Results in Source Attribution Models: The Case of Multiple Locus Variable Number Tandem Repeat Analysis (MLVA) of Salmonella Isolates Obtained from Integrated Surveilla," Risk Analysis, John Wiley & Sons, vol. 36(3), pages 571-588, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. K. Glass & E. Fearnley & H. Hocking & J. Raupach & M. Veitch & L. Ford & M. D. Kirk, 2016. "Bayesian Source Attribution of Salmonellosis in South Australia," Risk Analysis, John Wiley & Sons, vol. 36(3), pages 561-570, March.
    2. Arjan S. Gosal & Janine A. McMahon & Katharine M. Bowgen & Catherine H. Hoppe & Guy Ziv, 2021. "Identifying and Mapping Groups of Protected Area Visitors by Environmental Awareness," Land, MDPI, vol. 10(6), pages 1-14, May.
    3. Francesco Sartor & Jonathan P. Moore & Hans-Peter Kubis, 2021. "Plasma Interleukin-10 and Cholesterol Levels May Inform about Interdependences between Fitness and Fatness in Healthy Individuals," IJERPH, MDPI, vol. 18(4), pages 1-19, February.
    4. Faisal Alsayegh & Moh A Alkhamis & Fatima Ali & Sreeja Attur & Nicholas M Fountain-Jones & Mohammad Zubaid, 2022. "Anemia or other comorbidities? using machine learning to reveal deeper insights into the drivers of acute coronary syndromes in hospital admitted patients," PLOS ONE, Public Library of Science, vol. 17(1), pages 1-15, January.
    5. Franck Ramaharo & Fitiavana Randriamifidy, 2023. "Determinants of renewable energy consumption in Madagascar: Evidence from feature selection algorithms," Papers 2401.13671, arXiv.org.
    6. Sara Saadatmand & Khodakaram Salimifard & Reza Mohammadi & Alex Kuiper & Maryam Marzban & Akram Farhadi, 2023. "Using machine learning in prediction of ICU admission, mortality, and length of stay in the early stage of admission of COVID-19 patients," Annals of Operations Research, Springer, vol. 328(1), pages 1043-1071, September.
    7. Svetlana Kresova & Sebastian Hess, 2022. "Identifying the Determinants of Regional Raw Milk Prices in Russia Using Machine Learning," Agriculture, MDPI, vol. 12(7), pages 1-18, July.
    8. Jukka Ranta & Dmitri Matjushin & Terhi Virtanen & Markku Kuusi & Hildegunn Viljugrein & Merete Hofshagen & Marjaana Hakkinen, 2011. "Bayesian Temporal Source Attribution of Foodborne Zoonoses: Campylobacter in Finland and Norway," Risk Analysis, John Wiley & Sons, vol. 31(7), pages 1156-1171, July.
    9. Kresova, Svetlana & Hess, Sebastian, 2021. "Determinants of Regional Raw Milk Prices in Russia," 61st Annual Conference, Berlin, Germany, September 22-24, 2021 317051, German Association of Agricultural Economists (GEWISOLA).
    10. Tanzeela Khalid & Raphael Aggio & Paul White & Ben De Lacy Costello & Raj Persad & Huda Al-Kateb & Peter Jones & Chris S Probert & Norman Ratcliffe, 2015. "Urinary Volatile Organic Compounds for the Detection of Prostate Cancer," PLOS ONE, Public Library of Science, vol. 10(11), pages 1-15, November.
    11. Gehan A. Mousa & Elsayed A. H. Elamir & Khaled Hussainey, 2022. "Using machine learning methods to predict financial performance: Does disclosure tone matter?," International Journal of Disclosure and Governance, Palgrave Macmillan, vol. 19(1), pages 93-112, March.
    12. Emma L. Snary & Arno N. Swart & Tine Hald, 2016. "Quantitative Microbiological Risk Assessment and Source Attribution for Salmonella: Taking it Further," Risk Analysis, John Wiley & Sons, vol. 36(3), pages 433-436, March.
    13. Carlos Família & Sarah R Dennison & Alexandre Quintas & David A Phoenix, 2015. "Prediction of Peptide and Protein Propensity for Amyloid Formation," PLOS ONE, Public Library of Science, vol. 10(8), pages 1-16, August.
    14. Prabal Das & D. A. Sachindra & Kironmala Chanda, 2022. "Machine Learning-Based Rainfall Forecasting with Multiple Non-Linear Feature Selection Algorithms," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 36(15), pages 6043-6071, December.
    15. Jie Zhao & Ji Chen & Damien Beillouin & Hans Lambers & Yadong Yang & Pete Smith & Zhaohai Zeng & Jørgen E. Olesen & Huadong Zang, 2022. "Global systematic review with meta-analysis reveals yield advantage of legume-based rotations and its drivers," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    16. Piaopiao Chen & Agnès H. Michel & Jianzhi Zhang, 2022. "Transposon insertional mutagenesis of diverse yeast strains suggests coordinated gene essentiality polymorphisms," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    17. Paulo Infante & Gonçalo Jacinto & Anabela Afonso & Leonor Rego & Pedro Nogueira & Marcelo Silva & Vitor Nogueira & José Saias & Paulo Quaresma & Daniel Santos & Patrícia Góis & Paulo Rebelo Manuel, 2023. "Factors That Influence the Type of Road Traffic Accidents: A Case Study in a District of Portugal," Sustainability, MDPI, vol. 15(3), pages 1-16, January.
    18. Ephrem Habyarimana & Faheem S Baloch, 2021. "Machine learning models based on remote and proximal sensing as potential methods for in-season biomass yields prediction in commercial sorghum fields," PLOS ONE, Public Library of Science, vol. 16(3), pages 1-23, March.
    19. Tong, Jianfeng & Liu, Zhenxing & Zhang, Yong & Zheng, Xiujuan & Jin, Junyang, 2023. "Improved multi-gate mixture-of-experts framework for multi-step prediction of gas load," Energy, Elsevier, vol. 282(C).
    20. Banks, Jonathan & Rabbani, Arif & Nadkarni, Kabir & Renaud, Evan, 2020. "Estimating parasitic loads related to brine production from a hot sedimentary aquifer geothermal project: A case study from the Clarke Lake gas field, British Columbia," Renewable Energy, Elsevier, vol. 153(C), pages 539-552.

