
Using affinity propagation clustering for identifying bacterial clades and subclades with whole-genome sequences of Francisella tularensis

Author

Listed:
  • Anne Busch
  • Timo Homeier-Bachmann
  • Mostafa Y Abdel-Glil
  • Anja Hackbart
  • Helmut Hotzel
  • Herbert Tomaso

Abstract

By combining a reference-independent SNP analysis and average nucleotide identity (ANI) with affinity propagation clustering (APC), we developed a significantly improved methodology for resolving phylogenetic relationships based on objective criteria. These bioinformatics tools can serve as a general ruler for determining phylogenetic relationships and clustering of bacteria, as exemplified here with Francisella (F.) tularensis. The molecular epidemiology of F. tularensis is currently assessed mostly with laboratory methods and molecular analysis. Its high evolutionary stability and clonal nature make Francisella ideal for subtyping with single nucleotide polymorphisms (SNPs). Sequencing and real-time PCR can be used to validate the SNP analysis. We investigated whole-genome sequences of 155 F. tularensis subsp. holarctica isolates. Phylogenetic testing was based on SNPs and ANI as reference-independent, alignment-free methods that take both small-scale and large-scale differences within the genomes into account. In particular, the whole-genome SNP analysis with kSNP3.0 allowed us to decipher quite subtle signals of systematic differences in molecular variation. APC resulted in three clusters corresponding to the known clades B.4, B.6, and B.12. These data correlated with the results of real-time PCR assays targeting canSNP loci. Additionally, we detected two subtle sub-clusters. SplitsTree was run with standard settings on the aligned SNPs from Parsnp. Together, APC, HierBAPS, and SplitsTree enabled us to generate hypotheses about epidemiologic relationships between bacterial clusters and to describe the distribution of isolates. Our data indicate that the choice of typing technique can increase our understanding of the pathogenesis and transmission of diseases, with eventual potential for prevention. This opens perspectives for application to other bacterial species.
The data provide evidence that Germany might be the collision zone where clade B.12, also known as the East European clade, overlaps with clade B.6, also known as the Iberian clade. The described methods allow a new, more detailed perspective on F. tularensis subsp. holarctica phylogeny. These results may encourage others to determine phylogenetic relationships and clustering of other bacteria in the same way.

Author summary: By combining a reference-independent SNP analysis and average nucleotide identity (ANI) with affinity propagation clustering (APC), we tested a methodology for resolving phylogenetic relationships based on objective criteria. These bioinformatics tools can serve as a general ruler for determining phylogenetic relationships and clustering of bacteria, as exemplified here with Francisella tularensis. Francisella tularensis causes the zoonosis tularemia. We analyzed the relationships between Francisella tularensis subsp. holarctica isolates from Germany using whole-genome sequences. We chose open-source, reference-independent methods to optimize the level of discrimination. Using a recently described clustering algorithm, we applied a novel approach to the clustering of bacteria. APC can be used for assigning clades and for rapidly typing strains when they arise. Additionally, we detected two sub-clusters. The data provide evidence that Germany is the collision zone where clade B.12, also known as the East European clade, overlaps with clade B.6, also known as the Iberian clade.
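
To illustrate the clustering idea described in the abstract, the following is a minimal sketch of affinity propagation applied to a pairwise SNP similarity matrix. It is not the authors' pipeline: the nine isolate names and SNP profiles are made up for illustration, the similarity measure (negative count of differing SNP positions) and the choice of the median similarity as the exemplar preference are assumptions, and the damping and iteration values are arbitrary. The message-passing updates follow the standard responsibility and availability rules of affinity propagation.

```python
from statistics import median

# Made-up SNP profiles for nine hypothetical isolates (NOT data from the study).
profiles = {
    "iso1": "TT" + "A" * 8, "iso2": "AT" + "A" * 8, "iso3": "TA" + "A" * 8,
    "iso4": "TT" + "C" * 8, "iso5": "AT" + "C" * 8, "iso6": "TA" + "C" * 8,
    "iso7": "TT" + "G" * 8, "iso8": "AT" + "G" * 8, "iso9": "TA" + "G" * 8,
}
names = list(profiles)


def snp_distance(a, b):
    """Number of positions at which two equal-length SNP profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def similarity_matrix(seqs):
    """Similarity = negative SNP distance; diagonal = median similarity (assumed preference)."""
    n = len(seqs)
    S = [[-float(snp_distance(seqs[i], seqs[j])) for j in range(n)] for i in range(n)]
    pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for i in range(n):
        S[i][i] = pref
    return S


def affinity_propagation(S, damping=0.7, max_iter=300):
    """Return (exemplar indices, exemplar index assigned to each point)."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                rival = second if k == best else vals[best]
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k) for i' not in {i,k})
        # a(k,k) = sum of positive r(i',k) for i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        raise RuntimeError("no exemplars found; try another preference or damping")
    # Exemplars label themselves; every other point joins its most similar exemplar.
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


S = similarity_matrix([profiles[name] for name in names])
exemplars, labels = affinity_propagation(S)
clusters = {names[k]: [names[i] for i, lab in enumerate(labels) if lab == k]
            for k in exemplars}
for exemplar_name, members in clusters.items():
    print(exemplar_name, "->", members)
```

In this sketch the number of clusters is not set in advance; it emerges from the preference value on the diagonal of the similarity matrix, which is one reason affinity propagation is attractive when the number of clades is unknown. With real data, the similarity matrix would come from a whole-genome SNP or ANI comparison rather than from toy strings.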

Suggested Citation

  • Anne Busch & Timo Homeier-Bachmann & Mostafa Y Abdel-Glil & Anja Hackbart & Helmut Hotzel & Herbert Tomaso, 2020. "Using affinity propagation clustering for identifying bacterial clades and subclades with whole-genome sequences of Francisella tularensis," PLOS Neglected Tropical Diseases, Public Library of Science, vol. 14(9), pages 1-19, September.
  • Handle: RePEc:plo:pntd00:0008018
    DOI: 10.1371/journal.pntd.0008018

    Download full text from publisher

    File URL: https://journals.plos.org/plosntds/article?id=10.1371/journal.pntd.0008018
    Download Restriction: no

    File URL: https://journals.plos.org/plosntds/article/file?id=10.1371/journal.pntd.0008018&type=printable
    Download Restriction: no

