
SVM-Based Prediction of Propeptide Cleavage Sites in Spider Toxins Identifies Toxin Innovation in an Australian Tarantula

Author

Listed:
  • Emily S W Wong
  • Margaret C Hardy
  • David Wood
  • Timothy Bailey
  • Glenn F King

Abstract

Spider neurotoxins are commonly used as pharmacological tools and are a popular source of novel compounds with therapeutic and agrochemical potential. Since venom peptides are inherently toxic, the host spider must employ strategies to avoid adverse effects prior to venom use. It is partly for this reason that most spider toxins encode a protective proregion that upon enzymatic cleavage is excised from the mature peptide. In order to identify the mature toxin sequence directly from toxin transcripts, without resorting to protein sequencing, the propeptide cleavage site in the toxin precursor must be predicted bioinformatically. We evaluated different machine learning strategies (support vector machines, hidden Markov model and decision tree) and developed an algorithm (SpiderP) for prediction of propeptide cleavage sites in spider toxins. Our strategy uses a support vector machine (SVM) framework that combines both local and global sequence information. Our method is superior or comparable to current tools for prediction of propeptide sequences in spider toxins. Evaluation of the SVM method on an independent test set of known toxin sequences yielded 96% sensitivity and 100% specificity. Furthermore, we sequenced five novel peptides (not used to train the final predictor) from the venom of the Australian tarantula Selenotypus plumipes to test the accuracy of the predictor and found 80% sensitivity and 99.6% 8-mer specificity. Finally, we used the predictor together with homology information to predict and characterize seven groups of novel toxins from the deeply sequenced venom gland transcriptome of S. plumipes, which revealed structural complexity and innovations in the evolution of the toxins. The precursor prediction tool (SpiderP) is freely available on ArachnoServer (http://www.arachnoserver.org/spiderP.html), a web portal to a comprehensive relational database of spider toxins. 
All training data, test data, and scripts used are available from the SpiderP website.
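
The abstract mentions two ideas that a small example may make more concrete: features that combine local and global sequence information, and sensitivity/specificity measured over candidate windows. The sketch below is an illustration only, not the SpiderP implementation. It assumes that "8-mer" refers to 8-residue windows slid along the precursor, that local information is a one-hot encoding of the window, and that global information is the window's relative position and the precursor length. The function names and the choice of global features are hypothetical. The resulting vectors could be passed to any SVM library.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def window_features(seq: str, start: int, k: int = 8) -> List[float]:
    """Encode the k-mer starting at `start` as local plus global features."""
    window = seq[start:start + k]
    # Local part: one-hot encoding, 20 values per residue in the window.
    local = [1.0 if aa == ref else 0.0 for aa in window for ref in AMINO_ACIDS]
    # Global part (assumed for illustration): relative position and length.
    global_part = [start / len(seq), float(len(seq))]
    return local + global_part


def candidate_windows(seq: str, k: int = 8) -> List[Tuple[int, List[float]]]:
    """Return (start index, feature vector) for every k-mer in `seq`."""
    return [(i, window_features(seq, i, k)) for i in range(len(seq) - k + 1)]


def sensitivity_specificity(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Compute sensitivity TP/(TP+FN) and specificity TN/(TN+FP) for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


if __name__ == "__main__":
    toy = "ACDEFGHIKLMN"  # toy sequence, not a real toxin precursor
    feats = candidate_windows(toy)
    print(len(feats), len(feats[0][1]))
    print(sensitivity_specificity([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]))
```

Because almost every window in a precursor is a non-cleavage site, negatives greatly outnumber positives in this framing, which is one reason a window-level specificity can be very high even when a cleavage site is missed.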

Suggested Citation

  • Emily S W Wong & Margaret C Hardy & David Wood & Timothy Bailey & Glenn F King, 2013. "SVM-Based Prediction of Propeptide Cleavage Sites in Spider Toxins Identifies Toxin Innovation in an Australian Tarantula," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-11, July.
  • Handle: RePEc:plo:pone00:0066279
    DOI: 10.1371/journal.pone.0066279

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0066279
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0066279&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0066279?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alaa Tharwat & Aboul Ella Hassanien, 2019. "Quantum-Behaved Particle Swarm Optimization for Parameter Optimization of Support Vector Machine," Journal of Classification, Springer;The Classification Society, vol. 36(3), pages 576-598, October.
    2. Lior Shamir & John D Delaney & Nikita Orlov & D Mark Eckley & Ilya G Goldberg, 2010. "Pattern Recognition Software and Techniques for Biological Image Analysis," PLOS Computational Biology, Public Library of Science, vol. 6(11), pages 1-10, November.
    3. Kay H Brodersen & Thomas M Schofield & Alexander P Leff & Cheng Soon Ong & Ekaterina I Lomakina & Joachim M Buhmann & Klaas E Stephan, 2011. "Generative Embedding for Model-Based Classification of fMRI Data," PLOS Computational Biology, Public Library of Science, vol. 7(6), pages 1-19, June.
    4. Shweta Bhandare & Debra S Goldberg & Robin Dowell, 2017. "Discriminating between HuR and TTP binding sites using the k-spectrum kernel method," PLOS ONE, Public Library of Science, vol. 12(3), pages 1-14, March.
    5. Wei Shui & Yiyi Zhang & Xinggui Wang & Yuanmeng Liu & Qianfeng Wang & Fei Duan & Chaowei Wu & Wanyu Shui, 2022. "Does Tibetan Household Livelihood Capital Enhance Tourism Participation Sustainability? Evidence from China’s Jiaju Tibetan Village," IJERPH, MDPI, vol. 19(15), pages 1-15, July.
    6. Marina M -C Vidovic & Nico Görnitz & Klaus-Robert Müller & Gunnar Rätsch & Marius Kloft, 2015. "SVM2Motif—Reconstructing Overlapping DNA Sequence Motifs by Mimicking an SVM Predictor," PLOS ONE, Public Library of Science, vol. 10(12), pages 1-23, December.
    7. Emili Balaguer-Ballester & Christopher C Lapish & Jeremy K Seamans & Daniel Durstewitz, 2011. "Attracting Dynamics of Frontal Cortex Ensembles during Memory-Guided Decision-Making," PLOS Computational Biology, Public Library of Science, vol. 7(5), pages 1-19, May.
    8. A Ivanenko & P Watkins & M A J van Gerven & K Hammerschmidt & B Englitz, 2020. "Classifying sex and strain from mouse ultrasonic vocalizations using deep learning," PLOS Computational Biology, Public Library of Science, vol. 16(6), pages 1-27, June.
    9. Yue Deng & Yanyu Zhao & Yebin Liu & Qionghai Dai, 2013. "Differences Help Recognition: A Probabilistic Interpretation," PLOS ONE, Public Library of Science, vol. 8(6), pages 1-10, June.
    10. Charlotte Soneson & Sarah Gerster & Mauro Delorenzi, 2014. "Batch Effect Confounding Leads to Strong Bias in Performance Estimates Obtained by Cross-Validation," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-13, June.
    11. Igor O Korolev & Laura L Symonds & Andrea C Bozoki & Alzheimer's Disease Neuroimaging Initiative, 2016. "Predicting Progression from Mild Cognitive Impairment to Alzheimer's Dementia Using Clinical, MRI, and Plasma Biomarkers via Probabilistic Pattern Classification," PLOS ONE, Public Library of Science, vol. 11(2), pages 1-25, February.
    12. Stephen J Gilmore, 2018. "Automated decision support in melanocytic lesion management," PLOS ONE, Public Library of Science, vol. 13(9), pages 1-15, September.
    13. Juan A G Ranea & Ian Morilla & Jon G Lees & Adam J Reid & Corin Yeats & Andrew B Clegg & Francisca Sanchez-Jimenez & Christine Orengo, 2010. "Finding the “Dark Matter” in Human and Yeast Protein Network Prediction and Modelling," PLOS Computational Biology, Public Library of Science, vol. 6(9), pages 1-14, September.
    14. S. Camelo & M. González-Lima & A. Quiroz, 2015. "Nearest neighbors methods for support vector machines," Annals of Operations Research, Springer, vol. 235(1), pages 85-101, December.
    15. Takaya Saito & Marc Rehmsmeier, 2015. "The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets," PLOS ONE, Public Library of Science, vol. 10(3), pages 1-21, March.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.