
Predicting Protein-Protein Interactions from Primary Protein Sequences Using a Novel Multi-Scale Local Feature Representation Scheme and the Random Forest

Author

Listed:
  • Zhu-Hong You
  • Keith C C Chan
  • Pengwei Hu

Abstract

The study of protein-protein interactions (PPIs) is important for understanding biological cellular functions. However, detecting PPIs in the laboratory is both time-consuming and expensive. For this reason, there has been much recent effort to develop techniques for the computational prediction of PPIs, as these can complement laboratory procedures and provide an inexpensive way of predicting the most likely set of interactions at the scale of the entire proteome. Although much progress has already been achieved in this direction, the problem is still far from solved, and more effective approaches are required to overcome the limitations of current ones. In this study, a novel Multi-scale Local Descriptor (MLD) feature representation scheme is proposed to extract features from a protein sequence. This scheme captures multi-scale local information by varying the length of protein-sequence segments. Based on the MLD, an ensemble learning method, the Random Forest (RF), is used as the classifier. The MLD feature representation scheme facilitates the mining of interaction information from multi-scale continuous amino acid segments, making it easier to capture multiple overlapping continuous binding patterns within a protein sequence. When the proposed method is tested on PPI data from Saccharomyces cerevisiae, it achieves a prediction accuracy of 94.72% with 94.34% sensitivity at a precision of 98.91%. Extensive experiments are performed to compare our method with existing sequence-based methods. Experimental results show that our predictor also outperforms several other state-of-the-art predictors on the H. pylori dataset. These results can largely be credited to the learning capabilities of the RF model and the novel MLD feature representation scheme. The experimental results show that the proposed approach is promising for predicting PPIs and can be a useful tool for future proteomic studies.
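
The abstract describes the MLD scheme only in outline, so the following is a simplified sketch of the idea of "varying the length of protein-sequence segments", not the authors' exact descriptor. It assumes a sequence is split into four equal segments, that every run of one, two, or three adjacent segments forms a region, and that each region is summarized by the composition of seven amino acid groups. The grouping, the segment count, and all function names (`split_segments`, `multi_scale_regions`, `mld_like_features`, `pair_features`) are illustrative assumptions.

```python
from typing import List

# Assumed 7-class grouping of the 20 standard amino acids; the abstract
# does not specify a grouping, so this choice is illustrative only.
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_GROUP = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def split_segments(seq: str, n_segments: int = 4) -> List[str]:
    """Split seq into n_segments contiguous pieces of nearly equal length."""
    bounds = [round(i * len(seq) / n_segments) for i in range(n_segments + 1)]
    return [seq[bounds[i]:bounds[i + 1]] for i in range(n_segments)]


def multi_scale_regions(seq: str, n_segments: int = 4, max_scale: int = 3) -> List[str]:
    """Build overlapping regions from 1..max_scale adjacent segments."""
    segments = split_segments(seq, n_segments)
    regions = []
    for scale in range(1, max_scale + 1):
        for start in range(n_segments - scale + 1):
            regions.append("".join(segments[start:start + scale]))
    return regions


def composition(region: str) -> List[float]:
    """Fraction of residues in each group; unknown residues are ignored."""
    counts = [0] * len(GROUPS)
    for aa in region:
        idx = AA_TO_GROUP.get(aa)
        if idx is not None:
            counts[idx] += 1
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def mld_like_features(seq: str) -> List[float]:
    """Concatenate the group composition of every multi-scale region."""
    features: List[float] = []
    for region in multi_scale_regions(seq):
        features.extend(composition(region))
    return features


def pair_features(seq_a: str, seq_b: str) -> List[float]:
    """Feature vector for a candidate protein pair (simple concatenation)."""
    return mld_like_features(seq_a) + mld_like_features(seq_b)
```

With these assumptions each protein yields 9 regions and 63 features, so a protein pair yields a 126-dimensional vector. Such vectors, labeled as interacting or non-interacting, could then be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`.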

Suggested Citation

  • Zhu-Hong You & Keith C C Chan & Pengwei Hu, 2015. "Predicting Protein-Protein Interactions from Primary Protein Sequences Using a Novel Multi-Scale Local Feature Representation Scheme and the Random Forest," PLOS ONE, Public Library of Science, vol. 10(5), pages 1-19, May.
  • Handle: RePEc:plo:pone00:0125811
    DOI: 10.1371/journal.pone.0125811

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0125811
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0125811&type=printable
    Download Restriction: no



    Citations

    Cited by:

    1. Ingoo Lee & Jongsoo Keum & Hojung Nam, 2019. "DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences," PLOS Computational Biology, Public Library of Science, vol. 15(6), pages 1-21, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xinyi Liu & Bin Liu & Zhimin Huang & Ting Shi & Yingyi Chen & Jian Zhang, 2012. "SPPS: A Sequence-Based Method for Predicting Probability of Protein-Protein Interaction Partners," PLOS ONE, Public Library of Science, vol. 7(1), pages 1-6, January.
    2. Chuanhua Xing & David B Dunson, 2011. "Bayesian Inference for Genomic Data Integration Reduces Misclassification Rate in Predicting Protein-Protein Interactions," PLOS Computational Biology, Public Library of Science, vol. 7(7), pages 1-10, July.
    3. Saket Navlakha & Anthony Gitter & Ziv Bar-Joseph, 2012. "A Network-based Approach for Predicting Missing Pathway Interactions," PLOS Computational Biology, Public Library of Science, vol. 8(8), pages 1-13, August.
    4. Saeid Rasti & Chrysafis Vogiatzis, 2019. "A survey of computational methods in protein–protein interaction networks," Annals of Operations Research, Springer, vol. 276(1), pages 35-87, May.
    5. Guilherme T Valente & Marcio L Acencio & Cesar Martins & Ney Lemke, 2013. "The Development of a Universal In Silico Predictor of Protein-Protein Interactions," PLOS ONE, Public Library of Science, vol. 8(5), pages 1-11, May.
    6. Wei Zhang & Jia Xu & Yuanyuan Li & Xiufen Zou, 2017. "A new two-stage method for revealing missing parts of edges in protein-protein interaction networks," PLOS ONE, Public Library of Science, vol. 12(5), pages 1-22, May.
    7. Jana Kludas & Mikko Arvas & Sandra Castillo & Tiina Pakula & Merja Oja & Céline Brouard & Jussi Jäntti & Merja Penttilä & Juho Rousu, 2016. "Machine Learning of Protein Interactions in Fungal Secretory Pathways," PLOS ONE, Public Library of Science, vol. 11(7), pages 1-20, July.
    8. Hai-Bo Zhang & Xiao-Bao Ding & Jie Jin & Wen-Ping Guo & Qiao-Lei Yang & Peng-Cheng Chen & Heng Yao & Li Ruan & Yu-Tian Tao & Xin Chen, 2022. "Predicted mouse interactome and network-based interpretation of differentially expressed genes," PLOS ONE, Public Library of Science, vol. 17(4), pages 1-16, April.
    9. Vijaykumar Yogesh Muley & Akash Ranjan, 2012. "Effect of Reference Genome Selection on the Performance of Computational Methods for Genome-Wide Protein-Protein Interaction Prediction," PLOS ONE, Public Library of Science, vol. 7(7), pages 1-13, July.
