
Learning naive Bayes classifiers from positive and unlabelled examples with uncertainty

Author

Listed:
  • Jiazhen He
  • Yang Zhang
  • Xue Li
  • Peng Shi

Abstract

Traditional classification algorithms require a large number of labelled examples from all the predefined classes, which are generally difficult and time-consuming to obtain. Furthermore, data uncertainty is prevalent in many real-world applications, such as sensor networks, market analysis and medical diagnosis. In this article, we explore the issue of classification on uncertain data when only positive and unlabelled examples are available. We propose an algorithm to build a naive Bayes classifier from positive and unlabelled examples with uncertainty. However, the algorithm requires the prior probability of the positive class, and it is generally difficult for the user to provide this parameter in practice. Two approaches are proposed to avoid this user-specified parameter. One approach is to use a validation set to search for an appropriate value for this parameter, and the other is to estimate it directly. Our extensive experiments show that the two approaches can generally achieve satisfactory classification performance on uncertain data. In addition, by exploiting the uncertainty in the dataset, our algorithm can potentially achieve better classification performance than traditional naive Bayes, which ignores uncertainty when handling uncertain data.
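
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each uncertain categorical attribute is given as a probability vector over its possible values, that counts are replaced by expected counts, and that the negative-class likelihood is recovered from the identity P(v) = p P(v | pos) + (1 - p) P(v | neg), where p is the user-supplied prior probability of the positive class. All function names and parameters (`expected_freq`, `train_pu_naive_bayes`, `predict_pos_prob`, `prior_pos`, `smooth`) are hypothetical.

```python
import math


def expected_freq(examples, attr, n_values, smooth=1.0):
    """Laplace-smoothed value frequencies for one attribute.

    Each example is a list of attributes; each attribute is a probability
    vector over its values (assumed representation of uncertainty).
    Counts are expected counts: the probabilities are summed.
    """
    totals = [smooth] * n_values
    for ex in examples:
        for value, prob in enumerate(ex[attr]):
            totals[value] += prob
    norm = sum(totals)
    return [t / norm for t in totals]


def train_pu_naive_bayes(positive, unlabelled, n_values, prior_pos):
    """Return, per attribute, the pair (P(v | pos), P(v | neg))."""
    model = []
    for attr, k in enumerate(n_values):
        p_pos = expected_freq(positive, attr, k)
        p_all = expected_freq(unlabelled, attr, k)
        # From P(v) = p * P(v|pos) + (1 - p) * P(v|neg); clip at a small
        # floor so the estimate stays positive, then renormalise.
        raw = [max(pa - prior_pos * pp, 1e-6) for pa, pp in zip(p_all, p_pos)]
        z = sum(raw)
        model.append((p_pos, [r / z for r in raw]))
    return model


def predict_pos_prob(model, prior_pos, example):
    """Posterior probability that an uncertain example is positive."""
    log_pos = math.log(prior_pos)
    log_neg = math.log(1.0 - prior_pos)
    for (p_pos, p_neg), dist in zip(model, example):
        # Likelihood of an uncertain attribute: expectation over its values.
        log_pos += math.log(sum(q * p for q, p in zip(dist, p_pos)))
        log_neg += math.log(sum(q * p for q, p in zip(dist, p_neg)))
    m = max(log_pos, log_neg)
    e_pos, e_neg = math.exp(log_pos - m), math.exp(log_neg - m)
    return e_pos / (e_pos + e_neg)


# Toy data: one binary attribute, made up purely for illustration.
positive = [[[0.9, 0.1]], [[0.8, 0.2]], [[1.0, 0.0]]]
unlabelled = [[[0.9, 0.1]], [[0.1, 0.9]], [[0.0, 1.0]], [[0.2, 0.8]]]
prior = 0.3  # assumed prior probability of the positive class
model = train_pu_naive_bayes(positive, unlabelled, [2], prior)
likely_pos = predict_pos_prob(model, prior, [[1.0, 0.0]])
likely_neg = predict_pos_prob(model, prior, [[0.0, 1.0]])
```

In this sketch the prior is passed in explicitly, which mirrors the limitation the abstract describes: the result depends on a parameter the user rarely knows. The two remedies mentioned in the abstract (searching over candidate values on a validation set, or estimating the prior directly) would replace the hard-coded `prior`; they are not reproduced here because the abstract gives no detail on them.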

Suggested Citation

  • Jiazhen He & Yang Zhang & Xue Li & Peng Shi, 2012. "Learning naive Bayes classifiers from positive and unlabelled examples with uncertainty," International Journal of Systems Science, Taylor & Francis Journals, vol. 43(10), pages 1805-1825.
  • Handle: RePEc:taf:tsysxx:v:43:y:2012:i:10:p:1805-1825
    DOI: 10.1080/00207721.2011.627475

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1080/00207721.2011.627475
    Download Restriction: Access to full text is restricted to subscribers.


    Citations


    Cited by:

    1. Akshay Kangale & S. Krishna Kumar & Mohd Arshad Naeem & Mark Williams & M. K. Tiwari, 2016. "Mining consumer reviews to generate ratings of different product attributes while producing feature-based review-summary," International Journal of Systems Science, Taylor & Francis Journals, vol. 47(13), pages 3272-3286, October.
    2. Chunquan Liang & Yang Zhang & Peng Shi & Zhengguo Hu, 2015. "Learning accurate very fast decision trees from uncertain data streams," International Journal of Systems Science, Taylor & Francis Journals, vol. 46(16), pages 3032-3050, December.
    3. Oliver Takawira & John W. Muteba Mwamba, 2020. "Determinants of Sovereign Credit Ratings: An Application of the Naïve Bayes Classifier," Eurasian Journal of Economics and Finance, Eurasian Publications, vol. 8(4), pages 279-299.
