Printed from https://ideas.repec.org/a/igg/jdwm00/v16y2020i3p168-182.html

Integrating Feature and Instance Selection Techniques in Opinion Mining

Authors

  • Zi-Hung You

    (Department of Nephrology, Chiayi Branch, Taichung Veterans General Hospital, Chiayi, Taiwan)

  • Ya-Han Hu

    (Department of Information Management, National Central University, Taoyuan, Taiwan & Center for Innovative Research on Aging Society (CIRAS), Chiayi, National Chung Cheng University, Taiwan & MOST AI Biomedical Research Center at National Cheng Kung University, Tainan, Taiwan)

  • Chih-Fong Tsai

    (Department of Information Management, National Central University, Taiwan)

  • Yen-Ming Kuo

    (Department of Information Management, National Chung Cheng University, Chiayi, Taiwan)

Abstract

Opinion mining focuses on extracting polarity information from texts. For textual term representation, different feature selection methods, e.g., term frequency (TF) or term frequency–inverse document frequency (TF–IDF), can yield different numbers of text features. In text classification, however, a selected training set may contain noisy documents (or outliers), which can degrade classification performance. To address this problem, instance selection can be adopted to filter out unrepresentative training documents. This article therefore investigates opinion mining performance when feature selection and instance selection are applied together. Two combination processes, which perform feature selection and instance selection in different orders, were compared. Specifically, two feature selection methods, namely TF and TF–IDF, and two instance selection methods, namely DROP3 and IB3, were employed for comparison. Experiments in which sentiment classifiers were developed on three Twitter datasets showed that TF–IDF followed by DROP3 performs best.
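
The two processing orders compared in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: feature selection is taken to mean keeping the terms with the largest aggregate TF or TF–IDF weight, and Wilson's edited nearest neighbour (ENN) rule stands in for DROP3 and IB3, which are more elaborate. All function names and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_scores(docs):
    """Aggregate term frequency of each term over all tokenized documents."""
    return Counter(t for d in docs for t in d)

def tf_idf_scores(docs):
    """Sum of per-document TF-IDF weights of each term (natural-log IDF)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    scores = Counter()
    for d in docs:
        for t, c in Counter(d).items():
            scores[t] += c * math.log(n / df[t])
    return scores

def select_features(docs, n_terms, scorer):
    """Keep the n_terms highest-scoring terms; ties are broken alphabetically."""
    scores = scorer(docs)
    return sorted(scores, key=lambda t: (-scores[t], t))[:n_terms]

def vectorize(docs, vocab):
    """Represent each document as raw term counts over vocab."""
    rows = []
    for d in docs:
        counts = Counter(d)
        rows.append([counts[t] for t in vocab])
    return rows

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def enn_filter(X, y, k=3):
    """Wilson's ENN (a stand-in for DROP3/IB3, not either of them):
    keep the indices whose label matches the majority of their k nearest neighbours."""
    keep = []
    for i, xi in enumerate(X):
        nearest = sorted((sq_dist(xi, xj), j) for j, xj in enumerate(X) if j != i)[:k]
        votes = Counter(y[j] for _, j in nearest)
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return keep

def fs_then_is(docs, labels, n_terms, scorer, k=3):
    """Order 1: select features first, then filter instances in the reduced space."""
    vocab = select_features(docs, n_terms, scorer)
    keep = enn_filter(vectorize(docs, vocab), labels, k)
    return vocab, keep

def is_then_fs(docs, labels, n_terms, scorer, k=3):
    """Order 2: filter instances in the full space, then select features on what remains."""
    full_vocab = sorted({t for d in docs for t in d})
    keep = enn_filter(vectorize(docs, full_vocab), labels, k)
    vocab = select_features([docs[i] for i in keep], n_terms, scorer)
    return vocab, keep

# Hypothetical toy data: the last document is deliberately mislabeled noise.
docs = [s.split() for s in [
    "good great love", "good great", "great love good", "love good",
    "bad awful hate", "bad awful", "awful hate bad", "hate bad",
    "good great love",
]]
labels = ["pos"] * 4 + ["neg"] * 4 + ["neg"]

vocab_a, keep_a = fs_then_is(docs, labels, 4, tf_idf_scores)
vocab_b, keep_b = is_then_fs(docs, labels, 4, tf_idf_scores)
```

On this toy set both orders discard the mislabeled final document, yet the selected vocabularies need not coincide, because in the second order the term scores are computed only after the noisy document has been removed. Passing `tf_scores` instead of `tf_idf_scores` switches the feature selection criterion.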

Suggested Citation

  • Zi-Hung You & Ya-Han Hu & Chih-Fong Tsai & Yen-Ming Kuo, 2020. "Integrating Feature and Instance Selection Techniques in Opinion Mining," International Journal of Data Warehousing and Mining (IJDWM), IGI Global, vol. 16(3), pages 168-182, July.
  • Handle: RePEc:igg:jdwm00:v:16:y:2020:i:3:p:168-182

    Download full text from publisher

    File URL: http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/IJDWM.2020070109
    Download Restriction: no