
A discriminative and semantic feature selection method for text categorization

Authors

  • Zong, Wei
  • Wu, Feng
  • Chu, Lap-Keung
  • Sculli, Domenic

Abstract

Text categorization is an important and critical task in the current era of high volume data storage and handling. Feature selection is obviously one of the most important steps in text categorization. Traditional feature selection methods tend to only consider the correlation between features and categories, and have in the main ignored the semantic similarity between features and documents. To further explore this issue, this paper proposes a novel feature selection method that first selects features in documents with discriminative power and then computes the semantic similarity between features and documents. The proposed feature selection method is tested using a support vector machine (SVM) classifier upon two published datasets, viz. Reuters-21578 and 20-Newsgroups. The experimental results show that the proposed feature selection method generally outperforms the traditional feature selection methods for text categorization for both published datasets.
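The abstract does not spell out the proposed method, so the sketch below does not reproduce it. It only illustrates the kind of traditional baseline the abstract refers to: scoring each term by its correlation with a category, here using the standard chi-square statistic. The toy documents, the function names `chi_square` and `select_features`, and the choice of the maximum score across categories are all assumptions made for illustration.

```python
def chi_square(term, category, docs):
    """Chi-square statistic for one (term, category) pair.

    docs is a list of (set_of_terms, label) pairs (a hypothetical toy format).
    """
    a = b = c = d = 0
    for terms, label in docs:
        has_term = term in terms
        in_cat = label == category
        if has_term and in_cat:
            a += 1          # term present, document in category
        elif has_term:
            b += 1          # term present, document in another category
        elif in_cat:
            c += 1          # term absent, document in category
        else:
            d += 1          # term absent, document in another category
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def select_features(docs, k):
    """Keep the k terms with the highest chi-square score over all categories."""
    vocab = set().union(*(terms for terms, _ in docs))
    labels = {label for _, label in docs}
    scores = {t: max(chi_square(t, c, docs) for c in labels) for t in vocab}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


# Toy corpus (invented for illustration, not taken from Reuters-21578).
docs = [
    ({"oil", "price", "market"}, "crude"),
    ({"oil", "barrel", "market"}, "crude"),
    ({"wheat", "harvest", "market"}, "grain"),
    ({"wheat", "price", "export"}, "grain"),
]
top = select_features(docs, 2)
```

A score of this kind looks only at term and category co-occurrence counts, which is the limitation the abstract attributes to traditional methods; the semantic similarity step described in the paper would be an additional stage that this sketch leaves out.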

Suggested Citation

  • Zong, Wei & Wu, Feng & Chu, Lap-Keung & Sculli, Domenic, 2015. "A discriminative and semantic feature selection method for text categorization," International Journal of Production Economics, Elsevier, vol. 165(C), pages 215-222.
  • Handle: RePEc:eee:proeco:v:165:y:2015:i:c:p:215-222
    DOI: 10.1016/j.ijpe.2014.12.035

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0925527314004290
    Download Restriction: Full text for ScienceDirect subscribers only



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wang, Binni & Wang, Pong & Tu, Yiliu, 2021. "Customer satisfaction service match and service quality-based blockchain cloud manufacturing," International Journal of Production Economics, Elsevier, vol. 240(C).
    2. Yang Liu & Naiwei Lu & Xinfeng Yin & Mohammad Noori, 2016. "An adaptive support vector regression method for structural system reliability assessment and its application to a cable-stayed bridge," Journal of Risk and Reliability, vol. 230(2), pages 204-219, April.
    3. Fujimoto, Takahiro & Park, Young Won, 2014. "Balancing supply chain competitiveness and robustness through “virtual dual sourcing”: Lessons from the Great East Japan Earthquake," International Journal of Production Economics, Elsevier, vol. 147(PB), pages 429-436.
    4. Amani, Farzaneh A. & Fadlalla, Adam M., 2017. "Data mining applications in accounting: A review of the literature and organizing framework," International Journal of Accounting Information Systems, Elsevier, vol. 24(C), pages 32-58.
    5. Du, Shichang & Lv, Jun, 2013. "Minimal Euclidean distance chart based on support vector regression for monitoring mean shifts of auto-correlated processes," International Journal of Production Economics, Elsevier, vol. 141(1), pages 377-387.
    6. Isaac Kofi Nti & Adebayo Felix Adekoya & Benjamin Asubam Weyori & Owusu Nyarko-Boateng, 2022. "Applications of artificial intelligence in engineering and manufacturing: a systematic review," Journal of Intelligent Manufacturing, Springer, vol. 33(6), pages 1581-1601, August.
    7. Loyer, Jean-Loup & Henriques, Elsa & Fontul, Mihail & Wiseall, Steve, 2016. "Comparison of Machine Learning methods applied to the estimation of manufacturing cost of jet engine components," International Journal of Production Economics, Elsevier, vol. 178(C), pages 109-119.
    8. Bodendorf, Frank & Xie, Qiao & Merkl, Philipp & Franke, Jörg, 2022. "A multi-perspective approach to support collaborative cost management in supplier-buyer dyads," International Journal of Production Economics, Elsevier, vol. 245(C).
    9. Hong-Sen Yan & Yu-Fang Wang, 2019. "Matching decision method for knowledgeable manufacturing system and its production environment," Journal of Intelligent Manufacturing, Springer, vol. 30(2), pages 771-782, February.
    10. Ayşe Nur Adıgüzel Tüylü & Ergün Eroğlu, 2019. "Using Machine Learning Algorithms For Forecasting Rate of Return Product In Reverse Logistics Process," Alphanumeric Journal, Bahadir Fatih Yildirim, vol. 7(1), pages 143-156, June.
