Printed from https://ideas.repec.org/a/gam/jsusta/v7y2015i3p2338-2352d46200.html

SAW Classification Algorithm for Chinese Text Classification

Author

Listed:
  • Xiaoli Guo

    (School of Information Engineering, Northeast Dianli University, Jilin 132012, China)

  • Huiyu Sun

    (School of Information Engineering, Northeast Dianli University, Jilin 132012, China)

  • Tiehua Zhou

    (Database/Bioinformatics Laboratory, Chungbuk National University, Chungbuk 362-763, Korea)

  • Ling Wang

    (School of Information Engineering, Northeast Dianli University, Jilin 132012, China)

  • Zhaoyang Qu

    (School of Information Engineering, Northeast Dianli University, Jilin 132012, China)

  • Jiannan Zang

    (School of Information Engineering, Northeast Dianli University, Jilin 132012, China)

Abstract

With the explosive growth of data, the increasing volume of text places higher demands on text categorization performance, which existing classification methods cannot meet. Based on a study of existing text classification technology and semantics, this paper proposes SAW (Structural Auxiliary Word), a classification algorithm oriented toward Chinese text. The algorithm exploits the special spatial effect of Chinese text, in which words carry implied correlations, to mine text information and perform high-correlation matching for text categorization. Experiments show that the SAW classification algorithm significantly improves both classification precision and recall, markedly improving information retrieval performance and providing an effective means of using data for information extraction in the era of big data.

Suggested Citation

  • Xiaoli Guo & Huiyu Sun & Tiehua Zhou & Ling Wang & Zhaoyang Qu & Jiannan Zang, 2015. "SAW Classification Algorithm for Chinese Text Classification," Sustainability, MDPI, vol. 7(3), pages 1-15, February.
  • Handle: RePEc:gam:jsusta:v:7:y:2015:i:3:p:2338-2352:d:46200

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/7/3/2338/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/7/3/2338/
    Download Restriction: no

    Citations

    Cited by:

    1. Taeyeoun Roh & Yujin Jeong & Byungun Yoon, 2017. "Developing a Methodology of Structuring and Layering Technological Information in Patent Documents through Natural Language Processing," Sustainability, MDPI, vol. 9(11), pages 1-19, November.
    2. Ling Wang & Gongliang Hu & Tiehua Zhou, 2018. "Semantic Analysis of Learners’ Emotional Tendencies on Online MOOC Education," Sustainability, MDPI, vol. 10(6), pages 1-19, June.
