Arabic Text Mining Using Rule Based Classification

Author

Listed:
  • Fadi Thabtah

    (MIS Department, Philadelphia University, Amman, Jordan)

  • Omar Gharaibeh

    (Computing and IS Department, Brunel University, UK)

  • Rashid Al-Zubaidy

    (CIS Department, Philadelphia University, Amman, Jordan)

Abstract

Text classification, a well-known problem in text mining, concerns mapping textual documents into one or more predefined categories based on their content. The field has recently attracted many researchers because of the massive amounts of online documents and text archives that hold essential information for decision making. Most of this research focuses on classifying English documents, while only a limited number of studies address other languages such as Arabic. This paper therefore investigates the problem of Arabic text classification comprehensively. More specifically, the study measures the performance of different rule-based classification approaches, adopted from machine learning and data mining, on the problem of Arabic text classification. In particular, four rule-based classification approaches are evaluated against the published Corpus of Contemporary Arabic text collection: decision trees (C4.5), rule induction (RIPPER), hybrid (PART), and simple rule (One Rule). The experiments are carried out using a modified version of the WEKA business intelligence tool. By analysing the results, we determine the most suitable classification algorithms for classifying Arabic texts.
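
To make the simplest of the four approaches concrete, the following is a minimal sketch of the One Rule idea in plain Python: for each feature, build a rule that maps every observed value to its majority class, then keep the single feature whose rule makes the fewest training errors. It is not the authors' code and not WEKA's implementation. The function names, the three term-presence features, and the two category labels are all hypothetical, and the toy data merely stands in for a real document collection.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each row records presence (1) or absence (0) of three
# illustrative terms in a document; LABELS holds the category of each document.
ROWS = [(1, 0, 1), (1, 0, 0), (0, 1, 0), (0, 1, 1), (1, 1, 0), (0, 0, 0)]
LABELS = ["sports", "sports", "economics", "economics", "sports", "economics"]


def train_one_r(rows, labels):
    """Return (feature_index, rule, default_label) for the best single feature."""
    # Fallback prediction for feature values never seen during training.
    default_label = Counter(labels).most_common(1)[0][0]
    best = None
    for f in range(len(rows[0])):
        # Count class labels for every observed value of feature f.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[f]][label] += 1
        # Each feature value predicts its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Training errors made by this one-feature rule.
        errors = sum(rule[row[f]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, f, rule)
    _, feature, rule = best
    return feature, rule, default_label


def predict_one_r(model, row):
    """Classify one row using the selected feature's rule."""
    feature, rule, default_label = model
    return rule.get(row[feature], default_label)


model = train_one_r(ROWS, LABELS)
prediction = predict_one_r(model, (1, 0, 0))
```

C4.5, RIPPER and PART produce richer rule sets than this single-feature baseline, which is why One Rule usually serves as a reference point in comparisons of this kind.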

Suggested Citation

  • Fadi Thabtah & Omar Gharaibeh & Rashid Al-Zubaidy, 2012. "Arabic Text Mining Using Rule Based Classification," Journal of Information & Knowledge Management (JIKM), World Scientific Publishing Co. Pte. Ltd., vol. 11(01), pages 1-10.
  • Handle: RePEc:wsi:jikmxx:v:11:y:2012:i:01:n:s0219649212500062
    DOI: 10.1142/S0219649212500062