
Arabic Clustering Through Advanced Stemming and WordNet-Based Extraction for Water Cycle Cluster

Authors

Listed:
  • Deema Mohammed Alsekait

    (Department of Computer Science and Information Technology, Applied College, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia)

  • Jaffar Atwan

    (Department of Computer Information System, Prince Abdullah Bin Ghazi Faculty of ICT, Al-Balqa Applied University, Al-Salt, Jordan)

  • Qusay Bsoul

    (Cybersecurity Department, College of Computer Sciences and Informatics, Amman Arab University, Amman, Jordan)

  • Sharaf Alzoubi

    (College of Computer Sciences and Informatics, Amman Arab University, Amman, Jordan)

  • Hanaa Fathi

    (Applied Science Research Center, Applied Science Private University, Amman, Jordan)

  • Malik Jawarneh

    (College of Computer Sciences and Informatics, Amman Arab University, Amman, Jordan)

  • Abeer Saber

    (Benha University, Egypt)

  • Diaa Salama AbdElminaam

    (MEU Research Unit, Middle East University, Amman, Jordan & Jadara Research Center, Jadara University, Irbid, Jordan)

Abstract

Natural language processing represents human language computationally, with one aim being the extraction of important words. In Arabic, verbs and nouns are particularly useful for distinguishing class labels in machine learning, specifically in 'Arabic Clustering'. This paper extracts verbs and nouns from the Qur'an and applies text clustering, with evaluation on two datasets. Limitations of conventional clustering methods were identified, such as the dependence of k-means on its initial centroids. The work therefore incorporates a novel clustering optimisation technique, the water cycle algorithm, which, when combined with k-means, selects the optimal initial centroids. The experiments showed that the proposed extraction technique outperforms other extraction methods on an actual Qur'an dataset.
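
To illustrate why the choice of initial centroids matters for k-means, the following is a minimal sketch, not the authors' method. It does not implement the water cycle algorithm; as a plainly simpler stand-in, it scores a handful of randomly sampled candidate centroid sets by their sum of squared errors and seeds k-means with the best one. All function names, the toy 2-D points (standing in for document vectors), and the parameter values are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def kmeans(points, centroids, iters=50):
    """Plain Lloyd-style k-means starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def search_initial_centroids(points, k, candidates=20, seed=0):
    """Stand-in optimiser (NOT the water cycle algorithm): sample candidate
    centroid sets from the data and keep the one with the lowest SSE."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(candidates):
        cand = rng.sample(points, k)
        cost = sse(points, cand)
        if cost < best_cost:
            best, best_cost = cand, cost
    return best


# Toy data: two well-separated groups of 2-D points (hypothetical).
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
init = search_initial_centroids(points, k=2)
centroids, labels = kmeans(points, init)
```

A metaheuristic such as the water cycle algorithm would replace `search_initial_centroids` with a guided search over centroid positions; the k-means loop itself would stay the same.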

Suggested Citation

  • Deema Mohammed Alsekait & Jaffar Atwan & Qusay Bsoul & Sharaf Alzoubi & Hanaa Fathi & Malik Jawarneh & Abeer Saber & Diaa Salama AbdElminaam, 2024. "Arabic Clustering Through Advanced Stemming and WordNet-Based Extraction for Water Cycle Cluster," International Journal of Data Warehousing and Mining (IJDWM), IGI Global, vol. 20(1), pages 1-25, January.
  • Handle: RePEc:igg:jdwm00:v:20:y:2024:i:1:p:1-25

    Download full text from publisher

    File URL: http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/IJDWM.352601
    Download Restriction: no
