
A comparison study of some Arabic root finding algorithms

Authors

  • Emad Al‐Shawakfa
  • Amer Al‐Badarneh
  • Safwan Shatnawi
  • Khaleel Al‐Rabab'ah
  • Basel Bani‐Ismail

Abstract

Arabic has a complex structure, which makes it difficult to apply natural language processing (NLP). Much research on Arabic NLP (ANLP) exists; however, it is not as mature as research on other languages. Finding Arabic roots is an important step toward conducting effective research on most ANLP applications. The authors studied and compared six root‐finding algorithms with success rates of over 90%. These algorithms were not originally tested on the same corpus or with the same benchmarking measures. The authors unified the testing process by implementing the algorithms according to their own descriptions and by building a corpus from 3823 triliteral roots, 73 triliteral patterns, and 18 affixes, producing around 27.6 million words. They tested the algorithms on the generated corpus and obtained interesting results; they offer to share the corpus freely for benchmarking and ANLP research.
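
The corpus construction described in the abstract (roots combined with patterns and affixes) can be illustrated with a minimal Python sketch. This is not the authors' generator: the function names `apply_pattern`, `generate_corpus`, and `accuracy`, the tiny demo lists, and the exact-match scoring are all assumptions made for illustration, and the sketch does not attempt to reproduce the 27.6 million word count. It assumes patterns are written with the traditional template letters ف, ع, ل standing in for the three root consonants.

```python
from itertools import product

# Template letters of the traditional placeholder root (assumed notation).
PLACEHOLDERS = "فعل"


def apply_pattern(root: str, pattern: str) -> str:
    """Substitute the three root consonants for the template letters in a pattern."""
    if len(root) != 3:
        raise ValueError("expected a triliteral root")
    # Single-pass character mapping, so a root letter is never substituted twice.
    return pattern.translate(str.maketrans(PLACEHOLDERS, root))


def generate_corpus(roots, patterns, prefixes, suffixes):
    """Yield (word, root) pairs for every root/pattern/prefix/suffix combination."""
    for root, pattern, prefix, suffix in product(roots, patterns, prefixes, suffixes):
        yield prefix + apply_pattern(root, pattern) + suffix, root


def accuracy(root_finder, corpus):
    """Fraction of words whose extracted root exactly matches the generating root."""
    pairs = list(corpus)
    hits = sum(1 for word, root in pairs if root_finder(word) == root)
    return hits / len(pairs)


# Hypothetical demo data, far smaller than the corpus described in the abstract.
demo_roots = ["كتب", "درس"]
demo_patterns = ["فعل", "فاعل", "مفعول"]
demo_prefixes = ["", "ال"]
demo_suffixes = ["", "ون"]

demo_corpus = list(
    generate_corpus(demo_roots, demo_patterns, demo_prefixes, demo_suffixes)
)
```

Because every generated word is stored with the root that produced it, any root-finding function that maps a word to a three-letter string can be scored against the same data with `accuracy`, which is the kind of unified comparison the abstract describes.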

Suggested Citation

  • Emad Al‐Shawakfa & Amer Al‐Badarneh & Safwan Shatnawi & Khaleel Al‐Rabab'ah & Basel Bani‐Ismail, 2010. "A comparison study of some Arabic root finding algorithms," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 61(5), pages 1015-1024, May.
  • Handle: RePEc:bla:jamist:v:61:y:2010:i:5:p:1015-1024
    DOI: 10.1002/asi.21301

    Download full text from publisher

    File URL: https://doi.org/10.1002/asi.21301
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Khreisat, Laila, 2009. "A machine learning approach for Arabic text classification using N-gram frequency statistics," Journal of Informetrics, Elsevier, vol. 3(1), pages 72-77.
    2. Mohammed Rushdi‐Saleh & M. Teresa Martín‐Valdivia & L. Alfonso Ureña‐López & José M. Perea‐Ortega, 2011. "OCA: Opinion corpus for Arabic," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 62(10), pages 2045-2054, October.
