
OCA: Opinion corpus for Arabic

Authors

  • Mohammed Rushdi‐Saleh
  • M. Teresa Martín‐Valdivia
  • L. Alfonso Ureña‐López
  • José M. Perea‐Ortega

Abstract

Sentiment analysis is a challenging new task related to text mining and natural language processing. Although there are, at present, several studies related to this theme, most of these focus mainly on English texts. The resources available for opinion mining (OM) in other languages are still limited. In this article, we present a new Arabic corpus for the OM task that has been made available to the scientific community for research purposes. The corpus contains 500 movie reviews collected from different web pages and blogs in Arabic, 250 of them considered as positive reviews, and the other 250 as negative opinions. Furthermore, different experiments have been carried out on this corpus, using machine learning algorithms such as support vector machines and Naïve Bayes. The results obtained are very promising and we are encouraged to continue this line of research.

Suggested Citation

  • Mohammed Rushdi‐Saleh & M. Teresa Martín‐Valdivia & L. Alfonso Ureña‐López & José M. Perea‐Ortega, 2011. "OCA: Opinion corpus for Arabic," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 62(10), pages 2045-2054, October.
  • Handle: RePEc:bla:jamist:v:62:y:2011:i:10:p:2045-2054
    DOI: 10.1002/asi.21598

    Download full text from publisher

    File URL: https://doi.org/10.1002/asi.21598
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gang Wang & Daqing Zheng & Shanlin Yang & Jian Ma, 2018. "FCE-SVM: a new cluster based ensemble method for opinion mining from social media," Information Systems and e-Business Management, Springer, vol. 16(4), pages 721-742, November.
    2. Shuyue Huang & Lena Jingen Liang & Hwansuk Chris Choi, 2022. "How We Failed in Context: A Text-Mining Approach to Understanding Hotel Service Failures," Sustainability, MDPI, vol. 14(5), pages 1-18, February.
    3. Xiao, Yan & Li, Congdong & Thürer, Matthias & Liu, Yide & Qu, Ting, 2022. "User preference mining based on fine-grained sentiment analysis," Journal of Retailing and Consumer Services, Elsevier, vol. 68(C).
    4. Damiano De Marchi & Rudy Becarelli & Leonardo Di Sarli, 2022. "Tourism Sustainability Index: Measuring Tourism Sustainability Based on the ETIS Toolkit, by Exploring Tourist Satisfaction via Sentiment Analysis," Sustainability, MDPI, vol. 14(13), pages 1-18, July.
    5. Hui Yuan & Wei Xu & Qian Li & Raymond Lau, 2018. "Topic sentiment mining for sales performance prediction in e-commerce," Annals of Operations Research, Springer, vol. 270(1), pages 553-576, November.
    6. Yong Shi & Luyao Zhu & Wei Li & Kun Guo & Yuanchun Zheng, 2019. "Survey on Classic and Latest Textual Sentiment Analysis Articles and Techniques," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 18(04), pages 1243-1287, July.
    7. Mohammed N. A. Ali & Guanzheng Tan & Aamir Hussain, 2018. "Bidirectional Recurrent Neural Network Approach for Arabic Named Entity Recognition," Future Internet, MDPI, vol. 10(12), pages 1-12, December.
    8. Yucel, Ahmet & Dag, Ali & Oztekin, Asil & Carpenter, Mark, 2022. "A novel text analytic methodology for classification of product and service reviews," Journal of Business Research, Elsevier, vol. 151(C), pages 287-297.
    9. Shivendra Kumar & C. Ravindranath Chowdary, 2022. "Semantic model to extract tips from hotel reviews," Electronic Commerce Research, Springer, vol. 22(4), pages 1059-1077, December.
    10. F. Schweitzer & D. Garcia, 2010. "An agent-based model of collective emotions in online communities," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 77(4), pages 533-545, October.
    11. Ghasem Javadi & Mohammad Taleai, 2020. "Integration of User Generated Geo-contents and Official Data to Assess Quality of Life in Intra-national Level," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 152(1), pages 205-235, November.
    12. Tidor-Vlad Pricope, 2021. "Deep Reinforcement Learning in Quantitative Algorithmic Trading: A Review," Papers 2106.00123, arXiv.org.
    13. Khreisat, Laila, 2009. "A machine learning approach for Arabic text classification using N-gram frequency statistics," Journal of Informetrics, Elsevier, vol. 3(1), pages 72-77.
    14. Yaxin Bi, 2022. "Sentiment classification in social media data by combining triplet belief functions," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 73(7), pages 968-991, July.
    15. Lima, Ana Carolina E.S. & de Castro, Leandro Nunes & Corchado, Juan M., 2015. "A polarity analysis framework for Twitter messages," Applied Mathematics and Computation, Elsevier, vol. 270(C), pages 756-767.
    16. Youngseok Choi & Habin Lee, 2017. "Data properties and the performance of sentiment classification for electronic commerce applications," Information Systems Frontiers, Springer, vol. 19(5), pages 993-1012, October.
    17. Nan Jing & Tao Jiang & Juan Du & Vijayan Sugumaran, 2018. "Personalized recommendation based on customer preference mining and sentiment assessment from a Chinese e-commerce website," Electronic Commerce Research, Springer, vol. 18(1), pages 159-179, March.
    18. Xiangfeng Luo & Yawen Yi, 2019. "Topic-Specific Emotion Mining Model for Online Comments," Future Internet, MDPI, vol. 11(3), pages 1-18, March.
    19. Chen, Long-Sheng & Liu, Cheng-Hsiang & Chiu, Hui-Ju, 2011. "A neural network based approach for sentiment classification in the blogosphere," Journal of Informetrics, Elsevier, vol. 5(2), pages 313-322.
    20. Emad Al‐Shawakfa & Amer Al‐Badarneh & Safwan Shatnawi & Khaleel Al‐Rabab'ah & Basel Bani‐Ismail, 2010. "A comparison study of some Arabic root finding algorithms," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 61(5), pages 1015-1024, May.
