Part of Speech Tagging Using Part of Speech Sequence Graph

Authors
  • Pejman Gholami-Dastgerdi

    (University of Tabriz)

  • Mohammad-Reza Feizi-Derakhshi

    (University of Tabriz)

Abstract

Part-of-speech (POS) tagging, the task of assigning the most appropriate grammatical category to each word in a text, is one of the most fundamental needs of intelligent text processing. The main goal of this article is therefore to provide a high-accuracy tagger for the Persian language. Numerous POS tagging methods have already been presented, each applied in taggers to achieve high performance and accuracy. Statistical methods are the primary technique, and one of the most important issues in POS tagging systems is identifying unknown words. This paper proposes a graph-based method that examines every tag the Maximum Likelihood Estimation (MLE) method assigns to the words of a text, both known and unknown, and corrects those tags. To do so, a graph of the part-of-speech sequences occurring in the sentences is built from the training corpus. Sentences tagged with MLE are then corrected by traversing the graph. Different methods for tagging with graphs have been proposed, implemented, and evaluated. After their pros and cons are examined, a method is proposed that tags unknown words with an accuracy of 86.84% and known words with an accuracy of 97.54%. The overall accuracy of the method is 96.78%, an improvement over the MLE method; the graph method therefore shows acceptable performance in part-of-speech tagging and is more reliable.
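
The abstract does not spell out how the graph is built or traversed, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes that the graph has one node per tag with edges weighted by how often one tag follows another, that unknown words initially receive the most frequent tag in the corpus, and that a tag is replaced only when the transition into it was never seen in training. The toy English corpus and all function names (`train`, `mle_tag`, `correct`) are hypothetical.

```python
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # assumed sentence-boundary nodes

def train(corpus):
    """corpus: list of sentences, each a list of (word, tag) pairs."""
    word_tags = defaultdict(Counter)  # word -> tag counts (Maximum Likelihood Estimation)
    graph = defaultdict(Counter)      # tag -> counts of the tags that follow it
    tag_totals = Counter()            # overall tag frequencies
    for sentence in corpus:
        prev = START
        for word, tag in sentence:
            word_tags[word][tag] += 1
            tag_totals[tag] += 1
            graph[prev][tag] += 1
            prev = tag
        graph[prev][END] += 1
    return word_tags, graph, tag_totals

def mle_tag(words, word_tags, tag_totals):
    """Most frequent tag per known word; most frequent tag overall otherwise."""
    default = tag_totals.most_common(1)[0][0]
    return [word_tags[w].most_common(1)[0][0] if w in word_tags else default
            for w in words]

def correct(words, tags, word_tags, graph, tag_totals):
    """Walk the sentence and repair tags reached by an unseen graph edge."""
    fixed = list(tags)
    for i, word in enumerate(words):
        prev = fixed[i - 1] if i > 0 else START
        nxt = fixed[i + 1] if i + 1 < len(words) else END
        if graph[prev][fixed[i]] > 0:
            continue  # this tag transition occurs in the training corpus
        # Known words may only take tags seen with them; unknown words any tag.
        candidates = word_tags[word] if word in word_tags else tag_totals
        best = max(candidates,
                   key=lambda t: (graph[prev][t], graph[t][nxt], candidates[t]))
        if graph[prev][best] > 0:
            fixed[i] = best
    return fixed

# Hypothetical toy corpus, for illustration only.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("run", "NOUN"), ("ends", "VERB")],
    [("they", "PRON"), ("run", "VERB")],
    [("we", "PRON"), ("run", "VERB")],
]
word_tags, graph, tag_totals = train(corpus)

for words in (["the", "run", "ends"], ["the", "cat", "runs"]):
    first_pass = mle_tag(words, word_tags, tag_totals)
    print(words, first_pass, correct(words, first_pass, word_tags, graph, tag_totals))
# ['the', 'run', 'ends'] ['DET', 'VERB', 'VERB'] ['DET', 'NOUN', 'VERB']
# ['the', 'cat', 'runs'] ['DET', 'VERB', 'VERB'] ['DET', 'NOUN', 'VERB']
```

In both toy sentences the first pass produces a determiner followed by a verb, a transition that never occurs in the training data, and the graph lookup replaces it with a noun. The second sentence shows the same mechanism applied to an unknown word ("cat").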

Suggested Citation

  • Pejman Gholami-Dastgerdi & Mohammad-Reza Feizi-Derakhshi, 2023. "Part of Speech Tagging Using Part of Speech Sequence Graph," Annals of Data Science, Springer, vol. 10(5), pages 1301-1328, October.
  • Handle: RePEc:spr:aodasc:v:10:y:2023:i:5:d:10.1007_s40745-021-00359-4
    DOI: 10.1007/s40745-021-00359-4

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s40745-021-00359-4
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xueyan Xu & Fusheng Yu & Runjun Wan, 2023. "A Determining Degree-Based Method for Classification Problems with Interval-Valued Attributes," Annals of Data Science, Springer, vol. 10(2), pages 393-413, April.
    2. Anda Tang & Pei Quan & Lingfeng Niu & Yong Shi, 2022. "A Survey for Sparse Regularization Based Compression Methods," Annals of Data Science, Springer, vol. 9(4), pages 695-722, August.
    3. Xingsen Li & Junlin Zeng & Haitao Liu & Peizhuang Wang, 2022. "Intelligent Problem Solving Model and its Cross Research Directions Based on Factor Space and Extenics," Annals of Data Science, Springer, vol. 9(3), pages 469-484, June.
    4. Elton G. Aráujo & Julio C. S. Vasconcelos & Denize P. Santos & Edwin M. M. Ortega & Dalton Souza & João P. F. Zanetoni, 2023. "The Zero-Inflated Negative Binomial Semiparametric Regression Model: Application to Number of Failing Grades Data," Annals of Data Science, Springer, vol. 10(4), pages 991-1006, August.
    5. Yundong Gu & Dongfen Ma & Jiawei Cui & Zhenhua Li & Yaqi Chen, 2022. "Variable-Weighted Ensemble Forecasting of Short-Term Power Load Based on Factor Space Theory," Annals of Data Science, Springer, vol. 9(3), pages 485-501, June.
    6. Hui Sun & Fanhui Zeng & Yang Yang, 2022. "Covert Factor’s Exploiting and Factor Planning," Annals of Data Science, Springer, vol. 9(3), pages 449-467, June.
    7. Xiangfu Meng & Jing Wen & Jiasheng Shi & Zihan Li & Jinxia Zhu & Peizhuang Wang, 2022. "Factor Query Language (FQL): A Fundamental Language for the Next Generation of Intelligent Database," Annals of Data Science, Springer, vol. 9(3), pages 539-554, June.
    8. Binxiang Jiang, 2022. "Research on Factor Space Engineering and Application of Evidence Factor Mining in Evidence-based Reconstruction," Annals of Data Science, Springer, vol. 9(3), pages 503-537, June.
    9. Durgesh Samariya & Amit Thakkar, 2023. "A Comprehensive Survey of Anomaly Detection Algorithms," Annals of Data Science, Springer, vol. 10(3), pages 829-850, June.
    10. Aidin Zehtab-Salmasi & Ali-Reza Feizi-Derakhshi & Narjes Nikzad-Khasmakhi & Meysam Asgari-Chenaghlu & Saeideh Nabipour, 2023. "Multimodal Price Prediction," Annals of Data Science, Springer, vol. 10(3), pages 619-635, June.
    11. Heba Soltan Mohamed & M. Masoom Ali & Haitham M. Yousof, 2023. "The Lindley Gompertz Model for Estimating the Survival Rates: Properties and Applications in Insurance," Annals of Data Science, Springer, vol. 10(5), pages 1199-1216, October.
    12. Patrick Osatohanmwen & Eferhonore Efe-Eyefia & Francis O. Oyegue & Joseph E. Osemwenkhae & Sunday M. Ogbonmwan & Benson A. Afere, 2022. "The Exponentiated Gumbel–Weibull {Logistic} Distribution with Application to Nigeria’s COVID-19 Infections Data," Annals of Data Science, Springer, vol. 9(5), pages 909-943, October.
    13. Petar Radanliev & David Roure & Rob Walton & Max Kleek & Omar Santos & La’Treall Maddox, 2022. "What Country, University, or Research Institute, Performed the Best on Covid-19 During the First Wave of the Pandemic?," Annals of Data Science, Springer, vol. 9(5), pages 1049-1067, October.
    14. Roberto Moro-Visconti & Salvador Cruz Rambaud & Joaquín López Pascual, 2023. "Artificial intelligence-driven scalability and its impact on the sustainability and valuation of traditional firms," Palgrave Communications, Palgrave Macmillan, vol. 10(1), pages 1-14, December.
    15. Anjan Mukherjee & Abhik Mukherjee, 2022. "Interval-Valued Intuitionistic Fuzzy Soft Rough Approximation Operators and Their Applications in Decision Making Problem," Annals of Data Science, Springer, vol. 9(3), pages 611-625, June.
    16. Mansoureh Beheshti Nejad & Seyed Mahmoud Zanjirchi & Seyed Mojtaba Hosseini Bamakan & Negar Jalilian, 2024. "Blockchain Adoption in Operations Management: A Systematic Literature Review of 14 Years of Research," Annals of Data Science, Springer, vol. 11(4), pages 1361-1389, August.
    17. M. Sridharan, 2023. "Generalized Regression Neural Network Model Based Estimation of Global Solar Energy Using Meteorological Parameters," Annals of Data Science, Springer, vol. 10(4), pages 1107-1125, August.
    18. Guangrui Tang & Neng Fan, 2022. "A Survey of Solution Path Algorithms for Regression and Classification Models," Annals of Data Science, Springer, vol. 9(4), pages 749-789, August.
    19. Amaal Elsayed Mubarak & Ehab Mohamed Almetwally, 2024. "Modelling and Forecasting of Covid-19 Using Periodical ARIMA Models," Annals of Data Science, Springer, vol. 11(4), pages 1483-1502, August.
    20. Qinghua Zheng & Chutong Yang & Haijun Yang & Jianhe Zhou, 2020. "A Fast Exact Algorithm for Deployment of Sensor Nodes for Internet of Things," Information Systems Frontiers, Springer, vol. 22(4), pages 829-842, August.
