
Benchmark Pashto Handwritten Character Dataset and Pashto Object Character Recognition (OCR) Using Deep Neural Network with Rule Activation Function

Authors
  • Imran Uddin
  • Dzati A. Ramli
  • Abdullah Khan
  • Javed Iqbal Bangash
  • Nosheen Fayyaz
  • Asfandyar Khan
  • Mahwish Kundi
  • Atif Khan

Abstract

In machine learning, various techniques are used to train machines for tasks such as computer vision, data analysis, natural language processing, and speech recognition. Computer vision is one of the main areas where machine learning and deep learning techniques are applied. Optical character recognition (OCR) is the ability of a machine to recognize the characters of a language. Pashto is one of the most ancient and historical languages of the world, spoken in Afghanistan and Pakistan. OCR applications have been developed for various cursive languages such as Urdu, Chinese, and Japanese, but very little work has been done on recognition of the Pashto language. Handwritten character recognition is more difficult still, because the shape of every handwritten character is influenced by the writer’s hand motion dynamics. Research on Pashto handwritten characters lags behind that on other languages because no benchmark dataset is available for experimental purposes. This study focuses on creating such a dataset and then, for evaluation, training a machine to correctly recognize unseen Pashto handwritten characters. To achieve this objective, a dataset of 43,000 images was created. Three feed-forward neural network models trained with the backpropagation algorithm, using different Rectified Linear Unit (ReLU) layer configurations (Model 1 with 1 ReLU layer, Model 2 with 2 ReLU layers, and Model 3 with 3 ReLU layers), were trained and tested on this dataset. The simulation shows that Model 1 achieved an accuracy of up to 87.6% on unseen data, while Model 2 and Model 3 achieved accuracies of 81.60% and 3%, respectively. Similarly, the cross-entropy loss was lowest for Model 1, at 0.15 for training and 3.17 for testing, followed by Model 2 at 0.7 and 4.2, while Model 3 was last with loss values of 6.4 and 3.69. The precision, recall, and F-measure values of Model 1 were better than those of both Model 2 and Model 3. Based on these results, Model 1 (with 1 ReLU activation layer) is found to be the most accurate of the three models at recognizing Pashto handwritten characters.
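
To make the three configurations compared in the abstract concrete, the following is a minimal sketch of a feed-forward classifier whose number of ReLU hidden layers is configurable, together with the cross-entropy loss for one sample. It is not the authors' implementation: the input size (64 pixels), hidden width (32 units), class count (44), He-style weight initialisation, and all function names are assumptions chosen for illustration, and the backpropagation training step is omitted.

```python
import math
import random


def relu(values):
    """Rectified Linear Unit applied element-wise."""
    return [max(0.0, x) for x in values]


def softmax(values):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(values)
    exps = [math.exp(x - peak) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(values, weights, biases):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, values)) + b
            for row, b in zip(weights, biases)]


def init_network(n_inputs, hidden_sizes, n_classes, seed=0):
    """Build random (weights, biases) pairs; the initialisation is an assumption."""
    rng = random.Random(seed)
    sizes = [n_inputs] + list(hidden_sizes) + [n_classes]
    layers = []
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        scale = math.sqrt(2.0 / fan_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers


def forward(layers, pixels):
    """ReLU after every hidden layer, softmax on the output layer."""
    values = pixels
    for weights, biases in layers[:-1]:
        values = relu(dense(values, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(values, weights, biases))


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(max(probs[label], 1e-12))


# Hypothetical sizes, not taken from the paper.
N_INPUTS, N_HIDDEN, N_CLASSES = 64, 32, 44

models = {
    "Model 1": init_network(N_INPUTS, [N_HIDDEN], N_CLASSES),
    "Model 2": init_network(N_INPUTS, [N_HIDDEN] * 2, N_CLASSES),
    "Model 3": init_network(N_INPUTS, [N_HIDDEN] * 3, N_CLASSES),
}

# A random vector stands in for a flattened character image.
image = [random.Random(1).random() for _ in range(N_INPUTS)]
TRUE_LABEL = 3  # arbitrary class index for the demonstration

for name, layers in models.items():
    probs = forward(layers, image)
    loss = cross_entropy(probs, TRUE_LABEL)
    print(name, "ReLU layers:", len(layers) - 1, "loss:", round(loss, 3))
```

Because the weights here are random and untrained, the printed losses say nothing about the reported results; the sketch only shows how the three models differ structurally and how the cross-entropy figure quoted above is defined for a single sample.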

Suggested Citation

  • Imran Uddin & Dzati A. Ramli & Abdullah Khan & Javed Iqbal Bangash & Nosheen Fayyaz & Asfandyar Khan & Mahwish Kundi & Atif Khan, 2021. "Benchmark Pashto Handwritten Character Dataset and Pashto Object Character Recognition (OCR) Using Deep Neural Network with Rule Activation Function," Complexity, Hindawi, vol. 2021, pages 1-16, March.
  • Handle: RePEc:hin:complx:6669672
    DOI: 10.1155/2021/6669672

    Download full text from publisher

    File URL: http://downloads.hindawi.com/journals/complexity/2021/6669672.pdf
    Download Restriction: no

    File URL: http://downloads.hindawi.com/journals/complexity/2021/6669672.xml
    Download Restriction: no

