Printed from https://ideas.repec.org/a/gam/jmathe/v11y2023i22p4585-d1276733.html

Gradual OCR: An Effective OCR Approach Based on Gradual Detection of Texts

Authors
  • Youngki Park

    (Department of Computer Education, Chuncheon National University of Education, Chuncheon 24328, Republic of Korea)

  • Youhyun Shin

    (Department of Computer Science and Engineering, Incheon National University, Incheon 22012, Republic of Korea)

Abstract

In this paper, we present a novel approach to optical character recognition that incorporates various supplementary techniques, including the gradual detection of texts and gradual filtering of inaccurately recognized texts. To minimize false negatives, we attempt to detect all text by incrementally lowering the relevant thresholds. To mitigate false positives, we implement a novel filtering method that dynamically adjusts based on the confidence levels of recognized texts and their corresponding detection thresholds. Additionally, we use straightforward yet effective strategies to enhance the optical character recognition accuracy and speed, such as upscaling, link refinement, perspective transformation, the merging of cropped images, and simple autoregression. Given our focus on Korean chart data, we compile a mix of real-world and artificial Korean chart datasets for experimentation. Our experimental results show that our approach outperforms Tesseract by approximately 7 to 15 times and EasyOCR by 3 to 5 times in accuracy, as measured using a Jaccard similarity-based error rate on our datasets.
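
The two ideas in the abstract that lend themselves to code are the threshold-lowering detection loop with confidence-based filtering, and the Jaccard similarity-based error rate. The following is only a minimal sketch under stated assumptions, not the authors' implementation: `detect` and `recognize` are hypothetical callables standing in for a text detector and recognizer, the threshold schedule `(0.7, 0.5, 0.3)` is invented for illustration, and the rule that raises the required recognition confidence as the detection threshold drops is one plausible reading of "dynamically adjusts", not a rule taken from the paper. The error rate is assumed here to be one minus the Jaccard similarity of the predicted and expected text sets.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical types: a box is (x, y, width, height).
Box = Tuple[int, int, int, int]
Detector = Callable[[float], Iterable[Box]]       # detection threshold -> boxes
Recognizer = Callable[[Box], Tuple[str, float]]   # box -> (text, confidence)


def gradual_ocr(
    detect: Detector,
    recognize: Recognizer,
    thresholds: Sequence[float] = (0.7, 0.5, 0.3),  # assumed schedule
    base_conf: float = 0.5,                          # assumed baseline
    penalty: float = 0.5,                            # assumed scaling factor
) -> List[str]:
    """Detect text at progressively lower thresholds, filtering as we go.

    Lower detection thresholds surface more candidate regions (fewer false
    negatives). To offset the extra false positives, a region first found at
    a lower threshold must reach a higher recognition confidence to be kept.
    """
    seen = set()
    accepted: List[str] = []
    start = thresholds[0]
    for t in thresholds:
        for box in detect(t):
            if box in seen:
                continue  # already handled at a stricter threshold
            seen.add(box)
            text, conf = recognize(box)
            # Assumed filtering rule: stricter confidence for looser detection.
            required = base_conf + (start - t) * penalty
            if conf >= required:
                accepted.append(text)
    return accepted


def jaccard_error(predicted: Iterable[str], expected: Iterable[str]) -> float:
    """Return 1 - |P ∩ E| / |P ∪ E| over the sets of text strings."""
    p, e = set(predicted), set(expected)
    if not p and not e:
        return 0.0  # two empty results are treated as identical
    return 1.0 - len(p & e) / len(p | e)
```

In practice, `detect` would wrap a real detector's threshold parameter and `recognize` would wrap a recognizer applied to the cropped region; both are left abstract here because the abstract does not name the underlying models.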

Suggested Citation

  • Youngki Park & Youhyun Shin, 2023. "Gradual OCR: An Effective OCR Approach Based on Gradual Detection of Texts," Mathematics, MDPI, vol. 11(22), pages 1-20, November.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:22:p:4585-:d:1276733

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/22/4585/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/22/4585/
    Download Restriction: no