Author
Listed:
- Mahmoud SalahEldin Kasem
(Department of Information and Communication Engineering, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju-si 28644, Republic of Korea
Multimedia Department, Faculty of Computers and Information, Assiut University, Assiut 71526, Egypt)
- Mohamed Mahmoud
(Department of Information and Communication Engineering, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju-si 28644, Republic of Korea
Information Technology Department, Faculty of Computers and Information, Assiut University, Assiut 71526, Egypt)
- Bilel Yagoub
(Department of Information and Communication Engineering, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju-si 28644, Republic of Korea)
- Mostafa Farouk Senussi
(Department of Information and Communication Engineering, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju-si 28644, Republic of Korea
Information Technology Department, Faculty of Computers and Information, Assiut University, Assiut 71526, Egypt)
- Mahmoud Abdalla
(Department of Information and Communication Engineering, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju-si 28644, Republic of Korea)
- Hyun-Soo Kang
(Department of Information and Communication Engineering, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju-si 28644, Republic of Korea)
Abstract
Table detection in document images is a challenging problem due to diverse layouts, irregular structures, and embedded graphical elements. In this study, we present HTTD (Hierarchical Transformer for Table Detection), a cutting-edge model that combines a Swin-L Transformer backbone with advanced Transformer-based mechanisms to achieve superior performance. HTTD addresses three key challenges: handling diverse document layouts, including historical and modern structures; improving computational efficiency and training convergence; and demonstrating adaptability to non-standard tasks like medical imaging and receipt key detection. Evaluated on benchmark datasets, HTTD achieves state-of-the-art results, with precision rates of 96.98% on ICDAR-2019 cTDaR, 96.43% on TNCR, and 93.14% on TabRecSet. These results validate its effectiveness and efficiency, paving the way for advanced document analysis and data digitization tasks.
Suggested Citation
Mahmoud SalahEldin Kasem & Mohamed Mahmoud & Bilel Yagoub & Mostafa Farouk Senussi & Mahmoud Abdalla & Hyun-Soo Kang, 2025.
"HTTD: A Hierarchical Transformer for Accurate Table Detection in Document Images,"
Mathematics, MDPI, vol. 13(2), pages 1-20, January.
Handle:
RePEc:gam:jmathe:v:13:y:2025:i:2:p:266-:d:1567621
Download full text from publisher
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jmathe:v:13:y:2025:i:2:p:266-:d:1567621. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help adding them by using this form .
If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .
Please note that corrections may take a couple of weeks to filter through
the various RePEc services.