Author
Listed:
- Zheng Zhang
(School of Information, North China University of Technology, Beijing 100144, China)
- Chunle Miao
(School of Information, North China University of Technology, Beijing 100144, China)
- Changan Liu
(School of Information, North China University of Technology, Beijing 100144, China)
- Qing Tian
(School of Information, North China University of Technology, Beijing 100144, China)
- Yongsheng Zhou
(College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China)
Abstract
Road segmentation is one of the essential tasks in remote sensing. Large-scale high-resolution remote sensing images inherently contain far more pixels than natural images, while existing Transformer-based models incur a computational cost that is quadratic in the sequence length, leading to long training and inference times. Inspired by Transformer models for long text, this paper proposes a novel hybrid attention mechanism to improve the inference speed of the model. By computing only several diagonals and random blocks of the attention matrix, hybrid attention achieves time complexity linear in the length of the token sequence. By superposing adjacent and random attention, hybrid attention introduces an inductive bias similar to that of convolutional neural networks (CNNs) while retaining the ability to capture long-distance dependencies. In addition, dense road segmentation results on remote sensing images still suffer from insufficient continuity. Multi-scale feature representation is an effective remedy for this in CNN-based networks; inspired by this, we propose a multi-scale patch embedding module, which partitions the image into patches at different scales to obtain coarse-to-fine feature representations. Experiments on the Massachusetts dataset show that the proposed HA-RoadFormer effectively preserves the integrity of road segmentation results, achieving a road-segmentation Intersection over Union (IoU) of 67.36%, higher than other state-of-the-art (SOTA) methods. At the same time, its inference speed is greatly improved compared with other Transformer-based models.
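As an illustration only (the mechanism itself is defined in the paper, not reproduced here), the following minimal PyTorch sketch shows the sparsity pattern the abstract describes: several diagonals of the attention matrix give each token local, CNN-like attention, and a few random key blocks per query block restore long-range connections. All names and parameters below (hybrid_attention_mask, masked_attention, window, block_size, n_random_blocks) are hypothetical, and the dense boolean mask is used only for clarity; a genuinely linear-time implementation would gather just the permitted blocks rather than materialize the full attention matrix.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def hybrid_attention_mask(seq_len, window=3, block_size=16, n_random_blocks=2):
        # Hypothetical sketch: boolean mask, True where attention is allowed.
        mask = torch.zeros(seq_len, seq_len, dtype=torch.bool)
        # Several diagonals: each token attends to neighbours within `window`,
        # giving the local, CNN-like inductive bias.
        idx = torch.arange(seq_len)
        mask |= (idx[:, None] - idx[None, :]).abs() <= window
        # Random blocks: each query block attends to a few random key blocks,
        # retaining long-distance dependence.
        n_blocks = (seq_len + block_size - 1) // block_size
        for qb in range(n_blocks):
            q0, q1 = qb * block_size, min((qb + 1) * block_size, seq_len)
            for kb in torch.randint(0, n_blocks, (n_random_blocks,)).tolist():
                k0, k1 = kb * block_size, min((kb + 1) * block_size, seq_len)
                mask[q0:q1, k0:k1] = True
        return mask

    def masked_attention(q, k, v, mask):
        # q, k, v: (batch, heads, seq_len, head_dim); mask broadcasts over batch/heads.
        scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
        scores = scores.masked_fill(~mask, float("-inf"))
        return F.softmax(scores, dim=-1) @ v

Likewise, the multi-scale patch embedding could be sketched, again hypothetically, as several strided convolutions whose token sequences are concatenated coarse-to-fine; the paper's actual module may differ in details such as normalization and feature fusion.

    class MultiScalePatchEmbed(nn.Module):
        # Hypothetical sketch: embed the image with several patch sizes
        # (strided convolutions) and concatenate the tokens, coarse to fine.
        def __init__(self, in_ch=3, dim=64, patch_sizes=(16, 8, 4)):
            super().__init__()
            self.projs = nn.ModuleList(
                nn.Conv2d(in_ch, dim, kernel_size=p, stride=p) for p in patch_sizes
            )

        def forward(self, x):
            # Each projection yields (B, N_p, dim) tokens at its own scale.
            tokens = [p(x).flatten(2).transpose(1, 2) for p in self.projs]
            return torch.cat(tokens, dim=1)

    # Usage: embed a small image at three scales, then apply hybrid attention.
    x = torch.randn(1, 3, 64, 64)
    tokens = MultiScalePatchEmbed()(x)           # (1, 16 + 64 + 256, 64)
    q = k = v = tokens.unsqueeze(1)              # single head: (1, 1, N, 64)
    out = masked_attention(q, k, v, hybrid_attention_mask(tokens.shape[1]))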
Suggested Citation
Zheng Zhang & Chunle Miao & Changan Liu & Qing Tian & Yongsheng Zhou, 2022.
"HA-RoadFormer: Hybrid Attention Transformer with Multi-Branch for Large-Scale High-Resolution Dense Road Segmentation,"
Mathematics, MDPI, vol. 10(11), pages 1-15, June.
Handle:
RePEc:gam:jmathe:v:10:y:2022:i:11:p:1915-:d:830854