Authors
- Woowoen Gwun
(Department of Computer Science and Engineering, College of Software, Kyung Hee University, Yongin 17104, Gyeonggi-do, Republic of Korea)
- Kiho Choi
(Department of Electronics and Information Convergence Engineering, Kyung Hee University, Yongin 17104, Gyeonggi-do, Republic of Korea; Department of Electronic Engineering, Kyung Hee University, Yongin 17104, Gyeonggi-do, Republic of Korea)
- Gwang Hoon Park
(Department of Computer Science and Engineering, College of Software, Kyung Hee University, Yongin 17104, Gyeonggi-do, Republic of Korea)
Abstract
Over the past few years, there has been substantial research interest in applying convolutional neural networks (CNNs) to post-filtering in video coding. Most of this work uses CNNs with various kernel sizes and concentrates on High-Efficiency Video Coding/H.265 (HEVC) and Versatile Video Coding/H.266 (VVC). This narrow focus has limited the exploration of these techniques for other video coding standards such as AV1, developed by the Alliance for Open Media. AV1 offers excellent compression efficiency, reducing bandwidth usage and improving video quality, which makes it highly attractive for modern streaming and media applications. This paper introduces a novel approach that extends beyond traditional CNN methods by integrating three different self-attention layers into the CNN framework. Applied to the AV1 codec, the proposed method significantly improves video quality, demonstrating the potential of self-attention mechanisms to advance post-filtering in video coding beyond the limitations of convolution-only methods. The experimental results show that the proposed network achieves an average BD-rate reduction of 10.40% for the luma component and 19.22% and 16.52% for the chroma components compared to the AV1 anchor. Visual quality assessments further validate the effectiveness of the approach, showing substantial artifact reduction and detail enhancement in videos.
Suggested Citation
Woowoen Gwun & Kiho Choi & Gwang Hoon Park, 2024.
"Multi-Type Self-Attention-Based Convolutional-Neural-Network Post-Filtering for AV1 Codec,"
Mathematics, MDPI, vol. 12(18), pages 1-24, September.
Handle:
RePEc:gam:jmathe:v:12:y:2024:i:18:p:2874-:d:1478780