Multi-Type Self-Attention-Based Convolutional-Neural-Network Post-Filtering for AV1 Codec

Mathematics (IF 2.3; Zone 3 Mathematics; JCR Q1 MATHEMATICS) Pub Date: 2024-09-15 DOI: 10.3390/math12182874

Woowoen Gwun, Kiho Choi, Gwang Hoon Park


Citations: 0

Abstract

Over the past few years, there has been substantial research interest in applying Convolutional Neural Networks (CNNs) to post-filtering in video coding. Most of this work uses CNNs with various kernel sizes and concentrates on High-Efficiency Video Coding/H.265 (HEVC) and Versatile Video Coding/H.266 (VVC). This narrow focus has limited the exploration and application of these techniques to other video coding standards such as AV1. Developed by the Alliance for Open Media, AV1 offers excellent compression efficiency, which reduces bandwidth usage and improves video quality, making it highly attractive for modern streaming and media applications. This paper introduces a novel approach that extends beyond traditional CNN methods by integrating three different self-attention layers into the CNN framework. Applied to the AV1 codec, the proposed method significantly improves video quality. This result demonstrates the potential of self-attention mechanisms to revolutionize post-filtering in video coding beyond the limitations of convolution-based methods. Experimental results show that the proposed network achieves average BD-rate reductions of 10.40% for the luma component and 19.22% and 16.52% for the two chroma components compared to the AV1 anchor. Visual quality assessments further validate the approach, showing substantial artifact reduction and detail enhancement in videos.
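The BD-rate figures quoted in the abstract summarize the average bitrate difference between two rate-distortion curves at equal quality. The abstract does not say how the authors computed them, so the following is only a minimal sketch of the commonly used Bjontegaard formulation (cubic fit of log-rate against PSNR, integrated over the shared quality range). The function name `bd_rate` and the sample rate/PSNR points are illustrative assumptions, not values from the paper. In this sign convention, a "BD-rate reduction of 10.40%" corresponds to a returned value of about -10.40.

```python
import numpy as np


def bd_rate(anchor_rate, anchor_psnr, test_rate, test_psnr):
    """Bjontegaard delta rate in percent; negative means the test codec saves bits.

    Hypothetical helper: a sketch of the usual cubic-fit method, not the
    authors' evaluation script.
    """
    log_a = np.log10(np.asarray(anchor_rate, dtype=float))
    log_t = np.log10(np.asarray(test_rate, dtype=float))
    psnr_a = np.asarray(anchor_psnr, dtype=float)
    psnr_t = np.asarray(test_psnr, dtype=float)

    # Fit log10(bitrate) as a cubic polynomial of quality for each curve.
    poly_a = np.polyfit(psnr_a, log_a, 3)
    poly_t = np.polyfit(psnr_t, log_t, 3)

    # Only the quality interval covered by both curves is comparable.
    lo = max(psnr_a.min(), psnr_t.min())
    hi = min(psnr_a.max(), psnr_t.max())

    # Definite integral of each fitted polynomial over [lo, hi].
    int_a = np.polyval(np.polyint(poly_a), hi) - np.polyval(np.polyint(poly_a), lo)
    int_t = np.polyval(np.polyint(poly_t), hi) - np.polyval(np.polyint(poly_t), lo)

    # Average log-rate gap, converted back to a percentage rate change.
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0


# Illustrative (made-up) operating points: bitrate in kbps, PSNR in dB.
anchor_rate = [1000.0, 2000.0, 4000.0, 8000.0]
psnr = [32.0, 35.0, 38.0, 41.0]
# Assume a post-filtered codec that needs 10% fewer bits at every quality level.
test_rate = [0.9 * r for r in anchor_rate]

print(f"BD-rate: {bd_rate(anchor_rate, psnr, test_rate, psnr):.2f}%")
```

Because the illustrative test curve is the anchor curve scaled by 0.9 in rate, the averaged log-rate gap is exactly log10(0.9), so the function returns approximately -10, that is, a 10% bitrate saving at equal PSNR. Real evaluations typically compute this separately for the luma and each chroma component, which is why the abstract reports three numbers.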




Source journal

Mathematics (Mathematics: General Mathematics)

CiteScore: 4.00

Self-citation rate: 16.70%

Articles published: 4032

Review time: 21.9 days

Journal description: Mathematics (ISSN 2227-7390) is an international, open access journal that provides an advanced forum for studies related to the mathematical sciences. It is devoted exclusively to publishing high-quality reviews, regular research papers, and short communications in all areas of pure and applied mathematics. Mathematics also publishes timely and thorough survey articles on current trends, new theoretical techniques, novel ideas, and new mathematical tools in different branches of mathematics.