A ViT-Based Adaptive Recurrent Mobilenet With Attention Network for Video Compression and Bit-Rate Reduction Using Improved Heuristic Approach Under Versatile Video Coding

IF 1.8 · CAS Zone 4 (Computer Science) · JCR Q3 (Computer Science, Artificial Intelligence) · Computational Intelligence · Pub Date: 2024-12-10 · DOI: 10.1111/coin.70014
D. Padmapriya, Ameelia Roseline A
Journal: Computational Intelligence, volume 40, issue 6. URL: https://onlinelibrary.wiley.com/doi/10.1111/coin.70014
Citations: 0

Abstract

Video compression has attracted attention from both the video processing and deep learning communities. Modern learning-aided methods use a hybrid coding approach to reduce spatial and temporal redundancy in pixel space, which improves motion compensation accuracy. Video compression has improved substantially in recent years. Versatile Video Coding (VVC), also known as H.266, is a leading recent video compression standard. The VVC codec is a block-based hybrid codec, which makes it both highly capable and complex. Video coding compresses data effectively while reducing compression artifacts, enhancing the quality and functionality of AI video technologies. However, traditional models compress motion inaccurately and rely on ineffective motion compensation frameworks, which leads to compression errors and a poor rate-distortion trade-off. This work implements an automated and effective video compression pipeline under VVC using a deep learning approach. Motion is estimated with a Motion Vector (MV) encoder-decoder model that tracks movement in the video. Based on these MVs, the frame is reconstructed to compensate for the motion. The residual images are obtained using a Vision Transformer-based Adaptive Recurrent MobileNet with Attention Network (ViT-ARMAN). The parameters of ViT-ARMAN are optimized using the Opposition-based Golden Tortoise Beetle Optimizer (OGTBO). Entropy coding is used during training to determine the bit rate of the residual images. Extensive experiments were conducted to demonstrate the effectiveness of the developed deep learning-based method for video compression and bit-rate reduction under VVC.
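
The data flow described in the abstract (motion estimation, motion-compensated reconstruction, residual, bit-rate measurement) can be illustrated with a minimal sketch. This is not the authors' method. It assumes classical full-search block matching in place of the learned MV encoder-decoder, it does not model ViT-ARMAN or OGTBO at all, and it uses the empirical Shannon entropy of the residual as a rough stand-in for the bit rate an entropy coder would achieve. All function names, the block size, the search range, and the synthetic frames are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def block_sad(cur, ref, by, bx, dy, dx, bs):
    """Sum of absolute differences between a block of cur and a shifted block of ref."""
    total = 0
    for y in range(bs):
        for x in range(bs):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def estimate_motion(cur, ref, bs=4, search=2):
    """Full-search block matching (assumed stand-in for the learned MV model).

    Returns one (dy, dx) vector per bs x bs block. Frame height and width
    are assumed to be multiples of bs.
    """
    h, w = len(cur), len(cur[0])
    vectors = {}
    for by in range(0, h, bs):
        for bx in range(0, w, bs):
            best, best_cost = (0, 0), block_sad(cur, ref, by, bx, 0, 0, bs)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates that reach outside the reference frame.
                    if by + dy < 0 or by + dy + bs > h:
                        continue
                    if bx + dx < 0 or bx + dx + bs > w:
                        continue
                    cost = block_sad(cur, ref, by, bx, dy, dx, bs)
                    if cost < best_cost:
                        best, best_cost = (dy, dx), cost
            vectors[(by, bx)] = best
    return vectors


def compensate(ref, vectors, bs=4):
    """Build the predicted frame by copying the matched reference blocks."""
    h, w = len(ref), len(ref[0])
    pred = [[0] * w for _ in range(h)]
    for (by, bx), (dy, dx) in vectors.items():
        for y in range(bs):
            for x in range(bs):
                pred[by + y][bx + x] = ref[by + dy + y][bx + dx + x]
    return pred


def residual(cur, pred):
    """Per-pixel difference between the current frame and its prediction."""
    return [[c - p for c, p in zip(crow, prow)] for crow, prow in zip(cur, pred)]


def entropy_bits_per_pixel(img):
    """Empirical Shannon entropy of the pixel values, in bits per pixel."""
    values = [v for row in img for v in row]
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(values).values())


# Synthetic 8x8 frames: the current frame is the reference shifted right by one pixel.
ref = [[(7 * x + 13 * y + x * y) % 32 for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

vectors = estimate_motion(cur, ref)
pred = compensate(ref, vectors)
res = residual(cur, pred)

print("bits/pixel, plain frame difference:", entropy_bits_per_pixel(residual(cur, ref)))
print("bits/pixel, motion-compensated residual:", entropy_bits_per_pixel(res))
```

In a learned codec such as the one the abstract describes, the hand-written search and the empirical entropy would presumably be replaced by trained networks and a learned entropy model. The sketch only shows why better motion compensation leaves a residual that is cheaper to code.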

Source journal
Computational Intelligence (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 6.90
Self-citation rate: 3.60%
Articles published: 65
Review time: >12 weeks
About the journal: This leading international journal promotes and stimulates research in the field of artificial intelligence (AI). Covering a wide range of issues, from the tools and languages of AI to its philosophical implications, Computational Intelligence provides a vigorous forum for the publication of both experimental and theoretical research, as well as surveys and impact studies. The journal is designed to meet the needs of a wide range of AI workers in academic and industrial research.
Latest articles from this journal
Vision-Based UAV Detection and Tracking Using Deep Learning and Kalman Filter
An Implementation of Adaptive Multi-CNN Feature Fusion Model With Attention Mechanism With Improved Heuristic Algorithm for Kidney Stone Detection
TransPapCanCervix: An Enhanced Transfer Learning-Based Ensemble Model for Cervical Cancer Classification
Deep Reinforcement Learning Based Flow Aware-QoS Provisioning in SD-IoT for Precision Agriculture
Deep Learning and X-Ray Imaging Innovations for Pneumonia Infection Diagnosis: Introducing DeepPneuNet