Luminance decomposition and reconstruction for high dynamic range Video Quality Assessment

Journal: Pattern Recognition (IF 7.5; Zone 1, Computer Science; JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) Pub Date: 2024-09-12 DOI: 10.1016/j.patcog.2024.111011
Source: https://www.sciencedirect.com/science/article/pii/S0031320324007623
Citations: 0

Abstract

High dynamic range (HDR) video represents a wider range of brightness, detail and colour than standard dynamic range (SDR) video. However, SDR-based Video Quality Assessment (VQA) models struggle to capture HDR distortions. In addition, some existing methods designed for HDR video emphasise distortion in local areas of the video frame and ignore distortion of the frame as a whole. We therefore propose HDR-DRVQA, a no-reference VQA model for HDR video based on luminance decomposition and recombination. Specifically, HDR-DRVQA uses a luminance decomposition strategy to split video frames into different regions, so that perceptual features can be extracted explicitly from different regions of the high dynamic range. We further propose a residual aggregation module that recombines the multi-region features to extract static spatial distortion representations and dynamic motion perception (captured by feature differences). To exploit the Transformer network's strength in long-range dependency modelling, this information is fed into a Transformer network, which learns motion perception interactively and adaptively constructs a stream of spatial distortion information from shallow to deep layers during temporal aggregation. We validate that our model significantly outperforms SDR VQA and existing HDR VQA methods on the publicly available HDR databases.
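
The two ideas the abstract names, splitting a frame by luminance and capturing motion through feature differences, can be illustrated with a minimal sketch. This is not the authors' implementation: the three-region split, the thresholds `low` and `high`, the BT.2020 luma weighting, and the toy mean/standard-deviation descriptor are all assumptions made for illustration, and the learned feature extractor, residual aggregation module and Transformer are not modelled here.

```python
import numpy as np

# BT.2020 luma weights for R, G, B (assumed colour space for this sketch).
BT2020_WEIGHTS = np.array([0.2627, 0.6780, 0.0593])
REGION_NAMES = ("dark", "mid", "bright")


def luminance(frame_rgb):
    """Map an (H, W, 3) RGB frame with values in [0, 1] to an (H, W) luminance map."""
    return frame_rgb @ BT2020_WEIGHTS


def decompose_by_luminance(frame_rgb, low=0.25, high=0.75):
    """Split a frame into dark, mid and bright regions.

    The thresholds `low` and `high` are hypothetical, not taken from the paper.
    Returns the masked copies of the frame and the boolean masks.
    """
    y = luminance(frame_rgb)
    masks = {
        "dark": y < low,
        "mid": (y >= low) & (y < high),
        "bright": y >= high,
    }
    regions = {name: frame_rgb * mask[..., None] for name, mask in masks.items()}
    return regions, masks


def region_features(frame_rgb, masks):
    """Toy per-region descriptor: mean and standard deviation of luminance."""
    y = luminance(frame_rgb)
    feats = []
    for name in REGION_NAMES:
        values = y[masks[name]]
        if values.size == 0:
            feats.extend([0.0, 0.0])  # an empty region contributes zeros
        else:
            feats.extend([values.mean(), values.std()])
    return np.array(feats)


def video_features(frames):
    """Return per-frame (static) features and frame-to-frame differences (motion)."""
    static = np.stack(
        [region_features(f, decompose_by_luminance(f)[1]) for f in frames]
    )
    motion = np.diff(static, axis=0)  # feature difference between adjacent frames
    return static, motion


# Usage on random data standing in for a 4-frame, 8x8 video clip.
frames = np.random.default_rng(0).random((4, 8, 8, 3))
static, motion = video_features(frames)
print(static.shape, motion.shape)  # (4, 6) (3, 6)
```

In this sketch the masks partition every pixel into exactly one region, so summing the three masked frames recovers the original frame. In the model described above, the static and motion streams would then be passed to a temporal aggregator, which is where the Transformer comes in.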

Source journal
Pattern Recognition (Engineering and Technology: Electrical and Electronic Engineering)
CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Review time: 5.6 months
About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.
Latest articles from the journal
A novel domain independent scene text localizer
Video Anomaly Detection via self-supervised and spatio-temporal proxy tasks learning
FICE: Text-conditioned fashion-image editing with guided GAN inversion
Collaborative graph neural networks for augmented graphs: A local-to-global perspective
Asymmetric patch sampling for contrastive learning