Luminance decomposition and Transformer based no-reference tone-mapped image quality assessment

IF 3.7 | Zone 2, Engineering & Technology | Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Displays | Pub Date: 2024-11-14 | DOI: 10.1016/j.displa.2024.102881
Zikang Chen, Zhouyan He, Ting Luo, Chongchong Jin, Yang Song
Displays, Volume 85, Article 102881 (journal article). https://www.sciencedirect.com/science/article/pii/S0141938224002452
Citations: 0

Abstract

Tone-Mapping Operators (TMOs) play a crucial role in converting High Dynamic Range (HDR) images into Tone-Mapped Images (TMIs) with standard dynamic range for display on standard monitors. However, TMIs generated by different TMOs may exhibit diverse visual artifacts, which makes TMI Quality Assessment (TMIQA) methods important both for predicting perceptual quality and for guiding the development of TMOs. Inspired by luminance decomposition and the Transformer architecture, this paper proposes a new deep-learning-based no-reference TMIQA method, named LDT-TMIQA. The appearance of a TMI depends on the TMO that produced it: the result may be over-exposed or under-exposed, which leads to structural distortion and changes in texture detail. Therefore, we first decompose the luminance channel of a TMI into a base layer and a detail layer, which capture structure information and texture information, respectively. Both layers, together with the TMI itself, are then fed to the Feature Extraction Module (FEM) to make prior information on luminance, structure, and texture more readily available. Within the FEM, the Cross Attention Prior Module (CAPM) models the interdependencies among the base layer, the detail layer, and the TMI, while the Iterative Attention Prior Module (IAPM) extracts multi-scale and multi-level visual features. Finally, a Feature Selection Fusion Module (FSFM) is proposed to obtain the final effective features for predicting TMI quality scores by reducing the weight of unnecessary features and fusing features from different levels with equal importance. Extensive experiments on the publicly available TMI benchmark database indicate that the proposed LDT-TMIQA reaches state-of-the-art performance.
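The base/detail split described in the abstract can be illustrated with a minimal sketch. The abstract does not say which smoothing filter the authors use, so the box filter, the Rec. 709 luminance weights, the `radius` parameter, and all function names below are assumptions chosen for illustration only, not the LDT-TMIQA implementation. The sketch only shows the general idea: a smoothed luminance channel serves as the base layer (coarse structure), and the residual serves as the detail layer (fine texture).

```python
import numpy as np


def rgb_to_luminance(rgb):
    """Weighted sum of R, G, B using Rec. 709 weights (an assumed choice)."""
    weights = np.array([0.2126, 0.7152, 0.0722])
    return rgb @ weights


def box_blur(channel, radius):
    """Mean filter with edge padding; a stand-in for the paper's unspecified filter."""
    size = 2 * radius + 1
    padded = np.pad(channel, radius, mode="edge")
    height, width = channel.shape
    out = np.zeros((height, width), dtype=np.float64)
    # Accumulate every shifted copy of the image inside the window.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + height, dx:dx + width]
    return out / (size * size)


def decompose_luminance(rgb, radius=2):
    """Return (luminance, base, detail) with base + detail == luminance."""
    luminance = rgb_to_luminance(rgb)
    base = box_blur(luminance, radius)   # smooth part: coarse structure
    detail = luminance - base            # residual: fine texture
    return luminance, base, detail


# Hypothetical input: a random H x W x 3 image with values in [0, 1].
rng = np.random.default_rng(0)
tmi = rng.random((32, 48, 3))
luminance, base, detail = decompose_luminance(tmi)
```

By construction the two layers sum back to the luminance channel, so no information is lost by the split. An edge-preserving filter would be a natural substitute for the box filter here, but that choice is likewise not stated in the abstract.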
Source journal: Displays
Category: Engineering & Technology - Engineering, Electrical & Electronic
CiteScore: 4.60
Self-citation rate: 25.60%
Articles published: 138
Review time: 92 days
About the journal: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including the display-human interface. Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the displays community. Original research papers solving ergonomics issues at the display-human interface advance the effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, are also occasionally featured.