Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild

Xiongkuo Min, Yixuan Gao, Yuqin Cao, Guangtao Zhai, Wenjun Zhang, Huifang Sun, Chang Wen Chen
{"title":"Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild","authors":"Xiongkuo Min, Yixuan Gao, Yuqin Cao, Guangtao Zhai, Wenjun Zhang, Huifang Sun, Chang Wen Chen","doi":"arxiv-2409.05540","DOIUrl":null,"url":null,"abstract":"Traditional in the wild image quality assessment (IQA) models are generally\ntrained with the quality labels of mean opinion score (MOS), while missing the\nrich subjective quality information contained in the quality ratings, for\nexample, the standard deviation of opinion scores (SOS) or even distribution of\nopinion scores (DOS). In this paper, we propose a novel IQA method named\nRichIQA to explore the rich subjective rating information beyond MOS to predict\nimage quality in the wild. RichIQA is characterized by two key novel designs:\n(1) a three-stage image quality prediction network which exploits the powerful\nfeature representation capability of the Convolutional vision Transformer (CvT)\nand mimics the short-term and long-term memory mechanisms of human brain; (2) a\nmulti-label training strategy in which rich subjective quality information like\nMOS, SOS and DOS are concurrently used to train the quality prediction network.\nPowered by these two novel designs, RichIQA is able to predict the image\nquality in terms of a distribution, from which the mean image quality can be\nsubsequently obtained. Extensive experimental results verify that the\nthree-stage network is tailored to predict rich quality information, while the\nmulti-label training strategy can fully exploit the potentials within\nsubjective quality rating and enhance the prediction performance and\ngeneralizability of the network. RichIQA outperforms state-of-the-art\ncompetitors on multiple large-scale in the wild IQA databases with rich\nsubjective rating labels. The code of RichIQA will be made publicly available\non GitHub.","PeriodicalId":501480,"journal":{"name":"arXiv - CS - Multimedia","volume":"11 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.05540","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Traditional in the wild image quality assessment (IQA) models are generally trained with the quality labels of mean opinion score (MOS), while missing the rich subjective quality information contained in the quality ratings, for example, the standard deviation of opinion scores (SOS) or even distribution of opinion scores (DOS). In this paper, we propose a novel IQA method named RichIQA to explore the rich subjective rating information beyond MOS to predict image quality in the wild. RichIQA is characterized by two key novel designs: (1) a three-stage image quality prediction network which exploits the powerful feature representation capability of the Convolutional vision Transformer (CvT) and mimics the short-term and long-term memory mechanisms of human brain; (2) a multi-label training strategy in which rich subjective quality information like MOS, SOS and DOS are concurrently used to train the quality prediction network. Powered by these two novel designs, RichIQA is able to predict the image quality in terms of a distribution, from which the mean image quality can be subsequently obtained. Extensive experimental results verify that the three-stage network is tailored to predict rich quality information, while the multi-label training strategy can fully exploit the potentials within subjective quality rating and enhance the prediction performance and generalizability of the network. RichIQA outperforms state-of-the-art competitors on multiple large-scale in the wild IQA databases with rich subjective rating labels. The code of RichIQA will be made publicly available on GitHub.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
为野外图像质量评估探索丰富的主观质量信息
传统的野外图像质量评估(IQA)模型一般使用平均意见分(MOS)作为质量标签进行训练,而忽略了质量评分中包含的丰富的主观质量信息,例如意见分的标准偏差(SOS)甚至意见分的分布(DOS)。在本文中,我们提出了一种名为 "RichIQA "的新型 IQA 方法,以探索 MOS 以外的丰富主观评分信息,从而预测野生图像的质量。RichIQA 有两个关键的新设计:(1)三阶段图像质量预测网络,利用卷积视觉变换器(CvT)强大的特征表示能力,模拟人脑的短期和长期记忆机制;(2)多标签训练策略,同时使用丰富的主观质量信息(如 MOS、SOS 和 DOS)来训练质量预测网络。在这两种新颖设计的支持下,RichIQA 能够以分布的形式预测图像质量,并从中获得平均图像质量。广泛的实验结果验证了三级网络能够预测丰富的质量信息,而多标签训练策略能够充分挖掘主观质量评级的潜力,提高网络的预测性能和通用性。RichIQA 在多个具有丰富主观评分标签的大规模野生 IQA 数据库上的表现优于目前的竞争对手。RichIQA 的代码将在 GitHub 上公开发布。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Vista3D: Unravel the 3D Darkside of a Single Image MoRAG -- Multi-Fusion Retrieval Augmented Generation for Human Motion Efficient Low-Resolution Face Recognition via Bridge Distillation Enhancing Few-Shot Classification without Forgetting through Multi-Level Contrastive Constraints NVLM: Open Frontier-Class Multimodal LLMs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1