Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild

Xiongkuo Min, Yixuan Gao, Yuqin Cao, Guangtao Zhai, Wenjun Zhang, Huifang Sun, Chang Wen Chen
{"title":"为野外图像质量评估探索丰富的主观质量信息","authors":"Xiongkuo Min, Yixuan Gao, Yuqin Cao, Guangtao Zhai, Wenjun Zhang, Huifang Sun, Chang Wen Chen","doi":"arxiv-2409.05540","DOIUrl":null,"url":null,"abstract":"Traditional in the wild image quality assessment (IQA) models are generally\ntrained with the quality labels of mean opinion score (MOS), while missing the\nrich subjective quality information contained in the quality ratings, for\nexample, the standard deviation of opinion scores (SOS) or even distribution of\nopinion scores (DOS). In this paper, we propose a novel IQA method named\nRichIQA to explore the rich subjective rating information beyond MOS to predict\nimage quality in the wild. RichIQA is characterized by two key novel designs:\n(1) a three-stage image quality prediction network which exploits the powerful\nfeature representation capability of the Convolutional vision Transformer (CvT)\nand mimics the short-term and long-term memory mechanisms of human brain; (2) a\nmulti-label training strategy in which rich subjective quality information like\nMOS, SOS and DOS are concurrently used to train the quality prediction network.\nPowered by these two novel designs, RichIQA is able to predict the image\nquality in terms of a distribution, from which the mean image quality can be\nsubsequently obtained. Extensive experimental results verify that the\nthree-stage network is tailored to predict rich quality information, while the\nmulti-label training strategy can fully exploit the potentials within\nsubjective quality rating and enhance the prediction performance and\ngeneralizability of the network. RichIQA outperforms state-of-the-art\ncompetitors on multiple large-scale in the wild IQA databases with rich\nsubjective rating labels. The code of RichIQA will be made publicly available\non GitHub.","PeriodicalId":501480,"journal":{"name":"arXiv - CS - Multimedia","volume":"11 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild\",\"authors\":\"Xiongkuo Min, Yixuan Gao, Yuqin Cao, Guangtao Zhai, Wenjun Zhang, Huifang Sun, Chang Wen Chen\",\"doi\":\"arxiv-2409.05540\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Traditional in the wild image quality assessment (IQA) models are generally\\ntrained with the quality labels of mean opinion score (MOS), while missing the\\nrich subjective quality information contained in the quality ratings, for\\nexample, the standard deviation of opinion scores (SOS) or even distribution of\\nopinion scores (DOS). In this paper, we propose a novel IQA method named\\nRichIQA to explore the rich subjective rating information beyond MOS to predict\\nimage quality in the wild. RichIQA is characterized by two key novel designs:\\n(1) a three-stage image quality prediction network which exploits the powerful\\nfeature representation capability of the Convolutional vision Transformer (CvT)\\nand mimics the short-term and long-term memory mechanisms of human brain; (2) a\\nmulti-label training strategy in which rich subjective quality information like\\nMOS, SOS and DOS are concurrently used to train the quality prediction network.\\nPowered by these two novel designs, RichIQA is able to predict the image\\nquality in terms of a distribution, from which the mean image quality can be\\nsubsequently obtained. 
Extensive experimental results verify that the\\nthree-stage network is tailored to predict rich quality information, while the\\nmulti-label training strategy can fully exploit the potentials within\\nsubjective quality rating and enhance the prediction performance and\\ngeneralizability of the network. RichIQA outperforms state-of-the-art\\ncompetitors on multiple large-scale in the wild IQA databases with rich\\nsubjective rating labels. The code of RichIQA will be made publicly available\\non GitHub.\",\"PeriodicalId\":501480,\"journal\":{\"name\":\"arXiv - CS - Multimedia\",\"volume\":\"11 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.05540\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.05540","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Traditional in-the-wild image quality assessment (IQA) models are generally trained with mean opinion score (MOS) quality labels alone, missing the rich subjective quality information contained in the raw quality ratings, such as the standard deviation of opinion scores (SOS) or even the full distribution of opinion scores (DOS). In this paper, we propose a novel IQA method named RichIQA that explores the rich subjective rating information beyond MOS to predict image quality in the wild. RichIQA is characterized by two key novel designs: (1) a three-stage image quality prediction network that exploits the powerful feature representation capability of the Convolutional vision Transformer (CvT) and mimics the short-term and long-term memory mechanisms of the human brain; (2) a multi-label training strategy in which rich subjective quality information such as MOS, SOS, and DOS is used concurrently to train the quality prediction network. Powered by these two novel designs, RichIQA predicts image quality as a distribution, from which the mean image quality can subsequently be obtained. Extensive experimental results verify that the three-stage network is tailored to predicting rich quality information, while the multi-label training strategy fully exploits the potential of subjective quality ratings and enhances the prediction performance and generalizability of the network. RichIQA outperforms state-of-the-art competitors on multiple large-scale in-the-wild IQA databases with rich subjective rating labels. The code of RichIQA will be made publicly available on GitHub.
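The abstract states that RichIQA predicts image quality as a distribution and then derives the mean quality from it. A minimal PyTorch sketch of that last step (not the paper's code; the bin values, tensor shapes, and the helper name moments_from_dos are assumptions for illustration) recovers MOS and SOS from a predicted DOS over discrete rating bins as its first two moments:

```python
import torch

# Assumed setup: the network outputs a probability distribution over K
# discrete rating bins, e.g. K = 5 for a 1-5 opinion scale. Illustrative
# only; the paper does not specify its binning.
def moments_from_dos(dos: torch.Tensor, scores: torch.Tensor):
    """Recover MOS and SOS from a predicted distribution of opinion scores.

    dos:    (batch, K) bin probabilities, each row summing to 1
    scores: (K,) rating value attached to each bin
    """
    mos = (dos * scores).sum(dim=-1)                             # E[s]
    var = (dos * (scores - mos.unsqueeze(-1)) ** 2).sum(dim=-1)  # Var[s]
    return mos, var.sqrt()                                       # MOS, SOS

# Stand-in for a network's softmax output on a batch of 4 images.
probs = torch.softmax(torch.randn(4, 5), dim=-1)
mos, sos = moments_from_dos(probs, torch.arange(1.0, 6.0))
```

Predicting the full distribution this way means a single output carries the mean, the spread, and the rating histogram, rather than a scalar score alone.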
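The multi-label training strategy is described only at a high level. One plausible reading, sketched below under stated assumptions (the cross-entropy/L1 combination and the weights w_dos, w_mos, w_sos are illustrative, not from the paper), is to supervise the predicted DOS directly while also penalizing the MOS and SOS derived from it, reusing moments_from_dos from the sketch above:

```python
import torch
import torch.nn.functional as F

def multi_label_loss(pred_dos, mos_gt, sos_gt, dos_gt, scores,
                     w_dos=1.0, w_mos=1.0, w_sos=1.0):
    """Hypothetical combined objective over DOS, MOS, and SOS labels."""
    # Distribution term: cross-entropy against the human rating histogram.
    dos_loss = -(dos_gt * torch.log(pred_dos + 1e-8)).sum(dim=-1).mean()
    # Moment terms: L1 between derived and ground-truth MOS / SOS.
    mos_pred, sos_pred = moments_from_dos(pred_dos, scores)
    mos_loss = F.l1_loss(mos_pred, mos_gt)
    sos_loss = F.l1_loss(sos_pred, sos_gt)
    return w_dos * dos_loss + w_mos * mos_loss + w_sos * sos_loss
```

Because all three labels describe the same underlying set of ratings, such a joint objective keeps the predicted distribution consistent with its own mean and spread, which is one way a multi-label strategy could improve prediction performance and generalizability as the abstract claims.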