Bo Hu;Tuoxun Zhao;Jia Zheng;Yan Zhang;Leida Li;Weisheng Li;Xinbo Gao
{"title":"Blind Image Quality Assessment With Coarse-Grained Perception Construction and Fine-Grained Interaction Learning","authors":"Bo Hu;Tuoxun Zhao;Jia Zheng;Yan Zhang;Leida Li;Weisheng Li;Xinbo Gao","doi":"10.1109/TBC.2023.3342696","DOIUrl":null,"url":null,"abstract":"Image Quality Assessment (IQA) plays an important role in the field of computer vision. However, most of the existing metrics for Blind IQA (BIQA) adopt an end-to-end way and do not adequately simulate the process of human subjective evaluation, which limits further improvements in model performance. In the process of perception, people first give a preliminary impression of the distortion type and relative quality of the images, and then give a specific quality score under the influence of the interaction of the two. Although some methods have attempted to explore the effects of distortion type and relative quality, the relationship between them has been neglected. In this paper, we propose a BIQA with coarse-grained perception construction and fine-grained interaction learning, called PINet for short. The fundamental idea is to learn from the two-stage human perceptual process. Specifically, in the pre-training stage, the backbone initially processes a pair of synthetic distorted images with pseudo-subjective scores, and the multi-scale feature extraction module integrates the deep information and delivers it to the coarse-grained perception construction module, which performs the distortion discrimination and the quality ranking. In the fine-tuning stage, we propose a fine-grained interactive learning module to interact with the two pieces of information to further improve the performance of the proposed PINet. The experimental results prove that the proposed PINet not only achieves competing performances on synthetic distortion datasets but also performs better on authentic distortion datasets.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"533-544"},"PeriodicalIF":3.2000,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Broadcasting","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10375566/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0
Abstract
Image Quality Assessment (IQA) plays an important role in the field of computer vision. However, most of the existing metrics for Blind IQA (BIQA) adopt an end-to-end way and do not adequately simulate the process of human subjective evaluation, which limits further improvements in model performance. In the process of perception, people first give a preliminary impression of the distortion type and relative quality of the images, and then give a specific quality score under the influence of the interaction of the two. Although some methods have attempted to explore the effects of distortion type and relative quality, the relationship between them has been neglected. In this paper, we propose a BIQA with coarse-grained perception construction and fine-grained interaction learning, called PINet for short. The fundamental idea is to learn from the two-stage human perceptual process. Specifically, in the pre-training stage, the backbone initially processes a pair of synthetic distorted images with pseudo-subjective scores, and the multi-scale feature extraction module integrates the deep information and delivers it to the coarse-grained perception construction module, which performs the distortion discrimination and the quality ranking. In the fine-tuning stage, we propose a fine-grained interactive learning module to interact with the two pieces of information to further improve the performance of the proposed PINet. The experimental results prove that the proposed PINet not only achieves competing performances on synthetic distortion datasets but also performs better on authentic distortion datasets.
期刊介绍:
The Society’s Field of Interest is “Devices, equipment, techniques and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects.” In addition to this formal FOI statement, which is used to provide guidance to the Publications Committee in the selection of content, the AdCom has further resolved that “broadcast systems includes all aspects of transmission, propagation, and reception.”