Deep neural network based distortion parameter estimation for blind quality measurement of stereoscopic images

IF 3.4 · Zone 3 (Engineering Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Signal Processing-Image Communication · Pub Date: 2024-05-04 · DOI: 10.1016/j.image.2024.117138

Yi Zhang , Damon M. Chandler , Xuanqin Mou


Citations: 0

Abstract



Stereoscopic/3D image quality measurement (SIQM) has emerged as an active and important research branch in the fields of image processing and computer vision. Existing methods for blind/no-reference SIQM often train machine-learning models on degraded stereoscopic images for which human subjective quality ratings have been obtained, and they are thus constrained by the limited number of 3D image quality datasets that currently exist. Although methods have been proposed to overcome this restriction by predicting distortion parameters rather than quality scores, these methods still rely on time-consuming, hand-crafted features to train the corresponding classification/regression models, as well as on rather complicated binocular fusion/rivalry models to predict the cyclopean view. In this paper, we explore the use of deep learning to predict distortion parameters, giving rise to a more efficient opinion-unaware SIQM technique. Specifically, a deep fusion-and-excitation network that takes into account multiple-distortion interactions is proposed to perform distortion parameter estimation; it avoids hand-crafted features by using convolution layers and accelerates the algorithm by running on the GPU. Moreover, we measure distortion parameter values of the cyclopean view by using support vector regression models trained on data obtained from a newly designed subjective test. In this way, potential errors in computing the disparity map and cyclopean view are avoided, leading to faster and more precise 3D-vision distortion parameter estimation. Experimental results on various 3D image quality datasets demonstrate that the proposed method, in most cases, offers improved predictive performance over existing state-of-the-art methods.
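The support vector regression step mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' setup: the training data are synthetic, the 0.7/0.3 weighting that generates the targets is invented, the two-value input (one distortion parameter per view) is a simplification, and the function name cyclopean_parameter is hypothetical. The abstract does not state the kernel, features, or hyperparameters the paper uses. The sketch only shows the general idea of regressing a cyclopean-view distortion parameter directly from per-view estimates, without computing a disparity map.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for subjective-test data (assumption, not from the paper):
# each sample holds the distortion parameters of the left and right views.
left = rng.uniform(0.0, 1.0, size=200)
right = rng.uniform(0.0, 1.0, size=200)
X = np.column_stack([left, right])

# Invented target: the perceived (cyclopean) parameter is dominated by the
# more distorted view. The 0.7 / 0.3 weights are purely illustrative.
y = 0.7 * np.maximum(left, right) + 0.3 * np.minimum(left, right)

# Fit an RBF-kernel support vector regressor on the (left, right) -> cyclopean pairs.
model = SVR(kernel="rbf", C=10.0, epsilon=0.01)
model.fit(X, y)


def cyclopean_parameter(left_param: float, right_param: float) -> float:
    """Predict the cyclopean-view distortion parameter from per-view estimates."""
    features = np.array([[left_param, right_param]])
    return float(model.predict(features)[0])


if __name__ == "__main__":
    # Asymmetric distortion: the left view is mildly distorted, the right heavily.
    print(f"{cyclopean_parameter(0.2, 0.8):.3f}")
```

In a pipeline of the kind the abstract describes, the per-view inputs would come from the deep network's distortion parameter estimates rather than from random draws, and the regression targets would come from the subjective test.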


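Source journal

Signal Processing-Image Communication (Engineering Technology; Engineering: Electrical & Electronic)

CiteScore: 8.40

Self-citation rate: 2.90%

Articles published: 138

Review time: 5.2 months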

About the journal: Signal Processing: Image Communication is an international journal for the development of the theory and practice of image communication. Its primary objectives are the following: to present a forum for the advancement of theory and practice of image communication; to stimulate cross-fertilization between areas similar in nature which have traditionally been separated, for example, various aspects of visual communications and information systems; and to contribute to a rapid information exchange between the industrial and academic environments. The editorial policy and the technical content of the journal are the responsibility of the Editor-in-Chief, the Area Editors and the Advisory Editors. The journal is self-supporting from subscription income and contains a minimum amount of advertisements. Advertisements are subject to the prior approval of the Editor-in-Chief. The journal welcomes contributions from every country in the world. Signal Processing: Image Communication publishes articles relating to aspects of the design, implementation and use of image communication systems. The journal features original research work, tutorial and review articles, and accounts of practical developments. Subjects of interest include image/video coding, 3D video representations and compression, 3D graphics and animation compression, HDTV and 3DTV systems, video adaptation, video over IP, peer-to-peer video networking, interactive visual communication, multi-user video conferencing, wireless video broadcasting and communication, visual surveillance, 2D and 3D image/video quality measures, pre/post processing, video restoration and super-resolution, multi-camera video analysis, motion analysis, content-based image/video indexing and retrieval, face and gesture processing, video synthesis, 2D and 3D image/video acquisition and display technologies, and architectures for image/video processing and communication.