An unsupervised multi-focus image fusion method via dual-channel convolutional network and discriminator

Computer Vision and Image Understanding | IF 4.3 | Zone 3, Computer Science | JCR Q2, Computer Science, Artificial Intelligence | Pub Date: 2024-05-01 | DOI: 10.1016/j.cviu.2024.104029

Lixing Fang, Xiangxiang Wang, Junli Zhao, Zhenkuan Pan, Hui Li, Yi Li


Citations: 0


Abstract

The challenge in multi-focus image fusion lies in accurately preserving the complementary information from the source images in the fused image. However, existing datasets often lack ground-truth images, which makes it difficult to use full-reference loss functions (such as SSIM) in model training and in turn degrades how well source image details are retained. To address this issue, this paper proposes DCD, an unsupervised dual-channel dense convolutional method for multi-focus image fusion. DCD introduces patch processing blocks designed specifically for the fusion task; they segment each source image pair into equally sized patches and evaluate the information in each patch to obtain a reconstructed image and a set of adaptive weight coefficients. The reconstructed image serves as the reference image, which lets an unsupervised method use full-reference loss functions during training and overcomes the lack of labeled data in the training set. Furthermore, because the human visual system (HVS) is more sensitive to brightness than to color, DCD trains the dual-channel network on both RGB images and their luminance components. This lets the network focus more on brightness information while preserving the color and gradient details of the source images, yielding fused images that are more compatible with the HVS. The adaptive weight coefficients obtained from the patch processing blocks are also used to determine how much of the brightness information in the source images is preserved. Finally, comparative experiments on different datasets show that DCD outperforms the other compared methods in fused image quality.
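The patch-evaluation idea in the abstract can be illustrated with a small sketch. This is not the authors' DCD implementation: the abstract does not say how patch information is measured, what patch size is used, or how the weight coefficients are computed. The gradient-energy score, the 16-pixel patch size, the BT.601 luma weights, and the function names `luminance`, `patch_sharpness`, and `reconstruct_reference` are all assumptions made for illustration.

```python
import numpy as np


def luminance(rgb):
    """Luma of an RGB image with values in [0, 1].

    Assumption: ITU-R BT.601 weights; the abstract does not name a formula.
    """
    return rgb @ np.array([0.299, 0.587, 0.114])


def patch_sharpness(patch):
    """Hypothetical information measure: mean squared finite-difference gradient."""
    gy = np.diff(patch, axis=0)
    gx = np.diff(patch, axis=1)
    return float((gy ** 2).mean() + (gx ** 2).mean())


def reconstruct_reference(img_a, img_b, patch=16):
    """Build a pseudo-reference image from two differently focused sources.

    Each equally sized patch is copied from whichever source scores higher
    on patch_sharpness. Also returns one weight per patch for source A in
    [0, 1] (the weight for source B is 1 minus that value).
    """
    if img_a.shape != img_b.shape:
        raise ValueError("source images must have the same shape")
    h, w = img_a.shape[:2]
    if patch < 2 or h % patch or w % patch:
        raise ValueError("patch must be >= 2 and divide both image dimensions")

    y_a, y_b = luminance(img_a), luminance(img_b)
    ref = np.empty_like(img_a)
    weights = np.zeros((h // patch, w // patch))

    for i in range(h // patch):
        for j in range(w // patch):
            rows = slice(i * patch, (i + 1) * patch)
            cols = slice(j * patch, (j + 1) * patch)
            s_a = patch_sharpness(y_a[rows, cols])
            s_b = patch_sharpness(y_b[rows, cols])
            # Two flat patches carry equal (zero) information: split evenly.
            w_a = 0.5 if s_a + s_b == 0 else s_a / (s_a + s_b)
            weights[i, j] = w_a
            ref[rows, cols] = img_a[rows, cols] if w_a >= 0.5 else img_b[rows, cols]
    return ref, weights
```

Under these assumptions, the returned `ref` could stand in for the missing ground truth when computing a full-reference loss such as SSIM against the network output, and `weights` could scale how strongly each source contributes to a luminance term.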


Source journal

Computer Vision and Image Understanding (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 7.80
Self-citation rate: 4.40%
Articles published: 112
Review time: 79 days

About the journal: The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis, from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views.

Research areas include:
• Theory
• Early vision
• Data structures and representations
• Shape
• Range
• Motion
• Matching and recognition
• Architecture and languages
• Vision systems