Regularization by denoising diffusion process meets deep relaxation in phase

Impact Factor: 4.2 | CAS Region 3 (Computer Science) | JCR Q2 (Computer Science, Artificial Intelligence) | Image and Vision Computing | Pub Date: 2024-11-01 | Epub Date: 2024-09-20 | DOI: 10.1016/j.imavis.2024.105282

Eunju Cha

Image and Vision Computing, Volume 151, Article 105282. https://www.sciencedirect.com/science/article/pii/S0262885624003871

Citations: 0

Abstract

Fourier phase retrieval is a representative inverse problem in which a signal must be recovered from only the measured magnitude of its Fourier transform. Deep learning-based algorithms for Fourier phase retrieval have been widely studied. These methods reconstruct better than conventional algorithms such as alternating projection approaches and convex relaxation methods. However, they struggle to recover the phase information of 256 × 256 images accurately and often fail to reproduce fine details and textures. Recently, diffusion models have been applied to Fourier phase retrieval. They produce realistic reconstructions, but because they are generative models, they often create features that do not exist in the actual images. To address these issues, we introduce a novel algorithm inspired by regularization by denoising diffusion, a variational diffusion sampling method for reconstructing images from measurements. In particular, the optimization problem in the convex relaxation approach to phase retrieval is interpreted as an additional constraint during the variational sampling process, so that the phase is estimated from the given Fourier magnitude measurement. The proposed method leverages not only pre-trained diffusion models as image priors but also the classical optimization approach as regularization. This combination provides not just accurate phase reconstruction but also performance guarantees. Our experiments demonstrate that the proposed algorithm consistently provides state-of-the-art performance across various datasets of 256 × 256 images. We further show the effectiveness of the new regularization for the performance gain in phase estimation.
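To make the problem setting concrete, here is a minimal NumPy sketch of the Fourier-magnitude forward model and of a classical alternating-projection baseline (the error-reduction scheme). It is not the proposed diffusion-based algorithm. The function names, the known-support constraint, the non-negativity constraint, and the 32 × 32 toy image are all assumptions made for illustration.

```python
import numpy as np

def fourier_magnitude(x):
    """Forward model: keep only the magnitude of the 2-D Fourier transform."""
    return np.abs(np.fft.fft2(x))

def error_reduction(y, support, n_iter=200, seed=0):
    """Alternating projections between two constraint sets:
    (1) images whose Fourier magnitude equals the measurement y, and
    (2) real, non-negative images that vanish outside `support` (assumed known).
    """
    rng = np.random.default_rng(seed)
    x = rng.random(y.shape) * support  # random start that satisfies constraint (2)
    for _ in range(n_iter):
        X = np.fft.fft2(x)
        # Projection (1): impose the measured magnitude, keep the current phase.
        X = y * np.exp(1j * np.angle(X))
        x = np.real(np.fft.ifft2(X))
        # Projection (2): clip negatives and zero out pixels outside the support.
        x = np.clip(x, 0.0, None) * support
    return x

def magnitude_residual(x, y):
    """Relative mismatch between the Fourier magnitude of x and the measurement y."""
    return np.linalg.norm(fourier_magnitude(x) - y) / np.linalg.norm(y)

# Toy example (hypothetical data): a random patch centred in a zero-padded frame.
rng = np.random.default_rng(1)
support = np.zeros((32, 32))
support[8:24, 8:24] = 1.0
x_true = rng.random((32, 32)) * support

y = fourier_magnitude(x_true)        # phase is discarded here
x_hat = error_reduction(y, support)  # estimate from the magnitude alone
print("relative magnitude residual:", magnitude_residual(x_hat, y))
```

A small residual only means the estimate is consistent with the measured magnitude. It does not guarantee that the image itself was recovered, because shifted or flipped copies share the same Fourier magnitude and the iteration can stagnate. This ambiguity is one reason stronger priors and regularizers, such as those the abstract describes, are of interest.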

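Source journal

Image and Vision Computing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 8.50

Self-citation rate: 8.50%

Articles published: 143

Review time: 7.8 months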

About the journal: The primary aim of Image and Vision Computing is to provide an effective medium of interchange for the results of high-quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real-world scenes. It seeks to deepen understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, and image databases.