Pub Date: 2025-02-28 DOI: 10.1109/TCI.2025.3539448
Eric Bezzam;Yohann Perron;Martin Vetterli
Lensless cameras disregard the conventional design principle that imaging should mimic the human eye. This is done by replacing the lens with a thin mask and moving image formation to digital post-processing. State-of-the-art lensless imaging techniques use learned approaches that combine physical modeling and neural networks. However, these approaches make simplifying modeling assumptions for ease of calibration and computation. Moreover, the generalizability of learned approaches to lensless measurements of new masks has not been studied. To this end, we utilize a modular learned reconstruction in which a key component is a pre-processor prior to image recovery. We theoretically demonstrate the pre-processor's necessity for standard image recovery techniques (Wiener filtering and iterative algorithms), and through extensive experiments show its effectiveness for multiple lensless imaging approaches and across datasets of different mask types (amplitude and phase). We also perform the first generalization benchmark across mask types to evaluate how well reconstructions trained with one system generalize to others. Our modular reconstruction enables us to use pre-trained components and transfer learning on new systems to cut down weeks of tedious measurements and training. As part of our work, we open-source four datasets, and software for measuring datasets and for training our modular reconstruction.
Title: Towards Robust and Generalizable Lensless Imaging With Modular Learned Reconstruction. Authors: Eric Bezzam; Yohann Perron; Martin Vetterli. Journal: IEEE Transactions on Computational Imaging, vol. 11, pp. 213-227. Published: 2025-02-28. DOI: https://doi.org/10.1109/TCI.2025.3539448
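The abstract above names Wiener filtering as one of the standard image recovery techniques. As a rough illustration only, the sketch below applies a Wiener-style inverse filter to a toy, noise-free measurement. It assumes a circular-convolution forward model with a known point spread function; the random PSF, the function name `wiener_deconvolve`, and the regularization value `reg` are illustrative assumptions and are not taken from the paper, whose learned pre-processor is not modeled here.

```python
import numpy as np

def wiener_deconvolve(measurement, psf, reg=1e-9):
    """Wiener-style inverse filter, assuming measurement = psf (*) scene (circular)."""
    H = np.fft.fft2(psf)          # transfer function of the (assumed known) PSF
    Y = np.fft.fft2(measurement)  # spectrum of the lensless measurement
    # reg plays the role of a noise-to-signal ratio and prevents division by ~0.
    X_hat = np.conj(H) * Y / (np.abs(H) ** 2 + reg)
    return np.real(np.fft.ifft2(X_hat))

# Toy data: a random scene and a random multiplexing PSF (both hypothetical).
rng = np.random.default_rng(0)
scene = rng.random((16, 16))
psf = rng.random((16, 16))

# Simulate the measurement with the same circular-convolution model.
measurement = np.real(np.fft.ifft2(np.fft.fft2(scene) * np.fft.fft2(psf)))
estimate = wiener_deconvolve(measurement, psf)
```

In this idealized setting the estimate matches the scene closely. With measurement noise or a mismatched PSF, a larger `reg` would be needed and the result would degrade, which is the kind of model mismatch the abstract is concerned with.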
Controllable Depth-of-Field (DoF) imaging commonly produces amazing visual effects based on heavy and expensive high-end lenses. However, confronted with the increasing demand for mobile scenarios, it is desirable to achieve a lightweight solution with Minimalist Optical Systems (MOS). This work centers around two major limitations of MOS, i.e., the severe optical aberrations and uncontrollable DoF, for achieving single-lens controllable DoF imaging via computational methods. A Depth-aware Controllable DoF Imaging (DCDI) framework is proposed, equipped with All-in-Focus (AiF) aberration correction and monocular depth estimation, where the recovered image and corresponding depth map are utilized to produce imaging results under diverse DoFs of any high-end lens via patch-wise convolution. To address the depth-varying optical degradation, we introduce a Depth-aware Degradation-adaptive Training (DA$^{2}$T) scheme. At the dataset level, a Depth-aware Aberration MOS (DAMOS) dataset is established based on the simulation of Point Spread Functions (PSFs) under different object distances. Additionally, we design two plug-and-play depth-aware mechanisms to embed depth information into the aberration image recovery to better tackle depth-aware degradation. Furthermore, we propose a storage-efficient Omni-Lens-Field model to represent the 4D PSF library of various lenses. With the predicted depth map, recovered image, and depth-aware PSF map inferred by Omni-Lens-Field, single-lens controllable DoF imaging is achieved. To the best of our knowledge, we are the first to explore the single-lens controllable DoF imaging solution. Comprehensive experimental results demonstrate that the proposed framework enhances the recovery performance, and attains impressive single-lens controllable DoF imaging results, providing a seminal baseline for this field.
Title: Towards Single-Lens Controllable Depth-of-Field Imaging via Depth-Aware Point Spread Functions. Authors: Xiaolong Qian; Qi Jiang; Yao Gao; Shaohua Gao; Zhonghua Yi; Lei Sun; Kai Wei; Haifeng Li; Kailun Yang; Kaiwei Wang; Jian Bai. Journal: IEEE Transactions on Computational Imaging, vol. 11, pp. 305-320. Published: 2025-02-28. DOI: https://doi.org/10.1109/TCI.2025.3544019
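To give a feel for how a depth map can drive depth-of-field rendering, the sketch below computes the textbook thin-lens circle of confusion, that is, the blur-spot diameter for an object at a given distance. This is a simplification chosen for illustration: the paper infers depth-aware PSFs with its Omni-Lens-Field model rather than using a thin-lens formula, and the function name and the numeric lens parameters are hypothetical.

```python
def circle_of_confusion_mm(object_dist_mm, focus_dist_mm, focal_length_mm, f_number):
    """Thin-lens blur-spot diameter on the sensor, in millimetres."""
    aperture_mm = focal_length_mm / f_number  # entrance pupil diameter
    # Magnification of the plane the lens is focused on.
    magnification = focal_length_mm / (focus_dist_mm - focal_length_mm)
    # Relative defocus of the object with respect to the focus plane.
    defocus = abs(object_dist_mm - focus_dist_mm) / object_dist_mm
    return aperture_mm * magnification * defocus

# Hypothetical 50 mm lens focused at 2 m.
in_focus = circle_of_confusion_mm(2000.0, 2000.0, 50.0, 1.8)
wide_open = circle_of_confusion_mm(4000.0, 2000.0, 50.0, 1.8)
stopped_down = circle_of_confusion_mm(4000.0, 2000.0, 50.0, 8.0)
```

Under this model an object on the focus plane has zero blur, and a larger f-number gives a smaller blur spot at the same depth, which is the behaviour a controllable-DoF renderer needs to reproduce per pixel or per patch.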
Recovering a dense depth image from sparse inputs is inherently challenging. Image-guided depth completion has become a prevalent technique, leveraging sparse depth data alongside RGB images to produce detailed depth maps. Although deep learning-based methods have achieved notable success, many state-of-the-art networks operate as black boxes, lacking transparent mechanisms for depth recovery. To address this, we introduce a novel model-guided depth recovery method. Our approach is built on a maximum a posteriori (MAP) framework and features an optimization model that incorporates a non-local cross-modality regularizer and a deep image prior. The cross-modality regularizer capitalizes on the inherent correlations between depth and RGB images, enhancing the extraction of shared information. Additionally, the deep image prior captures local characteristics between the depth and RGB domains effectively. To counter the challenge of high heterogeneity leading to degenerate operators, we have integrated an implicit data consistency term into our model. Our model is then realized as a network using the half-quadratic splitting algorithm. Extensive evaluations on the NYU-Depth V2 and SUN RGB-D datasets demonstrate that our method performs competitively with current deep learning techniques.
Title: NLCMR: Indoor Depth Recovery Model With Non-Local Cross-Modality Prior. Authors: Junkang Zhang; Zhengkai Qi; Faming Fang; Tingting Wang; Guixu Zhang. Journal: IEEE Transactions on Computational Imaging, vol. 11, pp. 265-276. Published: 2025-02-27. DOI: https://doi.org/10.1109/TCI.2025.3545358
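The half-quadratic splitting (HQS) algorithm mentioned in the abstract above alternates between a data-consistency step and a prior step. The sketch below shows that alternation on a toy depth-completion problem. It is not the paper's model: a simple quadratic smoothness prior with periodic borders stands in for the non-local cross-modality regularizer and the deep image prior, no RGB guidance is used, and the function name and the parameters `lam`, `mu`, and `n_iter` are assumptions made for illustration.

```python
import numpy as np

def hqs_depth_completion(sparse_depth, mask, lam=1.0, mu=0.1, n_iter=50):
    """HQS for min_x 0.5*||mask*(x - y)||^2 + 0.5*lam*||grad x||^2 (periodic borders)."""
    h, w = sparse_depth.shape
    # Fourier-domain eigenvalues of the periodic discrete (negative) Laplacian.
    ky = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(h) / h)
    kx = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(w) / w)
    lap = ky[:, None] + kx[None, :]
    x = mask * sparse_depth
    for _ in range(n_iter):
        # z-step: proximal operator of the smoothness prior, closed form via FFT.
        z = np.real(np.fft.ifft2(mu * np.fft.fft2(x) / (mu + lam * lap)))
        # x-step: pixel-wise closed form combining the data term and the coupling term.
        x = (mask * sparse_depth + mu * z) / (mask + mu)
    return x

# Toy scene: a flat surface at 2 m, observed at roughly 10% of the pixels.
rng = np.random.default_rng(0)
true_depth = np.full((32, 32), 2.0)
mask = (rng.random((32, 32)) < 0.1).astype(float)
sparse_depth = mask * true_depth

completed = hqs_depth_completion(sparse_depth, mask)
mae_before = np.mean(np.abs(mask * sparse_depth - true_depth))
mae_after = np.mean(np.abs(completed - true_depth))
```

In an unrolled network of the kind the abstract describes, the z-step would typically be replaced by a learned module, while the x-step keeps the explicit link to the measured depth samples.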
Pub Date: 2025-02-25 DOI: 10.1109/TCI.2025.3545357
Yao Gao;Qi Jiang;Shaohua Gao;Lei Sun;Kailun Yang;Kaiwei Wang
Recently, joint design approaches that simultaneously optimize optical systems and downstream algorithms through data-driven learning have demonstrated superior performance over traditional separate design approaches. However, current joint design approaches heavily rely on the manual identification of initial lenses, posing challenges and limitations, particularly for compound lens systems with multiple potential starting points. In this work, we present Quasi-Global Search Optics (QGSO) to automatically design compound lens based computational imaging systems through two parts: (i) Fused Optimization Method for Automatic Optical Design (OptiFusion), which searches for diverse initial optical systems under certain design specifications; and (ii) Efficient Physic-aware Joint Optimization (EPJO), which conducts parallel joint optimization of initial optical systems and image reconstruction networks with the consideration of physical constraints, culminating in the selection of the optimal solution in all search results. Extensive experimental results illustrate that QGSO serves as a transformative end-to-end lens design paradigm for superior global search ability, which automatically provides compound lens based computational imaging systems with higher imaging quality compared to existing paradigms.
Title: Exploring Quasi-Global Solutions to Compound Lens Based Computational Imaging Systems. Authors: Yao Gao; Qi Jiang; Shaohua Gao; Lei Sun; Kailun Yang; Kaiwei Wang. Journal: IEEE Transactions on Computational Imaging, vol. 11, pp. 333-348. Published: 2025-02-25. DOI: https://doi.org/10.1109/TCI.2025.3545357
Pub Date: 2025-02-20 DOI: 10.1109/TCI.2025.3544087
Jiawei Dong;Hong Zeng;Sen Dong;Weining Chen;Qianxi Li;Jianzhong Cao;Qiurong Yan;Hao Wang
Single-pixel imaging can reconstruct the original image at a low measurement rate (MR), and the target can be measured and reconstructed in low-light environments by capturing the light intensity information using a single-photon detector. Optimizing reconstruction results at low MR has become a focal point of research aimed at enhancing measurement efficiency. The application of neural networks has significantly improved reconstruction quality, but the performance still requires further enhancement. In this paper, a Diffusion Single Pixel Imaging Model (DSPIM) method is proposed. The conditional diffusion model is utilized in the training and reconstruction processes of single-pixel imaging and is jointly optimized with an autoencoder network. This approach simulates the measurement and preliminary reconstruction of images, which are incorporated into the diffusion process as conditions. The noise and features are learned through a designed loss function that consists of a predicted noise loss and a measurement accuracy loss, allowing the reconstruction to perform well at very low MR. In addition, an adaptive regularization coefficients adjustment method (ARCA) has been designed for more effective optimization. Finally, the learned weights are loaded into the single photon counting system as a measurement matrix, demonstrating that the blurriness caused by insufficient features at low MR is effectively addressed using our methods, resulting in clearer targets and well-distinguished features.
Title: Enhanced Single Pixel Imaging by Using Adaptive Jointly Optimized Conditional Diffusion. Authors: Jiawei Dong; Hong Zeng; Sen Dong; Weining Chen; Qianxi Li; Jianzhong Cao; Qiurong Yan; Hao Wang. Journal: IEEE Transactions on Computational Imaging, vol. 11, pp. 289-304. Published: 2025-02-20. DOI: https://doi.org/10.1109/TCI.2025.3544087
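For readers unfamiliar with the measurement rate (MR) referred to above, the sketch below shows the basic single-pixel acquisition model: each measurement is the inner product of the scene with one pattern, and MR is the ratio of measurements to pixels. The random Gaussian patterns and the pseudo-inverse reconstruction are stand-ins chosen for simplicity; the paper instead learns the measurement matrix and reconstructs with a conditional diffusion model, neither of which is reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels = 64                                      # flattened 8x8 image
measurement_rate = 0.25                            # MR = measurements / pixels
n_measurements = int(measurement_rate * n_pixels)

image = rng.random(n_pixels)

# Each row is one illumination pattern (random here, learned in the paper).
patterns = rng.standard_normal((n_measurements, n_pixels))

# The single-pixel detector records one scalar per pattern.
measurements = patterns @ image

# Minimum-norm least-squares estimate as a baseline "preliminary reconstruction".
estimate = np.linalg.pinv(patterns) @ measurements
```

At MR = 0.25 the system is heavily underdetermined, so this baseline is consistent with the measurements but generally far from the true image; closing that gap is what learned priors are for.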
Pub Date: 2025-02-20 DOI: 10.1109/TCI.2025.3544065
Yi-Zeng Hsieh;Ming-Ching Chang
Underwater image analytic technologies are important for studying in-water imagery in oceanography. Due to poor lighting conditions and severe scattering and attenuation of light, underwater image quality is heavily reduced in such environments. Therefore, underwater image enhancement has always been an essential step in the analysis pipeline. We develop an Underwater Image Enhancement and Attenuation Restoration (UIEAR) algorithm that takes an RGB image as input and is based on 3D depth and backscatter estimation. The proposed underwater image enhancement method achieves superior performance with light computational requirements, making it easy to deploy on edge devices. We provide the following contributions: (1) Our image enhancement is based on depth estimation using a new smooth operator on RGB pixels, which provides 3D spatial information for improved backscatter estimation and attenuation restoration. (2) We develop an improved imaging model by considering parameters relative to the camera and the local light source to estimate the attenuation and the backscatter effects. Our light source estimation is constructed from a local neighborhood of pixels to avoid distortion of the backscatter and attenuation estimation. (3) We adopt white balance adjustment to enhance underwater pixels and better match real-world colors. Our method improves general underwater image analysis including object detection and segmentation. Experimental results demonstrate the effectiveness of our algorithm in restoring and enhancing underwater images.
Title: Underwater Image Enhancement and Attenuation Restoration Based on Depth and Backscatter Estimation. Authors: Yi-Zeng Hsieh; Ming-Ching Chang. Journal: IEEE Transactions on Computational Imaging, vol. 11, pp. 321-332. Published: 2025-02-20. DOI: https://doi.org/10.1109/TCI.2025.3544065
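The roles of depth, attenuation, and backscatter in the abstract above can be illustrated with a commonly used underwater image formation model, in which the observed colour is an attenuated direct signal plus a depth-dependent backscatter term. The sketch below simulates that model and inverts it with known parameters. It is an assumption that this generic model resembles the paper's "improved imaging model"; the coefficient values and function names are hypothetical, and the paper's estimation of depth and light source from the RGB image is not shown.

```python
import numpy as np

def degrade(clean, depth, beta_d, beta_b, veiling_light):
    """Observed = clean * exp(-beta_d * z) + B_inf * (1 - exp(-beta_b * z)), per channel."""
    z = depth[..., None]                                   # (H, W, 1) for broadcasting
    direct = clean * np.exp(-beta_d * z)                   # attenuated scene radiance
    backscatter = veiling_light * (1.0 - np.exp(-beta_b * z))
    return direct + backscatter

def restore(observed, depth, beta_d, beta_b, veiling_light):
    """Invert the model: remove backscatter first, then undo attenuation."""
    z = depth[..., None]
    backscatter = veiling_light * (1.0 - np.exp(-beta_b * z))
    return (observed - backscatter) * np.exp(beta_d * z)

rng = np.random.default_rng(0)
clean = rng.random((8, 8, 3))
depth = rng.uniform(0.5, 5.0, size=(8, 8))                 # metres, hypothetical
beta_d = np.array([0.40, 0.12, 0.08])                      # red attenuates fastest
beta_b = np.array([0.30, 0.15, 0.10])
veiling_light = np.array([0.05, 0.35, 0.45])

observed = degrade(clean, depth, beta_d, beta_b, veiling_light)
restored = restore(observed, depth, beta_d, beta_b, veiling_light)
```

With exact parameters the inversion is lossless; in practice the depth map and coefficients are estimates, so errors in either are amplified by the exponential term at large distances.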
Pub Date: 2025-02-17 DOI: 10.1109/TCI.2025.3530256
Kang Qin;Meng Cao;Peng Ren;Fengchen Luo;Siyu Liu
Medium heterogeneity poses a severe challenge to image reconstruction in transcranial photoacoustic tomography, which cannot be fully addressed by the homogeneous phase shift migration (PSM) method. Although existing methods can enhance the imaging quality to a certain extent, they are limited by large approximation errors and low computational efficiency. To further improve imaging performance and calculation speed, this paper proposes full matrix wavefield migration, which takes into account both lateral and longitudinal variations of the speed of sound (SOS). Unlike the PSM method, which relies on a layer-by-layer migration framework, the proposed approach reformulates the SOS map across the propagation medium into a spatial matrix of SOS. By extrapolating wavefield data in the wavenumber domain and correcting phase deviations in the spatial domain, this method reduces the image distortion caused by SOS irregularity and suppresses artifacts in reconstructed images. Moreover, the calculation process is further optimized to eliminate redundancy. Simulation and experimental results demonstrate that the full matrix wavefield migration method improves lateral resolution (up to 21.24%) and computational efficiency (about 19.84%) compared to the previous methods.
Title: Full Matrix Wavefield Migration for Layered Photoacoustic Imaging. Authors: Kang Qin; Meng Cao; Peng Ren; Fengchen Luo; Siyu Liu. Journal: IEEE Transactions on Computational Imaging, vol. 11, pp. 179-188. Published: 2025-02-17. DOI: https://doi.org/10.1109/TCI.2025.3530256
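As background for the phase shift migration baseline mentioned above, the sketch below performs a single wavefield extrapolation step in the wavenumber domain for a homogeneous layer: each lateral wavenumber component is multiplied by a phase factor that depends on the axial wavenumber. This is the textbook constant-speed step only, not the proposed full matrix method. The sign convention, the choice to discard evanescent components, and all numeric values are assumptions made for this illustration.

```python
import numpy as np

def phase_shift_step(field_kx, kx, omega, speed, dz):
    """Extrapolate one frequency component of a wavefield by dz in a homogeneous layer."""
    kz_sq = (omega / speed) ** 2 - kx ** 2      # dispersion relation
    propagating = kz_sq > 0                      # evanescent components have kz_sq <= 0
    kz = np.sqrt(np.where(propagating, kz_sq, 0.0))
    # Pure phase rotation for propagating waves; evanescent components are zeroed.
    return np.where(propagating, field_kx * np.exp(1j * kz * dz), 0.0)

# Hypothetical setup: 64 detector elements at 0.1 mm pitch, 5 MHz, water-like medium.
n, dx = 64, 1e-4
kx = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
omega = 2.0 * np.pi * 5e6
speed = 1500.0
dz = 1e-3

rng = np.random.default_rng(0)
field_kx = np.fft.fft(rng.standard_normal(n))

stepped = phase_shift_step(field_kx, kx, omega, speed, dz)
round_trip = phase_shift_step(stepped, kx, omega, speed, -dz)
propagating_mask = (omega / speed) ** 2 > kx ** 2
```

Because `speed` is a single scalar per step, lateral speed-of-sound variations cannot be represented here, which is the limitation the abstract says the spatial-domain phase correction is meant to address.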
Pub Date: 2025-02-10 DOI: 10.1109/TCI.2025.3540711
Qikui Zhu;Andrew L. Wentland;Shuo Li
Contrast-enhanced CT imaging (CECTI) is crucial for the diagnosis of patients with liver tumors. Therefore, if CECTI can be synthesized using only non-contrast CT imaging (NCCTI), it will provide significant clinical advantages. We propose a novel contrast-aware network with Aggregated-interacted Transformer and Multi-granularity aligned contrastive learning (AMNet), which enables CECTI synthesis for the first time. AMNet mitigates the challenges associated with the high-risk, time-consuming, expensive, and radiation-intensive procedures required for obtaining CECTI. Furthermore, it overcomes the challenges of low contrast and low sensitivity in CT imaging through four key innovations: 1) The Aggregated-Interacted Transformer (AI-Transformer) introduces two mechanisms: multi-scale token aggregation and cross-token interaction. These enable long-range dependencies between multi-scale cross-tokens, facilitating the extraction of discriminative structural and content features of tissues, thereby addressing the low-contrast challenge. 2) The Multi-granularity Aligned Contrastive Learning (MACL) constructs a new regularization term for exploiting intra-domain compact and inter-domain separable features to improve the model's sensitivity to chemical contrast agents (CAs) and overcome the low-sensitivity challenge. 3) The Contrast-Aware Adaptive Layer (CAL) imbues AMNet with contrast-aware abilities that adaptively adjust the contrast information of various regions to achieve perfect synthesis. 4) The dual-stream discriminator (DSD) adopts an ensemble strategy to evaluate the synthetic CECTI from multiple perspectives. AMNet is validated using two corresponding CT imaging modalities (pre-contrast and portal venous-phase), an essential procedure for liver tumor biopsy. Experimental results demonstrate that our AMNet has successfully synthesized CECTI without chemical CA injections for the first time.
Title: Contrast-Aware Network With Aggregated-Interacted Transformer and Multi-Granularity Aligned Contrastive Learning for Synthesizing Contrast-Enhanced Abdomen CT Imaging. Authors: Qikui Zhu; Andrew L. Wentland; Shuo Li. Journal: IEEE Transactions on Computational Imaging, vol. 11, pp. 277-288. Published: 2025-02-10. DOI: https://doi.org/10.1109/TCI.2025.3540711
Pub Date : 2025-02-10 DOI: 10.1109/TCI.2025.3540707
Changyu Chen;Yuxiang Xing;Li Zhang;Zhiqiang Chen
In this work, we investigate the projection-sampling characteristics of, and analytical reconstruction algorithms for, a static CT system whose sources and detectors are distributed in a multi-segment manner (MS-StaticCT). MS-StaticCT generalizes previous static linear CT systems, offering greater design flexibility and more efficient use of both X-ray source and detector components. By analyzing the imaging geometry of single-segment source and detector pairs, we examine the Radon-space properties of MS-StaticCT and propose a data sufficiency condition for system design. To explore how the unique sampling characteristics of MS-StaticCT affect reconstruction quality, we derive analytical algorithms under two popular pipelines, filtered backprojection (MS-FBP) and differentiated backprojection filtration (MS-DBF), and assess their performance. Because of the non-uniform sampling and the singular points between segments, the global filtration step of MS-FBP requires local rebinning, whereas the local nature of differentiation in MS-DBF allows filtration without rebinning. In addition, to address the insufficient data caused by optical obstruction from sources and detectors, we incorporate multiple imaging planes and design a generalized weighting function that efficiently uses conjugate projections. Simulation studies on numerical phantoms and clinical CT data demonstrate the feasibility of MS-StaticCT and the proposed reconstruction algorithms. The results highlight MS-DBF's superiority in accuracy and spatial resolution for multi-segment geometries, without compromising noise performance, compared to MS-FBP, whose performance depends on the number of detector segments involved for each focal spot. Our study provides a comprehensive understanding of the essential data structure and basic reconstruction methods for systems characterized by linear source trajectories and detectors.
Title: Static CT With Sources and Detectors Distributed in a Multi-Segment Manner: System Analysis and Analytical Reconstruction. Authors: Changyu Chen;Yuxiang Xing;Li Zhang;Zhiqiang Chen. IEEE Transactions on Computational Imaging, vol. 11, pp. 251-264, published 2025-02-10. DOI: https://doi.org/10.1109/TCI.2025.3540707
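The MS-StaticCT abstract contrasts the global filtration of an FBP pipeline with the local nature of differentiation. The following is a minimal sketch of that contrast on a single one-dimensional projection row, not the authors' MS-FBP or MS-DBF algorithms. It assumes unit detector spacing and uses the textbook spatial-domain Ram-Lak kernel for the ramp filter and a central difference for the derivative; all function names (`ram_lak_kernel`, `ramp_filter`, `central_difference`, `affected_samples`) are hypothetical. It covers only the projection-domain step and leaves out backprojection and the Hilbert filtration that a DBF pipeline applies afterwards.

```python
import math


def ram_lak_kernel(n):
    """Spatial-domain Ram-Lak (ramp) kernel at integer offset n, unit spacing."""
    if n == 0:
        return 0.25
    if n % 2 == 0:
        return 0.0
    return -1.0 / (math.pi ** 2 * n ** 2)


def ramp_filter(p):
    """Discrete convolution of a projection row with the Ram-Lak kernel.

    Every output sample is a weighted sum over the whole row (global).
    """
    m = len(p)
    return [sum(p[j] * ram_lak_kernel(i - j) for j in range(m)) for i in range(m)]


def central_difference(p):
    """Derivative estimate that only reads the two neighbouring samples (local).

    Endpoints fall back to one-sided differences.
    """
    m = len(p)
    if m < 2:
        return [0.0] * m
    out = []
    for i in range(m):
        lo, hi = max(i - 1, 0), min(i + 1, m - 1)
        out.append((p[hi] - p[lo]) / (hi - lo))
    return out


def affected_samples(op, length, k):
    """Indices whose output changes when only input sample k is perturbed."""
    base = [0.0] * length
    bumped = list(base)
    bumped[k] = 1.0
    a, b = op(base), op(bumped)
    return [i for i in range(length) if abs(a[i] - b[i]) > 1e-12]
```

With a row of 9 samples and a perturbation at index 4, `affected_samples(central_difference, 9, 4)` returns `[3, 5]`, while `affected_samples(ramp_filter, 9, 4)` returns `[1, 3, 4, 5, 7]`. The footprint of the ramp filter keeps growing with the row length, so an irregularity at a segment boundary spreads along the whole row, whereas the derivative confines it to the immediate neighbours.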
Pub Date : 2025-02-05 DOI: 10.1109/TCI.2025.3539021
Nikola Janjušević;Amirhossein Khalilian-Gourtani;Adeen Flinker;Li Feng;Yao Wang
Nonlocal self-similarity within images has become an increasingly popular prior in deep-learning models. Despite their successful image restoration performance, such models remain largely uninterpretable due to their black-box construction. Our previous studies have shown that interpretable construction of a fully convolutional denoiser (CDLNet), with performance on par with state-of-the-art black-box counterparts, is achievable by unrolling a convolutional dictionary learning algorithm. In this manuscript, we seek an interpretable construction of a convolutional network with a nonlocal self-similarity prior that performs on par with black-box nonlocal models. We show that such an architecture can be effectively achieved by upgrading the $\ell_{1}$ sparsity prior (soft-thresholding) of CDLNet to an image-adaptive group-sparsity prior (group-thresholding). The proposed learned group-thresholding makes use of nonlocal attention to perform spatially varying soft-thresholding on the latent representation. To enable effective training and inference on large images with global artifacts, we propose a novel circulant-sparse attention. We achieve competitive natural-image denoising performance compared to black-box nonlocal DNNs and transformers. The interpretable construction of our network allows for a straightforward extension to Compressed Sensing MRI (CS-MRI), yielding state-of-the-art performance. Lastly, we show robustness to noise-level mismatches between training and inference for denoising and CS-MRI reconstruction.
Title: GroupCDL: Interpretable Denoising and Compressed Sensing MRI via Learned Group-Sparsity and Circulant Attention. Authors: Nikola Janjušević;Amirhossein Khalilian-Gourtani;Adeen Flinker;Li Feng;Yao Wang. IEEE Transactions on Computational Imaging, vol. 11, pp. 201-212, published 2025-02-05. DOI: https://doi.org/10.1109/TCI.2025.3539021
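The GroupCDL abstract describes moving from an $\ell_{1}$ prior (soft-thresholding) to a group-sparsity prior (group-thresholding) that shrinks each coefficient in a spatially varying way. The sketch below illustrates that distinction in plain Python and should not be read as the GroupCDL implementation. It assumes a generic block-soft-thresholding form in which each coefficient is scaled according to the weighted energy of its group. The names `soft_threshold`, `group_threshold`, and the `weights` matrix are hypothetical; `weights` stands in for a learned nonlocal similarity (attention) matrix.

```python
import math


def soft_threshold(z, tau):
    """Elementwise soft-thresholding, the proximal operator of tau * ||z||_1."""
    return [math.copysign(max(abs(v) - tau, 0.0), v) for v in z]


def group_threshold(z, tau, weights):
    """Shrink each coefficient by a factor set by the energy of its group.

    weights[i][j] >= 0 says how strongly coefficient j belongs to the group
    of coefficient i (an assumed stand-in for learned nonlocal attention).
    """
    out = []
    for i, v in enumerate(z):
        # Weighted l2 energy of the group associated with coefficient i.
        energy = math.sqrt(sum(w * u * u for w, u in zip(weights[i], z)))
        # Groups with energy below tau are zeroed; stronger groups are kept.
        scale = max(1.0 - tau / energy, 0.0) if energy > 0.0 else 0.0
        out.append(scale * v)
    return out
```

With identity `weights`, every coefficient forms its own group and `group_threshold` reduces to `soft_threshold`. With shared weights, a small coefficient that belongs to a high-energy group is scaled down rather than zeroed, which is the kind of image-adaptive behavior the abstract attributes to the group-sparsity prior.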