Magnetic Particle Imaging (MPI) is an emerging biomedical imaging technique. The x-space method, one of the mainstream reconstruction methods in MPI, offers high efficiency and real-time capabilities but is limited by theoretical spatial resolution constraints and typically necessitates high gradient magnetic fields. This study introduces a semi-analytical reconstruction (Semi-AR) method for x-space MPI scanners, incorporating a kernel optimization step to achieve a spatial resolution better than the theoretical limit. By modeling the x-space MPI system with focus-field sequences as a linear shift-invariant system, the point spread function (PSF) is decomposed into basis functions and variants across different spatial frequencies. These functions are weighted to reconstruct a high-resolution PSF, with optimal weights adaptively determined via quadratic programming. A mouse-sized MPI scanner with 3D focus-field sequences was developed to evaluate the method. Simulation and experimental results showcase Semi-AR’s superior spatial resolution and robustness compared to existing x-space techniques, particularly in detecting low-brightness targets near highlighted non-target organs. Both phantom and in vivo experiments robustly validate Semi-AR’s effectiveness, providing new insights into MPI scanner development and advancing preclinical and potential clinical MPI applications.
Yanjun Liu, Lei Li, Guanghui Li, Siao Lei, Deshang Duan, Yang Jing, Peng Yang, Xin Feng, Yu An, Hui Hui, and Jie Tian, "Semi-Analytical Super-Resolution X-Space Reconstruction for Magnetic Particle Imaging Scanner via Adaptive Kernel Optimization," IEEE Transactions on Computational Imaging, vol. 11, pp. 1404–1418, 2025-09-29. DOI: 10.1109/TCI.2025.3615397
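The weighted-kernel idea above can be illustrated with a tiny quadratic program: fit a narrower target PSF as a nonnegative combination of broader basis kernels. Everything here (1-D Gaussians, the chosen widths, the projected-gradient solver) is an illustrative assumption, not the paper's actual basis functions or QP solver.

```python
import numpy as np

def gaussian(x, sigma):
    return np.exp(-0.5 * (x / sigma) ** 2)

# Hypothetical 1-D illustration: approximate a narrow target PSF as a
# nonnegative combination of broader basis kernels; sigmas are made up.
x = np.linspace(-5.0, 5.0, 201)
B = np.stack([gaussian(x, s) for s in (0.8, 1.0, 1.4, 2.0)], axis=1)
target = gaussian(x, 0.6)

# Projected gradient descent for the QP: min_w 0.5*||B w - target||^2, w >= 0
w = np.zeros(B.shape[1])
step = 1.0 / np.linalg.norm(B.T @ B, 2)   # 1/Lipschitz constant of gradient
for _ in range(2000):
    w = np.maximum(0.0, w - step * (B.T @ (B @ w - target)))

psf_hat = B @ w
print(np.round(w, 3), float(np.max(np.abs(psf_hat - target))))
```

The nonnegativity constraint is what makes this a QP rather than plain least squares; with correlated Gaussian columns a projection step per iteration is enough for a sketch.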
Magnetic Particle Imaging (MPI) offers unique advantages, including high sensitivity, real-time imaging, and absence of ionizing radiation. However, the prevailing system matrix (SM)-based reconstruction in MPI faces critical limitations: time-consuming calibration, noise vulnerability, and reliance on high-resolution training data. To overcome these challenges, we propose an imaging-physics-driven neural field framework for efficient SM calibration and robust reconstruction. Key innovations include: (1) first-order derivative constraints to suppress spiky noise, (2) an M-order separable representation to enforce smoothness and reduce fluctuation artifacts, and (3) Chebyshev polynomial integration to enhance encoding efficiency and embed imaging physics priors. The method adapts to variable resolution requirements, reduces dependency on high-resolution data, and demonstrates robustness to noise across diverse SNR conditions. Experiments on the OpenMPI dataset show remarkable performance, achieving 1.55% nRMSE at 25% sparsity and minimal 0.21% degradation at 6.25% sparsity. Furthermore, upsampling a sparsely sampled in-house MPI system matrix via the proposed method successfully reconstructs phantom geometries with high fidelity. These results validate the method’s potential to advance MPI toward broader research applications.
Feiyang Liao, Ming Li, Weixuan Feng, Yajie Xu, Tongtong Zhang, Zhongyi Wu, Hui Hui, Jian Zheng, and Jie Tian, "Modeling Real-World MPI System Matrices From Sparse Observations," IEEE Transactions on Computational Imaging, vol. 11, pp. 1419–1433, 2025-09-25. DOI: 10.1109/TCI.2025.3614497
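The Chebyshev polynomial encoding mentioned above can be sketched as a coordinate-feature map built from the three-term recurrence; how the paper integrates these features into its neural field is not reproduced here.

```python
import numpy as np

def chebyshev_features(t, order):
    """Evaluate Chebyshev polynomials T_0..T_order at points t in [-1, 1]
    using the recurrence T_{n+1}(t) = 2 t T_n(t) - T_{n-1}(t)."""
    t = np.asarray(t, dtype=float)
    feats = [np.ones_like(t), t]
    for _ in range(2, order + 1):
        feats.append(2.0 * t * feats[-1] - feats[-2])
    return np.stack(feats[: order + 1], axis=-1)

# Encode normalized scan coordinates before feeding a coordinate network.
coords = np.linspace(-1.0, 1.0, 5)
Phi = chebyshev_features(coords, order=4)
print(Phi.shape)  # (5, 5)
```

Polynomial encodings like this give a smooth, bandlimited coordinate representation, which is one plausible reading of how such priors "enforce smoothness".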
Pub Date: 2025-09-24. DOI: 10.1109/TCI.2025.3613961
Xiangyu Zhang;Xinyu Song;Jing Li;Lian Duan;Guangyu Wang;Weige Wei;Yongchang Wu;Sen Bai;Guangjun Li
This study aims to develop a deep learning-based synthetic 4DCT (s4DCT) generation method from 4DCBCT to enhance the accuracy of dose calculation and respiratory motion management in adaptive radiotherapy for lung tumors. A Unet-based attention mechanism integrated with CycleGAN, incorporating structure-consistency loss called UGGAN-GC, was developed to generate s4DCT images and was compared with several commonly used models. 4DCT and 4DCBCT images of 17 lung tumor patients were included and randomly divided into training, validation, and test sets. Elastix was used to deformably register 4DCT to 4DCBCT to generate the ground truth for training and evaluation of image-quality and dose calculation. Quantitative and qualitative methods were used to assess the quality of regions of interest (ROIs) and images of s4DCT. 4DCT was deformably registered to 4DCBCT and s4DCT using Elastix to evaluate the Dice similarity coefficient (DSC) of ROIs and gross tumor volume (GTV) motion. The average intensity projections (AIP) of the ground truth were used to design photon and proton therapy plans. Dose distributions were compared between s4DCT-AIP and ground truth-AIP using gamma analysis and dose-volume histograms. The experimental results showed that UGGAN-GC eliminated streak artifacts, generated the clearest anatomical structures, and achieved the best HU correction for soft tissues. The MAEs of 4DCBCT, Unet, Pix2pix, Cut, Fastcut, CycleGAN, UGGAN, and UGGAN-GC were 117.65, 71.87, 64.73, 62.92, 62.14, 63.01, 59.97, and 59.66 HU, respectively. The gamma passing rate (GPR) (2%/2 mm) of photon plans exceeded 99.8% for all models. The ranking of proton plan GPR (2%/2 mm) was: UGGAN-GC (97.7%), CycleGAN (95.4%), UGGAN (95.2%), Fastcut (93.1%), Pix2pix (90.8%), Unet (89.9%), and Cut (87.7%).
The s4DCT generated by UGGAN-GC demonstrated excellent image quality, characterized by high HU accuracy, structural similarity, and edge detail fidelity, and had the potential to achieve accurate dose calculation and respiratory motion management for online photon and proton therapy plans.
"Generating Synthetic 4DCT From 4DCBCT for Lung Tumor Adaptive Photon and Proton Therapy Using a Unet Attention-Guided CycleGAN With Structure-Consistency Loss," IEEE Transactions on Computational Imaging, vol. 11, pp. 1361–1374.
Pub Date: 2025-09-22. DOI: 10.1109/TCI.2025.3612849
Leon Suarez-Rodriguez;Roman Jacome;Henry Arguello
Designing the physical encoder is crucial for accurate image reconstruction in computational imaging (CI) systems. Currently, these systems are designed using an end-to-end (E2E) optimization approach, where the encoder is represented as a neural network layer and is jointly optimized with the computational decoder. However, the performance of E2E optimization is significantly reduced by the physical constraints imposed on the encoder, such as binarization, light throughput, and the compression ratio. Additionally, since E2E optimization learns the parameters of the encoder by backpropagating the reconstruction error, it does not promote optimal intermediate outputs and suffers from gradient vanishing. To address these limitations, we reinterpret the concept of knowledge distillation (KD)—traditionally used to train smaller neural networks by transferring knowledge from a larger pretrained model—for designing a physically constrained CI system by transferring the knowledge of a pretrained, less-constrained CI system. Our approach involves three steps: First, given the original CI system (student), a teacher system is created by relaxing the constraints on the student’s encoder. Second, the teacher is optimized to solve a less-constrained version of the student’s problem. Third, the teacher guides the training of the highly constrained student through two proposed knowledge transfer functions, targeting both the encoder and the decoder feature space. The proposed method can be applied to any imaging modality, since the relaxation scheme and the loss functions can be adapted according to the physical acquisition and the employed decoder. This approach was validated on three representative CI modalities: magnetic resonance, single-pixel, and compressive spectral imaging. Simulations show that a teacher system with an encoder that has a structure similar to that of the student encoder provides effective guidance.
Our approach achieves significantly improved reconstruction performance and encoder design, outperforming both E2E optimization and traditional non-data-driven encoder designs.
"Distilling Knowledge for Designing Computational Imaging Systems," IEEE Transactions on Computational Imaging, vol. 11, pp. 1306–1319.
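The teacher-to-student transfer described above can be caricatured as a composite training objective. The squared-error penalties and weights below are assumptions for illustration, not the paper's actual knowledge-transfer functions.

```python
import numpy as np

def kd_objective(x_rec, x_true, enc_s, enc_t, dec_s, dec_t,
                 lam_enc=0.1, lam_dec=0.1):
    """Sketch of a teacher-guided objective: reconstruction error plus two
    knowledge-transfer penalties pulling the constrained student's encoder
    outputs and decoder features toward the relaxed teacher's."""
    rec = np.mean((x_rec - x_true) ** 2)
    kt_enc = np.mean((enc_s - enc_t) ** 2)   # encoder-space transfer term
    kt_dec = np.mean((dec_s - dec_t) ** 2)   # decoder-feature transfer term
    return rec + lam_enc * kt_enc + lam_dec * kt_dec

rng = np.random.default_rng(0)
a = rng.normal(size=16)
print(kd_objective(a, a, a, a, a, a))  # 0.0 when student matches teacher exactly
```

In training, the transfer terms supply gradient signal at intermediate layers, which is one way to read the abstract's point about E2E not promoting optimal intermediate outputs.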
This work aims at the precise and efficient computation of the x-ray projection of an image represented by a linear combination of general shifted basis functions that typically overlap. We achieve this with a suitable adaptation of ray tracing, which is one of the most efficient methods to compute line integrals. In our work, the cases in which the image is expressed as a spline are of particular relevance. The proposed implementation is applicable to any projection geometry as it computes the forward and backward operators over a collection of arbitrary lines. We validate our work with experiments in the context of inverse problems for image reconstruction to maximize the image quality for a given resolution of the reconstruction grid.
Youssef Haouchat, Sepand Kashani, Philippe Thévenaz, and Michael Unser, "Generalized Ray Tracing With Basis Functions for Tomographic Projections," IEEE Transactions on Computational Imaging, vol. 11, pp. 1294–1305, 2025-09-18. DOI: 10.1109/TCI.2025.3611590. Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11170459
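For the classical pixel (piecewise-constant) basis, the ray-tracing computation of a line integral can be sketched as follows. The paper's contribution is the generalization to overlapping shifted basis functions such as splines; this toy Siddon-style version handles only the pixel basis.

```python
import numpy as np

def line_integral(img, p0, p1):
    """Line integral of a pixel image along the segment p0 -> p1.
    Pixels are unit squares; img[i, j] covers [i, i+1] x [j, j+1]."""
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    d = p1 - p0
    length = np.hypot(*d)
    # Parameter values in [0, 1] where the segment crosses grid lines.
    ts = [0.0, 1.0]
    for axis, n in zip((0, 1), img.shape):
        if d[axis] != 0.0:
            for k in range(n + 1):
                t = (k - p0[axis]) / d[axis]
                if 0.0 < t < 1.0:
                    ts.append(t)
    ts = np.unique(ts)
    total = 0.0
    for ta, tb in zip(ts[:-1], ts[1:]):
        mid = p0 + 0.5 * (ta + tb) * d    # midpoint identifies the pixel
        i, j = int(np.floor(mid[0])), int(np.floor(mid[1]))
        if 0 <= i < img.shape[0] and 0 <= j < img.shape[1]:
            total += (tb - ta) * length * img[i, j]
    return total

img = np.ones((4, 4))
print(line_integral(img, (-1.0, 2.5), (5.0, 2.5)))  # crosses 4 unit pixels -> 4.0
```

Because this traverses only the pixels the ray actually intersects, cost scales with the ray's length rather than the image size, which is the efficiency argument behind ray tracing.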
Pub Date: 2025-09-15. DOI: 10.1109/TCI.2025.3609974
Yunsong Liu;Debdut Mandal;Congyu Liao;Kawin Setsompop;Justin P. Haldar
We introduce a new algorithm to solve a regularized spatial-spectral image estimation problem. Our approach is based on the linearized alternating direction method of multipliers (LADMM), which is a variation of the popular ADMM algorithm. Although LADMM has existed for some time, it has not been very widely used in the computational imaging literature. This is in part because there are many possible ways of mapping LADMM to a specific optimization problem, and it is nontrivial to find a computationally efficient implementation out of the many competing alternatives. We believe that our proposed implementation represents the first application of LADMM to the type of optimization problem considered in this work (involving a linear-mixture forward model, spatial regularization, and nonnegativity constraints). We evaluate our algorithm in a variety of multiparametric MRI partial volume mapping scenarios (diffusion-relaxation, relaxation-relaxation, relaxometry, and fingerprinting), where we consistently observe substantial ($\sim 3\times$ to $50\times$) speed improvements. We expect this to reduce barriers to using spatially-regularized partial volume compartment mapping methods. Further, the considerable improvements we observed also suggest the potential value of considering LADMM for a broader set of computational imaging problems.
"An Efficient Algorithm for Spatial-Spectral Partial Volume Compartment Mapping With Applications to Multicomponent Diffusion and Relaxation MRI," IEEE Transactions on Computational Imaging, vol. 11, pp. 1283–1293.
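For a flavor of the linearized x-update that distinguishes LADMM from plain ADMM, here is a minimal sketch on a generic sparse recovery problem; the paper's spatial-spectral model with nonnegativity constraints is more involved, and this mapping is only one of the many alternatives the abstract alludes to.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ladmm_lasso(A, b, lam=0.01, rho=1.0, iters=3000):
    """Linearized ADMM for  min_x  lam*||x||_1 + 0.5*||Ax - b||^2,
    with the splitting Ax = z. Linearizing the coupling term turns the
    x-update into a shrinkage step instead of a linear solve."""
    m, n = A.shape
    mu = 1.01 * rho * np.linalg.norm(A, 2) ** 2   # requires mu >= rho*||A||^2
    x, z, u = np.zeros(n), np.zeros(m), np.zeros(m)
    for _ in range(iters):
        grad = rho * A.T @ (A @ x - z + u)
        x = soft_threshold(x - grad / mu, lam / mu)
        z = (b + rho * (A @ x + u)) / (1.0 + rho)   # closed-form z-update
        u = u + A @ x - z                            # scaled dual ascent
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(40, 20)) / np.sqrt(40)
x_true = np.zeros(20); x_true[[3, 7]] = [2.0, -1.5]
b = A @ x_true
x_hat = ladmm_lasso(A, b)
print(np.round(x_hat[[3, 7]], 2))
```

Avoiding the per-iteration linear solve is what makes this kind of scheme attractive when the forward operator is large, which is plausibly where the reported speedups come from.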
Pub Date: 2025-09-15. DOI: 10.1109/TCI.2025.3609957
Qiuchen Zhai;Gregery T. Buzzard;Kevin M. Mertes;Brendt Wohlberg;Charles A. Bouman
Ptychography is an imaging technique that enables nanometer-scale reconstruction of complex transmittance images by scanning objects with overlapping X-ray illumination patterns. However, the illumination function is typically unknown and only partially coherent, which presents challenges for reconstruction. In this paper, we introduce Blind Multi-Mode Projected Multi-Agent Consensus Equilibrium (BM-PMACE) for blind ptychographic reconstruction. BM-PMACE jointly estimates both the complex transmittance image and the multi-modal probe functions associated with a partially coherent probe source. Importantly, BM-PMACE maintains a location-specific probe state that captures spatially varying probe aberrations. Our method also incorporates a dynamic strategy for integrating additional probe modes. Our experiments on synthetic and measured data demonstrate that BM-PMACE outperforms existing approaches in reconstruction quality and convergence rate.
"Ptychography Using Blind Multi-Mode PMACE," IEEE Transactions on Computational Imaging, vol. 11, pp. 1320–1335.
Pub Date: 2025-09-11. DOI: 10.1109/TCI.2025.3608969
Zhijun Zeng;Matej Neumann;Yunan Yang
Conventional frequency-domain full-waveform inversion (FWI) is typically implemented with an $L^{2}$ misfit function, which suffers from challenges such as cycle skipping and sensitivity to noise. While the Wasserstein metric has proven effective in addressing these issues in time-domain FWI, its applicability in frequency-domain FWI is limited due to the complex-valued nature of the data and reduced transport-like dependency on wave speed. To mitigate these challenges, we introduce the HV metric ($d_{\text{HV}}$), inspired by optimal transport theory, which compares signals based on horizontal and vertical changes without requiring the normalization of data. We implement $d_{\text{HV}}$ as the misfit function in frequency-domain FWI and evaluate its performance on synthetic and real-world datasets from seismic imaging and ultrasound computed tomography (USCT). Numerical experiments demonstrate that $d_{\text{HV}}$ outperforms the $L^{2}$ and Wasserstein metrics in scenarios with limited prior model information and high noise while robustly improving inversion results on clinical USCT data.
"Robust Frequency Domain Full-Waveform Inversion via HV-Geometry," IEEE Transactions on Computational Imaging, vol. 11, pp. 1271–1282.
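The cycle-skipping behavior of the $L^{2}$ misfit is easy to demonstrate in one dimension: shifting an oscillatory trace by a full period returns the misfit to zero, so a local optimizer started more than half a cycle away converges to the wrong shift. A toy illustration (of the failure mode only, not of the HV metric itself):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1000, endpoint=False)
f = np.sin(2 * np.pi * 10 * t)               # 10 Hz reference trace

def l2_misfit(shift):
    """L2 misfit between the reference and a time-shifted copy."""
    return float(np.sum((np.sin(2 * np.pi * 10 * (t - shift)) - f) ** 2))

half_cycle, full_cycle = 0.05, 0.1
# Misfit is zero at the true shift AND at a full-period shift: the
# spurious minimum that gradient-based FWI can fall into.
print(l2_misfit(0.0), l2_misfit(half_cycle), l2_misfit(full_cycle))
```

Transport-inspired misfits are designed to widen the basin of attraction around the true shift, which is the motivation the abstract gives for moving beyond $L^{2}$.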
Pub Date: 2025-09-11. DOI: 10.1109/TCI.2025.3608970
Lin Feng;Xinying Wang;Zhixiong Huang;Yining Wang;Jiawen Zhu;Paolo Gamba
Mainstream spectral reconstruction methods typically meticulously design complex and computationally intensive architectures in convolutional neural networks (CNNs) or Transformers to model the mapping from RGB to hyperspectral image (HSI). However, the bottleneck in achieving accurate spectral reconstruction may not lie in model complexity. Direct end-to-end learning on limited training samples struggles to encapsulate discriminative and generalizable feature representations, leading to overfitting and consequently suboptimal reconstruction fidelity. To address these challenges, we propose a new Masked Autoencoder-based Knowledge Transfer network for Spectral Reconstruction from RGB images (MAE-KTSR). MAE-KTSR decouples the feature representation process into a two-stage paradigm, facilitating a holistic comprehension of diverse objects and scenes, thereby enhancing the generalizability of spectral reconstruction. In the first stage, we introduce Spatial-Spectral Masked Autoencoders (S$^{2}$-MAE) to extract discriminative spectral features through masked modeling under constrained spectral conditions. S$^{2}$-MAE reconstructs spectral images from partially masked inputs, learning a generalizable feature representation that provides useful prior knowledge for RGB-to-HSI reconstruction. In the second stage, a lightweight convolutional reconstruction network is deployed to further extract and aggregate local spectral-spatial features. Specifically, an Inter-Stage Feature Fusion module (ISFF) is introduced to effectively exploit the global MAE-based spectral priors learned in the first stage. Experimental results on three spectral reconstruction benchmarks (NTIRE2020-Clean, CAVE, and Harvard) and one real-world hyperspectral dataset (Pavia University) demonstrate the effectiveness of MAE-KTSR. Additionally, MAE-KTSR is experimentally validated to facilitate downstream real-world applications, such as HSI classification.
{"title":"Masked Autoencoder-Based Knowledge Transfer for Spectral Reconstruction From RGB Images","authors":"Lin Feng;Xinying Wang;Zhixiong Huang;Yining Wang;Jiawen Zhu;Paolo Gamba","doi":"10.1109/TCI.2025.3608970","DOIUrl":"https://doi.org/10.1109/TCI.2025.3608970","url":null,"abstract":"Mainstream spectral reconstruction methods typically meticulously design complex and computationally intensive architectures in convolutional neural networks (CNNs) or Transformers to model the mapping from RGB to hyperspectral image (HSI). However, the bottleneck in achieving accurate spectral reconstruction may not lie in model complexity. Direct end-to-end learning on limited training samples struggles to encapsulate discriminative and generalizable feature representations, leading to overfitting and consequently suboptimal reconstruction fidelity. To address these challenges, we propose a new Masked Autoencoder-based Knowledge Transfer network for Spectral Reconstruction from RGB images (MAE-KTSR). MAE-KTSR decouples the feature representation process into a two-stage paradigm, facilitating a holistic comprehension of diverse objects and scenes, thereby enhancing the generalizability of spectral reconstruction. In the first stage, we introduce Spatial-Spectral Masked Autoencoders (S<inline-formula><tex-math>$^{2}$</tex-math></inline-formula>-MAE) to extract discriminative spectral features through masked modeling under constrained spectral conditions. S<inline-formula><tex-math>$^{2}$</tex-math></inline-formula>-MAE reconstructs spectral images from partially masked inputs, learning a generalizable feature representation that provides useful prior knowledge for RGB-to-HSI reconstruction. In the second stage, a lightweight convolutional reconstruction network is deployed to further extract and aggregate local spectral-spatial features. 
Specifically, an Inter-Stage Feature Fusion module (ISFF) is introduced to effectively exploit the global MAE-based spectral priors learned in the first stage. Experimental results on three spectral reconstruction benchmarks (NTIRE2020-Clean, CAVE, and Harvard) and one real-world hyperspectral dataset (Pavia University) demonstrate the effectiveness of MAE-KTSR. Additionally, MAE-KTSR is experimentally validated to facilitate downstream real-world applications, such as HSI classification.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"11 ","pages":"1336-1348"},"PeriodicalIF":4.8,"publicationDate":"2025-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145255909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
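The first-stage pretraining described above hinges on masked modeling: hiding most of the input and asking the network to reconstruct it. As a minimal sketch of the input preparation such S$^{2}$-MAE-style pretraining relies on (the paper's actual masking scheme, patch size of 4, and mask ratio of 0.75 are illustrative assumptions here, not taken from the source):

```python
import numpy as np

def mask_spectral_patches(hsi, patch=4, mask_ratio=0.75, seed=0):
    """Randomly hide non-overlapping spatial patches of a hyperspectral cube.

    hsi: array of shape (H, W, B), with H and W divisible by `patch`.
    Returns the masked cube and a boolean per-patch mask (True = hidden),
    mimicking the inputs an MAE-style encoder would be trained to reconstruct.
    """
    H, W, B = hsi.shape
    gh, gw = H // patch, W // patch
    rng = np.random.default_rng(seed)
    n_hide = int(round(mask_ratio * gh * gw))
    hidden = rng.permutation(gh * gw)[:n_hide]
    mask = np.zeros((gh, gw), dtype=bool)
    mask[np.unravel_index(hidden, (gh, gw))] = True

    masked = hsi.copy()
    for i, j in zip(*np.nonzero(mask)):
        # zero out every spectral band of the hidden spatial patch
        masked[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch, :] = 0.0
    return masked, mask

# toy 8x8 cube with 5 spectral bands -> a 2x2 grid of patches, 3 of 4 hidden
cube = np.random.rand(8, 8, 5)
masked, mask = mask_spectral_patches(cube)
print(mask.mean())  # fraction of patches hidden: 0.75
```

A real pretraining loop would feed `masked` to the encoder and penalize reconstruction error only on the hidden patches; this sketch shows just the data side of that setup.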
Pub Date: 2025-09-08 DOI: 10.1109/TCI.2025.3607150
Daoqi Liu;Tao Shan;Maokun Li;Fan Yang;Shenheng Xu
In this work, we propose a deep learning-based imaging method for addressing the multi-frequency electromagnetic (EM) inverse scattering problem (ISP). By combining deep learning technology with EM computation, we have successfully developed a multi-frequency neural Born iterative method (NeuralBIM), guided by the principles of the single-frequency NeuralBIM. This method integrates multitask learning techniques with NeuralBIM’s efficient iterative inversion process to construct a robust multi-frequency Born iterative inversion model. During training, the model employs a multitask learning approach guided by homoscedastic uncertainty to adaptively allocate the weights of each frequency’s data. Additionally, an unsupervised learning method, constrained by the physics of the ISP, is used to train the multi-frequency NeuralBIM model, eliminating the need for contrast and total field data. The effectiveness of the multi-frequency NeuralBIM is validated through synthetic and experimental data, demonstrating improvements in accuracy and computational efficiency for solving the ISP. Moreover, this method exhibits good generalization capabilities and noise resistance.
{"title":"Multi-Frequency Neural Born Iterative Method for Solving 2-D Inverse Scattering Problems","authors":"Daoqi Liu;Tao Shan;Maokun Li;Fan Yang;Shenheng Xu","doi":"10.1109/TCI.2025.3607150","DOIUrl":"https://doi.org/10.1109/TCI.2025.3607150","url":null,"abstract":"In this work, we propose a deep learning-based imaging method for addressing the multi-frequency electromagnetic (EM) inverse scattering problem (ISP). By combining deep learning technology with EM computation, we have successfully developed a multi-frequency neural Born iterative method (NeuralBIM), guided by the principles of the single-frequency NeuralBIM. This method integrates multitask learning techniques with NeuralBIM’s efficient iterative inversion process to construct a robust multi-frequency Born iterative inversion model. During training, the model employs a multitask learning approach guided by homoscedastic uncertainty to adaptively allocate the weights of each frequency’s data. Additionally, an unsupervised learning method, constrained by the physics of the ISP, is used to train the multi-frequency NeuralBIM model, eliminating the need for contrast and total field data. The effectiveness of the multi-frequency NeuralBIM is validated through synthetic and experimental data, demonstrating improvements in accuracy and computational efficiency for solving the ISP. 
Moreover, this method exhibits good generalization capabilities and noise resistance.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"11 ","pages":"1243-1257"},"PeriodicalIF":4.8,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145090147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
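The multitask weighting the NeuralBIM abstract mentions, guided by homoscedastic uncertainty, is commonly formulated (following Kendall et al.'s uncertainty-weighting scheme) as a per-task learned log-variance $s_i$ giving a total loss $\sum_i e^{-s_i} L_i + s_i$. A minimal numpy sketch, with the per-frequency loss values purely illustrative and not taken from the paper:

```python
import numpy as np

def homoscedastic_total(losses, log_vars):
    """Combine per-frequency losses L_i with learned log-variances s_i:

        total = sum_i exp(-s_i) * L_i + s_i

    A smaller s_i means a larger weight exp(-s_i) on that frequency's loss;
    the additive +s_i term keeps s_i from growing without bound.
    """
    losses = np.asarray(losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    return float(np.sum(np.exp(-log_vars) * losses + log_vars))

# For a fixed L, each term exp(-s) * L + s is minimized at s* = log(L),
# so frequencies with larger (noisier) losses are automatically down-weighted.
L = np.array([0.5, 2.0, 8.0])  # hypothetical per-frequency data misfits
s_star = np.log(L)             # closed-form per-term optimum
print(homoscedastic_total(L, s_star))  # = sum(1 + log L_i) = 3 + log(8)
```

In training, the $s_i$ would be optimized jointly with the network weights rather than set in closed form; the closed-form optimum here just illustrates why the scheme adapts the frequency weights instead of requiring hand tuning.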