Pub Date: 2026-01-26 | DOI: 10.1109/TMI.2026.3658004
Scott S Hsieh, James Day, Xinchen Deng, Magdalena Bazalova-Carter
Ring artifacts in CT are caused by uncalibrated variations in detector pixels and are especially prevalent with emerging photon counting detectors (PCDs). Control of ring artifacts is conventionally accomplished by improving either hardware manufacturing or software correction algorithms. An alternative solution is detector autocalibration, in which two redundant samples of each line integral are acquired and used to dynamically calibrate the PCD. Autocalibration was first proposed by Hounsfield in 1977 and was demonstrated on the EMI Topaz prototype scanner in 1980, but details surrounding this implementation are sparse. We investigate a form of autocalibration that requires just two redundant acquisitions, which could be acquired using a flying focal spot on a clinical scanner but is demonstrated here with a detector shift. We formulated autocalibration as an optimization problem to determine the relative gain factor of each pixel and tested it on scans of a chicken thigh specimen, a resolution phantom, and a cylindrical phantom. Ring artifacts were significantly reduced. Some residual artifacts remained but could not be discriminated from the intrinsic temporal instability of our PCD modules. Autocalibration could facilitate the widespread adoption of photon counting CT by relaxing the ring artifact, thermal management, and stability requirements present today. Demonstration of autocalibration on a rotating gantry with a flying focal spot remains future work.
Title: Ring artifact reduction in photon counting CT using redundant sampling and autocalibration (IEEE Transactions on Medical Imaging)
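The abstract above does not give the paper's actual optimization formulation, but the redundancy it exploits can be illustrated with a toy model (a sketch under simplifying assumptions: log-domain measurements, a noiseless one-pixel detector shift, and additive per-pixel gain offsets):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
t = rng.uniform(1.0, 3.0, n + 1)   # true log-domain line integrals for rays 0..n
c = rng.normal(0.0, 0.05, n)       # unknown per-pixel log-gain offsets
m1 = t[:n] + c                     # acquisition 1: pixel p measures ray p
m2 = t[1 : n + 1] + c              # acquisition 2: shifted detector, pixel p measures ray p+1
# Ray p+1 is seen by pixel p+1 in acquisition 1 and by pixel p in acquisition 2,
# so differencing the redundant samples cancels the line integral entirely:
d = m1[1:] - m2[:-1]               # equals c[1:] - c[:-1]
c_hat = np.concatenate([[0.0], np.cumsum(d)])  # gain offsets relative to pixel 0
assert np.max(np.abs((c - c[0]) - c_hat)) < 1e-9  # recovered up to a global constant
```

With noisy photon counts one would instead solve a least-squares problem over all views; the sketch only shows why two shifted acquisitions determine the relative gains up to a global constant.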
Reconstructing neurons from large electron microscopy (EM) datasets for connectomic analysis presents a significant challenge, particularly in segmenting neurons with complex morphologies. Previous deep learning-based neuron segmentation methods often rely on pixel-level image context and produce extensive oversegmented fragments. Detecting these split errors and merging the split neuron segments are non-trivial for various neurons in a large-scale EM data volume. In this work, we exploit multimodal features in the full workflow of automatic neuron proofreading. We propose a novel connection point detection network that utilizes both global 3D morphological features and high-resolution local image context to extract candidate segment pairs from massive adjacent segments. To effectively fuse the 3D morphological features and the dense image features from very different scales, we design proposal-based image feature sampling to improve the efficiency of multimodal cross-attention. Integrating the connection point detection network with our connectivity prediction network, which also utilizes multimodal features, we build a fully automatic neuron segment merging pipeline that closely imitates human proofreading. Comprehensive experimental results verify the effectiveness of the proposed modules and demonstrate the robustness of the entire pipeline in large-scale neuron reconstruction. The code and data are available at https://github.com/Levishery/Neuron-Segment-Connection-Prediction.
Pub Date: 2026-01-26 | DOI: 10.1109/TMI.2026.3658169
Qihua Chen, Xuejin Chen, Chenxuan Wang, Zhiwei Xiong, Feng Wu
Title: Neuron Segment Connectivity Prediction with Multimodal Features for Connectomics (IEEE Transactions on Medical Imaging)
Pub Date: 2026-01-16 | DOI: 10.1109/TMI.2026.3654585
Jin Liu, Qing Lin, Zhuang Xiong, Shanshan Shan, Chunyi Liu, Min Li, Feng Liu, G Bruce Pike, Hongfu Sun, Yang Gao
Incoherent k-space undersampling and deep learning-based reconstruction methods have shown great success in accelerating MRI. However, the performance of most previous methods degrades dramatically under high acceleration factors, e.g., 8× or higher. Recently, denoising diffusion models (DMs) have demonstrated promising results in solving this issue; however, one major drawback of DM methods is the long inference time caused by the large number of iterative reverse posterior sampling steps. In this work, a Single Step Diffusion Model-based reconstruction framework, namely SSDM-MRI, is proposed for restoring MRI images from highly undersampled k-space. The proposed method achieves one-step reconstruction by first training a conditional DM and then distilling it four times using an iterative selective distillation algorithm, which works synergistically with a shortcut reverse sampling strategy at inference. Comprehensive experiments were carried out on publicly available fastMRI brain and knee images, as well as an in-house multi-echo GRE (QSM) subject. Overall, the results showed that SSDM-MRI outperformed other methods in terms of numerical metrics (e.g., PSNR and SSIM), error maps, image fine details, and latent susceptibility information hidden in MRI phase images. In addition, SSDM-MRI reconstructs a 320×320 brain slice in only 0.45 seconds, comparable to a simple U-net, making it a highly effective solution for MRI reconstruction tasks.
Title: Highly Undersampled MRI Reconstruction via a Single Posterior Sampling of Diffusion Models (IEEE Transactions on Medical Imaging)
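Single-step reconstruction rests on the standard DDPM identity x0 = (xt − √(1−ᾱ)·ε̂)/√ᾱ, which recovers the clean image directly from a noisy sample and a noise estimate. The sketch below uses an oracle noise predictor in place of the trained, distilled network; the names and schedule value are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)
x0 = rng.normal(size=(8, 8))        # stand-in for a clean image
abar = 0.3                          # cumulative noise-schedule value at the chosen step
noise = rng.normal(size=x0.shape)
xt = np.sqrt(abar) * x0 + np.sqrt(1 - abar) * noise  # forward diffusion sample

def eps_hat(xt_in):
    # hypothetical noise predictor; an oracle here so the identity is exact
    return noise

# one-step reconstruction: no iterative reverse posterior sampling
x0_hat = (xt - np.sqrt(1 - abar) * eps_hat(xt)) / np.sqrt(abar)
assert np.allclose(x0_hat, x0)
```

A real predictor is imperfect, which is presumably why SSDM-MRI distills the model and adds a shortcut reverse sampling strategy rather than applying this identity naively.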
Pub Date: 2026-01-01 | DOI: 10.1109/TMI.2025.3650412
Sebastian Rassmann, David Kugler, Christian Ewert, Martin Reuter
While Generative Adversarial Nets (GANs) and Diffusion Models (DMs) have achieved impressive results in natural image synthesis, their core strengths - creativity and realism - can be detrimental in medical applications, where accuracy and fidelity are paramount. These models instead risk introducing hallucinations and replication of unwanted acquisition noise. Here, we propose YODA (You Only Denoise once - or Average), a 2.5D diffusion-based framework for medical image translation (MIT). Consistent with DM theory, we find that conventional diffusion sampling stochastically replicates noise. To mitigate this, we draw and average multiple samples, akin to physical signal averaging. As this effectively approximates the DM's expected value, we term this Expectation-Approximation (ExpA) sampling. We additionally propose regression sampling for YODA, which retains the initial DM prediction and omits iterative refinement to produce noise-free images in a single step. Across five diverse multi-modal datasets - including multi-contrast brain MRI and pelvic MRI-CT - we demonstrate that regression sampling is not only substantially more efficient but also matches or exceeds the image quality of full diffusion sampling even with ExpA. Our results reveal that iterative refinement solely enhances perceptual realism without benefiting information translation, which we confirm in relevant downstream tasks. YODA outperforms eight state-of-the-art DMs and GANs and challenges the presumed superiority of DMs and GANs over computationally cheap regression models for high-quality MIT. Furthermore, we show that YODA-translated images are interchangeable with, or even superior to, physical acquisitions for several medical applications.
Title: Regression is all you need for medical image translation (IEEE Transactions on Medical Imaging)
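The signal-averaging argument behind ExpA can be checked numerically. Assuming, purely for illustration, that each stochastic diffusion sample equals the clean target plus independent noise, averaging N samples shrinks the residual noise by roughly √N:

```python
import numpy as np

rng = np.random.default_rng(2)
target = np.full(1000, 2.0)   # stand-in for the noise-free translation target

def sample_once():
    # one stochastic "diffusion sample": target plus replicated acquisition noise
    return target + rng.normal(0.0, 0.2, target.shape)

n = 64
expa = np.mean([sample_once() for _ in range(n)], axis=0)  # ExpA-style average
resid_single = np.std(sample_once() - target)  # about 0.2
resid_expa = np.std(expa - target)             # about 0.2 / sqrt(64) = 0.025
assert resid_expa < resid_single / 4
```

Regression sampling sidesteps even this averaging cost by keeping the initial deterministic prediction, which is why it approaches the expectation in a single step.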
Pub Date: 2026-01-01 | DOI: 10.1109/TMI.2025.3650126
Yudi Sang, Yanzhen Liu, Sutuke Yibulayimu, Yunning Wang, Benjamin D Killeen, Mingxu Liu, Ping-Cheng Ku, Ole Johannsen, Karol Gotkowski, Maximilian Zenk, Klaus Maier-Hein, Fabian Isensee, Peiyan Yue, Yi Wang, Haidong Yu, Zhaohong Pan, Yutong He, Xiaokun Liang, Daiqi Liu, Fuxin Fan, Artur Jurgas, Andrzej Skalski, Yuxi Ma, Jing Yang, Szymon Plotka, Rafal Litka, Gang Zhu, Yingchun Song, Mathias Unberath, Mehran Armand, Dan Ruan, S Kevin Zhou, Qiyong Cao, Chunpeng Zhao, Xinbao Wu, Yu Wang
The segmentation of pelvic fracture fragments in CT and X-ray images is crucial for trauma diagnosis, surgical planning, and intraoperative guidance. However, accurately and efficiently delineating the bone fragments remains a significant challenge due to complex anatomy and imaging limitations. The PENGWIN challenge, organized as a MICCAI 2024 satellite event, aimed to advance automated fracture segmentation by benchmarking state-of-the-art algorithms on these complex tasks. A diverse dataset of 150 CT scans was collected from multiple clinical centers, and a large set of simulated X-ray images was generated using the DeepDRR method. Final submissions from 16 teams worldwide were evaluated under a rigorous multi-metric testing scheme. The top-performing CT algorithm achieved an average fragment-wise intersection over union (IoU) of 0.930, demonstrating satisfactory accuracy. However, in the X-ray task, the best algorithm achieved an IoU of 0.774, which is promising but not yet sufficient for intraoperative decision-making, reflecting the inherent challenges of fragment overlap in projection imaging. Beyond the quantitative evaluation, the challenge revealed methodological diversity in algorithm design. Variations in instance representation, such as primary-secondary classification versus boundary-core separation, led to differing segmentation strategies. Despite promising results, the challenge also exposed inherent uncertainties in fragment definition, particularly in cases of incomplete fractures. These findings suggest that interactive segmentation approaches, integrating human decision-making with task-relevant information, may be essential for improving model reliability and clinical applicability.
Title: Benchmark of Segmentation Techniques for Pelvic Fracture in CT and X-Ray: Summary of the PENGWIN 2024 Challenge (IEEE Transactions on Medical Imaging)
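For reference, a fragment-wise IoU like the one quoted above can be computed as follows. This is a minimal sketch; the challenge's official implementation, in particular how predicted fragments are matched to ground-truth fragments, may differ, and labels here are assumed pre-matched:

```python
import numpy as np

def fragment_wise_iou(gt, pred):
    """Mean IoU over ground-truth fragment labels (0 = background).
    Assumes predicted labels are already matched to ground-truth labels."""
    ious = []
    for lbl in np.unique(gt):
        if lbl == 0:
            continue
        g, p = gt == lbl, pred == lbl
        ious.append((g & p).sum() / (g | p).sum())
    return float(np.mean(ious))

gt = np.array([[1, 1, 0], [2, 2, 0]])
pred = np.array([[1, 0, 0], [2, 2, 2]])
# fragment 1: IoU 1/2; fragment 2: IoU 2/3; mean = 7/12
assert abs(fragment_wise_iou(gt, pred) - 7 / 12) < 1e-9
```

Averaging per fragment (rather than pooling all foreground pixels) is what makes small fragments count as much as large ones, which matters for comminuted fractures.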
Pub Date: 2025-12-30 | DOI: 10.1109/TMI.2025.3648299
Tianle Zeng, Junlei Hu, Gerardo Loza Galindo, Sharib Ali, Duygu Sarikaya, Pietro Valdastri, Dominic Jones
Computer vision-based technologies significantly enhance surgical automation by advancing tool tracking, detection, and localization. However, current data-driven approaches are data-hungry, requiring large, high-quality labeled image datasets. Our work introduces a novel dynamic Gaussian Splatting technique to address the data scarcity in surgical image datasets. We propose a dynamic Gaussian model to represent dynamic surgical scenes, enabling the rendering of surgical instruments from unseen viewpoints and deformations with real tissue backgrounds. We utilize a dynamic training adjustment strategy to address challenges posed by poorly calibrated camera poses from real-world scenarios. Additionally, we automatically generate annotations for our synthetic data. For evaluation, we constructed a new dataset featuring seven scenes with 14,000 frames of tool and camera motion and tool jaw articulation, with an ex vivo porcine model as background. Using this dataset, we synthetically replicate the scene deformation from the ground truth data, allowing direct comparisons of synthetic image quality. Experimental results illustrate that our method generates photo-realistic labeled image datasets with the highest PSNR (29.87). We further evaluate the performance of medical-specific neural networks trained on real and synthetic images using an unseen real-world image dataset. Our results show that models trained on synthetic images generated by the proposed method outperform those trained with state-of-the-art standard data augmentation by 10%, leading to an overall improvement in model performance of nearly 15%.
Title: NeeCo: Image Synthesis of Novel Instrument States Based on Dynamic and Deformable 3D Gaussian Reconstruction (IEEE Transactions on Medical Imaging)
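The PSNR figure quoted above (29.87 dB) follows the standard definition from mean squared error; a minimal implementation, where `data_range` is assumed to be the peak image value:

```python
import numpy as np

def psnr(ref, img, data_range=1.0):
    # peak signal-to-noise ratio in dB
    mse = np.mean((ref - img) ** 2)
    return 10 * np.log10(data_range ** 2 / mse)

ref_img = np.zeros((8, 8))
test_img = np.full((8, 8), 0.1)   # uniform 0.1 error -> MSE 0.01 -> 20 dB
assert abs(psnr(ref_img, test_img) - 20.0) < 1e-9
```

Higher PSNR means smaller pixel-wise error against the ground-truth render; it says nothing by itself about downstream label quality, which is why the paper also trains networks on the synthetic images.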
Pub Date: 2025-12-29 | DOI: 10.1109/TMI.2025.3648852
Langtao Zhou, Xiaoxia Qu, Tianyu Fu, Jiaoyang Wu, Hong Song, Jingfan Fan, Danni Ai, Deqiang Xiao, Junfang Xian, Jian Yang
Synthesizing missing modalities in multi-parametric MRI (mpMRI) is vital for accurate tumor diagnosis, yet remains challenging due to incomplete acquisitions and modality heterogeneity. Diffusion models have shown strong generative capability, but conventional approaches typically operate in the image domain with high memory costs and often rely solely on noise-space supervision, which limits anatomical fidelity. Latent diffusion models (LDMs) improve efficiency by performing denoising in latent space, but standard LDMs lack explicit structural priors and struggle to integrate multiple modalities effectively. To address these limitations, we propose the anatomy-aware sketch-guided latent diffusion model (ASLDM), a novel LDM-based framework designed for flexible and structure-preserving MRI synthesis. ASLDM incorporates an anatomy-aware feature fusion module, which encodes tumor region masks and edge-based anatomical sketches via cross-attention to guide the denoising process with explicit structure priors. A modality synergistic reconstruction strategy enables the joint modeling of available and missing modalities, enhancing cross-modal consistency and supporting arbitrary missing scenarios. Additionally, we introduce image-level losses for pixel-space supervision using L1 and SSIM losses, overcoming the limitations of pure noise-based loss training and improving the anatomical accuracy of synthesized outputs. Extensive experiments on a five-modality orbital tumor mpMRI private dataset and a four-modality public BraTS2024 dataset demonstrate that ASLDM outperforms state-of-the-art methods in both synthesis quality and structural consistency, showing strong potential for clinically reliable multi-modal MRI completion. Our code is publicly available at: https://github.com/zltshadow/ASLDM.git.
Title: Anatomy-aware Sketch-guided Latent Diffusion Model for Orbital Tumor Multi-Parametric MRI Missing Modalities Synthesis (IEEE Transactions on Medical Imaging)
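The image-level supervision described above combines an L1 term with an SSIM dissimilarity term. The sketch below uses a single-window (global) SSIM rather than the usual windowed SSIM, and the 0.5 weighting is illustrative, not the paper's:

```python
import numpy as np

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    # single-window SSIM over the whole image (a simplification of windowed SSIM)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def image_loss(pred, target, w=0.5):
    # pixel-space supervision: L1 distance plus SSIM dissimilarity
    return (1 - w) * np.abs(pred - target).mean() + w * (1 - global_ssim(pred, target))

x = np.random.default_rng(4).random((32, 32))
assert image_loss(x, x) < 1e-9   # identical images incur zero loss
```

Supervising in pixel space this way complements the noise-space diffusion loss, which is the point the abstract makes about anatomical accuracy.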
Pub Date: 2025-12-29 | DOI: 10.1109/TMI.2025.3649239
Yuntian Bo, Tao Zhou, Zechao Li, Haofeng Zhang, Ling Shao
Cross-domain few-shot medical image segmentation (CD-FSMIS) offers a promising and data-efficient solution for medical applications where annotations are severely scarce and multimodal analysis is required. However, existing methods typically filter out domain-specific information to improve generalization, which inadvertently limits cross-domain performance and degrades source-domain accuracy. To address this, we present Contrastive Graph Modeling (C-Graph), a framework that leverages the structural consistency of medical images as a reliable domain-transferable prior. We represent image features as graphs, with pixels as nodes and semantic affinities as edges. A Structural Prior Graph (SPG) layer is proposed to capture and transfer target-category node dependencies and enable global structure modeling through explicit node interactions. Building upon SPG layers, we introduce a Subgraph Matching Decoding (SMD) mechanism that exploits semantic relations among nodes to guide prediction. Furthermore, we design a Confusion-minimizing Node Contrast (CNC) loss to mitigate node ambiguity and sub-graph heterogeneity by contrastively enhancing node discriminability in the graph space. Our method significantly outperforms prior CD-FSMIS approaches across multiple cross-domain benchmarks, achieving state-of-the-art performance while simultaneously preserving strong segmentation accuracy on the source domain. Our code is available at https://github.com/primebo1/C-Graph.
"Contrastive Graph Modeling for Cross-Domain Few-Shot Medical Image Segmentation." IEEE Transactions on Medical Imaging. Published 2025-12-29.
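The graph construction described in the abstract above — pixels as nodes, semantic affinities as edges — can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name, the cosine-similarity affinity, and the fixed threshold `tau` are all assumptions made here for demonstration.

```python
import numpy as np

def build_affinity_graph(feat, tau=0.5):
    """Build a pixel graph from a feature map.

    feat: (H, W, C) feature map; each pixel becomes a node.
    Edges connect pixel pairs whose cosine similarity exceeds tau.
    Returns node features (N, C) and a boolean adjacency matrix (N, N).
    """
    h, w, c = feat.shape
    nodes = feat.reshape(h * w, c).astype(np.float64)
    unit = nodes / (np.linalg.norm(nodes, axis=1, keepdims=True) + 1e-8)
    affinity = unit @ unit.T          # pairwise cosine similarity
    adj = affinity > tau              # keep only strong semantic affinities
    np.fill_diagonal(adj, False)      # drop self-loops
    return nodes, adj

# toy 4x4 feature map with 3 channels
rng = np.random.default_rng(0)
nodes, adj = build_affinity_graph(rng.standard_normal((4, 4, 3)))
print(nodes.shape, adj.shape)  # (16, 3) (16, 16)
```

In the paper the affinities would be computed from learned features and consumed by the SPG layers; here the adjacency matrix simply makes the "nodes and edges" structure concrete.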
Pub Date : 2025-12-09 | DOI: 10.1109/TMI.2025.3642134
Tengya Peng, Ruyi Zha, Zhen Li, Xiaofeng Liu, Qing Zou
Three-Dimensional Gaussian representation (3DGS) has shown substantial promise in computer vision but remains unexplored in magnetic resonance imaging (MRI). This study explores its potential for reconstructing isotropic-resolution 3D MRI from undersampled k-space data. We introduce a novel framework, 3D Gaussian MRI (3DGSMR), which employs 3D Gaussian distributions as an explicit representation for MR volumes. Experimental evaluations indicate that the method effectively reconstructs voxelized MR images, achieving quality on par with well-established 3D MRI reconstruction techniques in the literature. Notably, 3DGSMR operates under a self-supervised framework, obviating the need for extensive training datasets or prior model training. The approach introduces two key innovations: the adaptation of 3DGS to MRI reconstruction, and a novel application of the existing 3DGS methodology to decompose MR signals, which are complex-valued.
"Three-Dimensional MRI Reconstruction with 3D Gaussian Representations: Tackling the Undersampling Problem." IEEE Transactions on Medical Imaging. Published 2025-12-09.
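The two ingredients named in the abstract above — an explicit volume built from 3D Gaussians, and self-supervision against the acquired (undersampled) k-space samples — can be sketched minimally as follows. This is our illustration of the general idea, not the 3DGSMR implementation: the isotropic Gaussians, the Cartesian FFT forward model, and all function names are assumptions.

```python
import numpy as np

def render_gaussians(means, sigmas, amps, shape):
    """Render isotropic 3D Gaussians onto a voxel grid of the given shape.
    Returns a complex-valued volume (MR signals are complex-valued)."""
    zz, yy, xx = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")
    grid = np.stack([zz, yy, xx], axis=-1).astype(float)   # (Z, Y, X, 3)
    vol = np.zeros(shape, dtype=complex)
    for mu, s, a in zip(means, sigmas, amps):
        d2 = np.sum((grid - mu) ** 2, axis=-1)             # squared distance
        vol += a * np.exp(-d2 / (2.0 * s ** 2))
    return vol

def data_consistency_loss(vol, kspace, mask):
    """Self-supervised objective: the rendered volume's FFT must match the
    acquired k-space samples wherever the undersampling mask is 1."""
    pred_k = np.fft.fftn(vol)
    return float(np.sum(np.abs((pred_k - kspace) * mask) ** 2))

# two Gaussians in an 8^3 volume; retrospectively undersampled k-space
means = np.array([[2.0, 2.0, 2.0], [5.0, 5.0, 5.0]])
vol = render_gaussians(means, sigmas=[1.0, 1.5], amps=[1.0, 0.5 + 0.5j],
                       shape=(8, 8, 8))
kspace = np.fft.fftn(vol)
mask = np.random.default_rng(0).random((8, 8, 8)) < 0.3    # ~30% sampling
loss = data_consistency_loss(vol, kspace, mask)
print(loss)  # 0.0 here, since this k-space was generated from vol itself
```

In an actual reconstruction the Gaussian parameters (means, widths, complex amplitudes) would be optimized by gradient descent to drive this loss toward zero on real measured data, with no training dataset involved.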
Pub Date : 2025-12-02 | DOI: 10.1109/TMI.2025.3639398
Matthew A McCready, Xiaozhi Cao, Kawin Setsompop, John M Pauly, Adam B Kerr
A customizable method (OPTIKS) for designing fast trajectory-constrained gradient waveforms with optimized time-domain properties was developed. Given a specified multidimensional k-space trajectory, the method optimizes traversal speed (and therefore timing) with position along the trajectory. OPTIKS facilitates optimization of objectives dependent on the time-domain gradient waveform and the arc-length-domain k-space speed. OPTIKS is applied to design waveforms which limit peripheral nerve stimulation (PNS), minimize mechanical resonance excitation, and reduce acoustic noise. A variety of trajectory examples are presented, including spirals, circular echo-planar imaging, and rosettes. Design performance is evaluated based on duration, standardized PNS models, field measurements, gradient coil back-EMF measurements, and calibrated acoustic measurements. We show reductions in back-EMF of up to 94% and in field oscillations of up to 91.1%, acoustic noise decreases of up to 9.22 dB, and, with efficient use of PNS models, speed increases of up to 11.4%. The design method implementation is made available as an open-source Python package through GitHub (https://github.com/mamccready/optiks).
"OPTIKS: Optimized Gradient Properties Through Timing in K-Space." IEEE Transactions on Medical Imaging. Published 2025-12-02.
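The core idea in the abstract above — assigning traversal timing to a fixed k-space trajectory — can be illustrated with a minimal time parameterization that respects only the gradient-amplitude limit, using the relation |dk/dt| = γ|G|. This is a sketch under stated assumptions, not the OPTIKS algorithm: slew-rate, PNS, mechanical-resonance, and acoustic objectives (which OPTIKS actually optimizes) are omitted, and the function name and example spiral are ours.

```python
import numpy as np

GAMMA = 42.58e6  # Hz/T, gyromagnetic ratio of 1H

def min_time_parameterization(k, g_max):
    """Assign traversal times to a k-space trajectory (amplitude limit only).

    k: (N, D) trajectory samples in 1/m. Since the k-space speed obeys
    |dk/dt| = gamma * |G|, the fastest admissible speed along the path is
    gamma * g_max; each segment's duration is its arc length / that speed.
    """
    seg = np.linalg.norm(np.diff(k, axis=0), axis=1)   # segment lengths (1/m)
    v_max = GAMMA * g_max                              # max k-space speed (1/m/s)
    dt = seg / v_max
    return np.concatenate([[0.0], np.cumsum(dt)])      # time at each sample (s)

# one turn of a flat Archimedean spiral reaching |k| = 250 1/m,
# with a 40 mT/m gradient amplitude limit
theta = np.linspace(0.0, 2.0 * np.pi, 1000)
r = 250.0 * theta / (2.0 * np.pi)
k = np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1)
t = min_time_parameterization(k, g_max=40e-3)
print(f"readout duration: {t[-1] * 1e3:.3f} ms")
```

OPTIKS additionally varies the speed with position along the trajectory to trade off duration against PNS, vibration, and acoustics; this sketch only shows why timing follows directly from speed along a fixed path.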