PAINT: Prior-Aided Alternate Iterative NeTwork for Ultra-Low-Dose CT Imaging Using Diffusion Model-Restored Sinogram
Pub Date : 2026-02-01 DOI: 10.1109/TMI.2025.3599508 Pages: 434-447
Kaile Chen, Weikang Zhang, Ziheng Deng, Yufu Zhou, Jun Zhao
Obtaining multiple CT scans from the same patient is required in many clinical scenarios, such as lung nodule screening and image-guided radiation therapy. Repeated scans expose patients to a higher radiation dose and increase the risk of cancer. In this study, we aim to achieve ultra-low-dose imaging for subsequent scans by collecting an extremely undersampled sinogram via regional few-view scanning, while preserving image quality by using the preceding fully sampled scan as a prior. To fully exploit the prior information, we propose a two-stage framework consisting of diffusion model-based sinogram restoration and deep learning-based unrolled iterative reconstruction. Specifically, the undersampled sinogram is first restored by a conditional diffusion model with sinogram-domain prior guidance. We then formulate the undersampled data reconstruction problem as an optimization problem that combines fidelity terms for both the undersampled and restored data with a regularization term based on the image-domain prior. Next, we propose the Prior-aided Alternate Iterative NeTwork (PAINT) to solve this optimization problem. PAINT alternately updates the undersampled and restored data fidelity terms, and unrolls the iterations to integrate neural network-based prior regularization. In simulated data experiments with a 112 mm field of view, the proposed framework achieved superior performance in terms of CT value accuracy and preservation of image details. Clinical data experiments also demonstrated that the proposed framework outperformed the comparison methods in artifact reduction and structure recovery.
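A minimal sketch of the alternate-iteration idea described above: an unrolled loop that alternates gradient steps on the measured (undersampled) and restored data fidelity terms, each followed by a learned image-domain prior step. The operator names, network layout, and step size are placeholders for illustration, not the paper's actual architecture.

```python
# Hypothetical sketch of an unrolled alternate-iteration scheme in the spirit of PAINT.
import torch
import torch.nn as nn

class PriorRegularizer(nn.Module):
    """Small CNN acting as a learned prior/regularization step (placeholder)."""
    def __init__(self, channels=2):
        super().__init__()
        # channels = current image estimate + image-domain prior (previous full-dose scan)
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, x, prior_img):
        return x + self.net(torch.cat([x, prior_img], dim=1))  # residual update

def unrolled_paint_like(x0, y_us, y_restored, prior_img, A_us, At_us, A_full, At_full,
                        reg_blocks, step=1e-3):
    """Alternately enforce fidelity to the undersampled data (A_us x ~ y_us) and to the
    diffusion-restored sinogram (A_full x ~ y_restored), with a learned prior step."""
    x = x0
    for k, reg in enumerate(reg_blocks):
        if k % 2 == 0:   # fidelity step on the measured (undersampled) views
            x = x - step * At_us(A_us(x) - y_us)
        else:            # fidelity step on the restored full-view sinogram
            x = x - step * At_full(A_full(x) - y_restored)
        x = reg(x, prior_img)  # image-domain prior regularization
    return x
```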
{"title":"PAINT: Prior-Aided Alternate Iterative NeTwork for Ultra-Low-Dose CT Imaging Using Diffusion Model-Restored Sinogram.","authors":"Kaile Chen, Weikang Zhang, Ziheng Deng, Yufu Zhou, Jun Zhao","doi":"10.1109/TMI.2025.3599508","DOIUrl":"10.1109/TMI.2025.3599508","url":null,"abstract":"<p><p>Obtaining multiple CT scans from the same patient is required in many clinical scenarios, such as lung nodule screening and image-guided radiation therapy. Repeated scans would expose patients to higher radiation dose and increase the risk of cancer. In this study, we aim to achieve ultra-low-dose imaging for subsequent scans by collecting extremely undersampled sinogram via regional few-view scanning, and preserve image quality utilizing the preceding fullsampled scan as prior. To fully exploit prior information, we propose a two-stage framework consisting of diffusion model-based sinogram restoration and deep learning-based unrolled iterative reconstruction. Specifically, the undersampled sinogram is first restored by a conditional diffusion model with sinogram-domain prior guidance. Then, we formulate the undersampled data reconstruction problem as an optimization problem combining fidelity terms for both undersampled and restored data, along with a regularization term based on image-domain prior. Next, we propose Prior-aided Alternate Iterative NeTwork (PAINT) to solve the optimization problem. PAINT alternately updates the undersampled or restored data fidelity term, and unrolls the iterations to integrate neural network-based prior regularization. In the case of 112 mm field of view in simulated data experiments, our proposed framework achieved superior performance in terms of CT value accuracy and image details preservation. Clinical data experiments also demonstrated that our proposed framework outperformed the comparison methods in artifact reduction and structure recovery.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":"434-447"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144877631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ultrasound Autofocusing: Common Midpoint Phase Error Optimization via Differentiable Beamforming
Pub Date : 2026-02-01 DOI: 10.1109/TMI.2025.3607875 Pages: 681-692
Walter Simson, Louise Zhuang, Benjamin N Frey, Sergio J Sanabria, Jeremy J Dahl, Dongwoon Hyun
In ultrasound imaging, propagation of an acoustic wavefront through heterogeneous media causes phase aberrations that degrade the coherence of the reflected wavefront, leading to reduced image resolution and contrast. Adaptive imaging techniques attempt to correct this phase aberration and restore coherence, leading to improved focusing of the image. We propose an autofocusing paradigm for aberration correction in ultrasound imaging by fitting an acoustic velocity field to pressure measurements, via optimization of the common midpoint phase error (CMPE), using a straight-ray wave propagation model for beamforming in diffusely scattering media. We show that CMPE induced by heterogeneous acoustic velocity is a robust measure of phase aberration that can be used for acoustic autofocusing. CMPE is optimized iteratively using a differentiable beamforming approach to simultaneously improve the image focus while estimating the acoustic velocity field of the interrogated medium. The approach relies solely on wavefield measurements using a straight-ray integral solution of the two-way time-of-flight without explicit numerical time-stepping models of wave propagation. We demonstrate method performance through in silico simulations, in vitro phantom measurements, and in vivo mammalian models, showing practical applications in distributed aberration quantification, correction, and velocity estimation for medical ultrasound autofocusing.
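A toy sketch of the autofocusing idea: fit a slowness (1/velocity) grid so that modelled straight-ray two-way times of flight explain measured arrival-time differences between transmit/receive pairs that share a common midpoint, with gradients obtained by automatic differentiation. This is a simplified stand-in for the paper's differentiable-beamforming implementation; the nearest-neighbour ray integral, the function names, and the Adam settings are assumptions.

```python
import torch

def two_way_tof(slowness, dx, tx, focus, rx, n_samples=64):
    """Straight-ray two-way time of flight tx -> focus -> rx (positions in metres,
    slowness on a regular grid with spacing dx)."""
    def leg(a, b):
        t = torch.linspace(0.0, 1.0, n_samples, device=slowness.device)
        pts = a[None, :] * (1.0 - t)[:, None] + b[None, :] * t[:, None]
        idx = (pts / dx).long()
        idx[:, 0] = idx[:, 0].clamp(0, slowness.shape[0] - 1)
        idx[:, 1] = idx[:, 1].clamp(0, slowness.shape[1] - 1)
        # line integral of slowness approximated by sampling along the ray
        return slowness[idx[:, 0], idx[:, 1]].mean() * torch.linalg.norm(b - a)
    return leg(tx, focus) + leg(focus, rx)

def autofocus(slowness0, dx, pairs, focus, measured_lags, n_iter=200, lr=1e-2):
    """pairs: list of ((tx_a, rx_a), (tx_b, rx_b)) element positions sharing a common
    midpoint; measured_lags: measured arrival-time differences for each pair."""
    slowness = slowness0.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([slowness], lr=lr)
    for _ in range(n_iter):
        opt.zero_grad()
        modelled = torch.stack([
            two_way_tof(slowness, dx, ta, focus, ra)
            - two_way_tof(slowness, dx, tb, focus, rb)
            for (ta, ra), (tb, rb) in pairs
        ])
        loss = torch.mean((modelled - measured_lags) ** 2)  # CMPE-like objective
        loss.backward()
        opt.step()
    return slowness.detach()
```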
{"title":"Ultrasound Autofocusing: Common Midpoint Phase Error Optimization via Differentiable Beamforming.","authors":"Walter Simson, Louise Zhuang, Benjamin N Frey, Sergio J Sanabria, Jeremy J Dahl, Dongwoon Hyun","doi":"10.1109/TMI.2025.3607875","DOIUrl":"10.1109/TMI.2025.3607875","url":null,"abstract":"<p><p>In ultrasound imaging, propagation of an acoustic wavefront through heterogeneous media causes phase aberrations that degrade the coherence of the reflected wavefront, leading to reduced image resolution and contrast. Adaptive imaging techniques attempt to correct this phase aberration and restore coherence, leading to improved focusing of the image. We propose an autofocusing paradigm for aberration correction in ultrasound imaging by fitting an acoustic velocity field to pressure measurements, via optimization of the common midpoint phase error (CMPE), using a straight-ray wave propagation model for beamforming in diffusely scattering media. We show that CMPE induced by heterogeneous acoustic velocity is a robust measure of phase aberration that can be used for acoustic autofocusing. CMPE is optimized iteratively using a differentiable beamforming approach to simultaneously improve the image focus while estimating the acoustic velocity field of the interrogated medium. The approach relies solely on wavefield measurements using a straight-ray integral solution of the two-way time-of-flight without explicit numerical time-stepping models of wave propagation. We demonstrate method performance through in silico simulations, in vitro phantom measurements, and in vivo mammalian models, showing practical applications in distributed aberration quantification, correction, and velocity estimation for medical ultrasound autofocusing.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":"681-692"},"PeriodicalIF":0.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145031617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ring artifact reduction in photon counting CT using redundant sampling and autocalibration
Pub Date : 2026-01-26 DOI: 10.1109/TMI.2026.3658004
Scott S Hsieh, James Day, Xinchen Deng, Magdalena Bazalova-Carter
Ring artifacts in CT are caused by uncalibrated variations in detector pixels and are especially prevalent with emerging photon counting detectors (PCDs). Control of ring artifacts is conventionally accomplished by improving either hardware manufacturing or software correction algorithms. An alternative solution is detector autocalibration, in which two redundant samples of each line integral are acquired and used to dynamically calibrate the PCD. Autocalibration was first proposed by Hounsfield in 1977 and was demonstrated on the EMI Topaz prototype scanner in 1980, but details surrounding this implementation are sparse. We investigate a form of autocalibration that requires just two redundant acquisitions; these could be obtained using a flying focal spot on a clinical scanner, but are demonstrated here with a detector shift. We formulated autocalibration as an optimization problem to determine the relative gain factor of each pixel and tested it on scans of a chicken thigh specimen, a resolution phantom, and a cylindrical phantom. Ring artifacts were significantly reduced. Some residual artifacts remained but could not be discriminated from the intrinsic temporal instability of our PCD modules. Autocalibration could facilitate the widespread adoption of photon counting CT by reducing the ring artifact, thermal management, and stability requirements present today. Demonstration of autocalibration on a rotating gantry with a flying focal spot remains future work.
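To illustrate the kind of optimization involved, here is a sketch that estimates a per-pixel log-gain offset from two redundant acquisitions related by a detector shift, so that corrected log-signals from the two scans agree. Casting this as a sparse linear least-squares problem is an assumption made for illustration; the paper's exact objective and constraints are not given in the abstract.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def estimate_log_gains(log_y1, log_y2, shift):
    """log_y1, log_y2: (n_views, n_pixels) log-signals from the two acquisitions,
    where pixel p in scan 1 and pixel p + shift in scan 2 see the same line integral."""
    n_views, n_pix = log_y1.shape
    n_rows = n_views * (n_pix - shift)
    A = lil_matrix((n_rows, n_pix))
    rhs = []
    r = 0
    for v in range(n_views):
        for p in range(n_pix - shift):
            # (log_y1[v, p] - g[p]) - (log_y2[v, p + shift] - g[p + shift]) = 0
            # => g[p] - g[p + shift] = log_y1[v, p] - log_y2[v, p + shift]
            A[r, p] = 1.0
            A[r, p + shift] = -1.0
            rhs.append(log_y1[v, p] - log_y2[v, p + shift])
            r += 1
    g = lsqr(A.tocsr(), np.asarray(rhs))[0]   # solution is defined up to a constant
    return g - g.mean()                        # fix the gauge with a zero-mean constraint
```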
{"title":"Ring artifact reduction in photon counting CT using redundant sampling and autocalibration.","authors":"Scott S Hsieh, James Day, Xinchen Deng, Magdalena Bazalova-Carter","doi":"10.1109/TMI.2026.3658004","DOIUrl":"https://doi.org/10.1109/TMI.2026.3658004","url":null,"abstract":"<p><p>Ring artifacts in CT are caused by uncalibrated variations in detector pixels and are especially prevalent with emerging photon counting detectors (PCDs). Control of ring artifacts is conventionally accomplished by improving either hardware manufacturing or software correction algorithms. An alternative solution is detector autocalibration, in which two redundant samples of each line integral are acquired and used to dynamically calibrate the PCD. Autocalibration was first proposed by Hounsfield in 1977 and was demonstrated on the EMI Topaz prototype scanner in 1980, but details surrounding this implementation are sparse. We investigate a form of autocalibration that requires just two redundant acquisitions, which could be acquired using flying focal spot on a clinical scanner but is demonstrated here with a detector shift. We formulated autocalibration as an optimization problem to determine the relative gain factor of each pixel and tested it on scans of a chicken thigh specimen, resolution phantom, and a cylindrical phantom. Ring artifacts were significantly reduced. Some residual artifacts remained but could not be discriminated from the intrinsic temporal instability of our PCD modules. Autocalibration could facilitate the adoption of widespread photon counting CT by reducing ring artifacts, thermal management requirements, or stability requirements that are present today. Demonstration of autocalibration on a rotating gantry with flying focal spot remains future work.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146055735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neuron Segment Connectivity Prediction with Multimodal Features for Connectomics
Pub Date : 2026-01-26 DOI: 10.1109/TMI.2026.3658169
Qihua Chen, Xuejin Chen, Chenxuan Wang, Zhiwei Xiong, Feng Wu
Reconstructing neurons from large electron microscopy (EM) datasets for connectomic analysis presents a significant challenge, particularly in segmenting neurons with complex morphologies. Previous deep learning-based neuron segmentation methods often rely on pixel-level image context and produce extensive oversegmented fragments. Detecting these split errors and merging the split neuron segments is non-trivial for the varied neurons in a large-scale EM data volume. In this work, we exploit multimodal features in the full workflow of automatic neuron proofreading. We propose a novel connection point detection network that utilizes both global 3D morphological features and high-resolution local image context to extract candidate segment pairs from massive numbers of adjacent segments. To effectively fuse the 3D morphological features with dense image features from very different scales, we design proposal-based image feature sampling to improve the efficiency of multimodal cross-attention. By integrating the connection point detection network with our connectivity prediction network, which also utilizes multimodal features, we build a fully automatic neuron segment merging pipeline that closely imitates human proofreading. Comprehensive experimental results verify the effectiveness of the proposed modules and demonstrate the robustness of the entire pipeline in large-scale neuron reconstruction. The code and data are available at https://github.com/Levishery/Neuron-Segment-Connection-Prediction.
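An illustrative sketch of fusing a global 3D morphology embedding with locally sampled image features via cross-attention, loosely following the proposal-based sampling idea above. The tensor shapes, sampling scheme, and attention configuration are assumptions, not the paper's actual module.

```python
import torch
import torch.nn as nn

class ProposalCrossAttentionFusion(nn.Module):
    def __init__(self, dim=256, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def sample_image_tokens(self, feat_vol, proposal_xyz):
        """feat_vol: (B, C, D, H, W) dense image features; proposal_xyz: (B, K, 3)
        normalized coordinates in [-1, 1] around candidate connection points."""
        grid = proposal_xyz.view(proposal_xyz.shape[0], -1, 1, 1, 3)     # (B, K, 1, 1, 3)
        sampled = torch.nn.functional.grid_sample(
            feat_vol, grid, align_corners=True)                          # (B, C, K, 1, 1)
        return sampled.squeeze(-1).squeeze(-1).transpose(1, 2)           # (B, K, C)

    def forward(self, morph_tokens, feat_vol, proposal_xyz):
        """morph_tokens: (B, N, C) global 3D morphology features used as queries."""
        img_tokens = self.sample_image_tokens(feat_vol, proposal_xyz)    # keys/values
        fused, _ = self.attn(morph_tokens, img_tokens, img_tokens)
        return self.norm(morph_tokens + fused)
```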
{"title":"Neuron Segment Connectivity Prediction with Multimodal Features for Connectomics.","authors":"Qihua Chen, Xuejin Chen, Chenxuan Wang, Zhiwei Xiong, Feng Wu","doi":"10.1109/TMI.2026.3658169","DOIUrl":"https://doi.org/10.1109/TMI.2026.3658169","url":null,"abstract":"<p><p>Reconstructing neurons from large electron microscopy (EM) datasets for connectomic analysis presents a significant challenge, particularly in segmenting neurons of complex morphologies. Previous deep learning-based neuron segmentation methods often rely on pixel-level image context and produce extensive oversegmented fragments. Detecting these split errors and merging the split neuron segments are non-trivial for various neurons in a large-scale EM data volume. In this work, we exploit multimodal features in the full workflow of automatic neuron proofreading. We propose a novel connection point detection network that utilizes both global 3D morphological features and high-resolution local image context to extract candidate segment pairs from massive adjacent segments. To effectively fuse the 3D morphological feature and the dense image features from very different scales, we design a proposal-based image feature sampling to improve the efficiency of multimodal cross-attentions. Integrating the connection point detection network with our connectivity prediction network which also utilizes multimodal features, we make a fully automatic neuron segment merging pipeline, closely imitating human proofreading. Comprehensive experimental results verify the effectiveness of the proposed modules and demonstrate the robustness of the entire pipeline in large-scale neuron reconstruction. The code and data are available at https://github.com/Levishery/ Neuron-Segment-Connection-Prediction.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146055699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Highly Undersampled MRI Reconstruction via a Single Posterior Sampling of Diffusion Models
Pub Date : 2026-01-16 DOI: 10.1109/TMI.2026.3654585
Jin Liu, Qing Lin, Zhuang Xiong, Shanshan Shan, Chunyi Liu, Min Li, Feng Liu, G Bruce Pike, Hongfu Sun, Yang Gao
Incoherent k-space undersampling and deep learning-based reconstruction methods have shown great success in accelerating MRI. However, the performance of most previous methods degrades dramatically under high acceleration factors, e.g., 8× or higher. Recently, denoising diffusion models (DMs) have demonstrated promising results in addressing this issue; however, one major drawback of DM-based methods is the long inference time caused by the large number of iterative reverse posterior sampling steps. In this work, a Single-Step Diffusion Model-based reconstruction framework, named SSDM-MRI, is proposed for restoring MRI images from highly undersampled k-space. The proposed method achieves one-step reconstruction by first training a conditional DM and then distilling this model four times with an iterative selective distillation algorithm, which works synergistically with a shortcut reverse sampling strategy at inference. Comprehensive experiments were carried out on publicly available fastMRI brain and knee images, as well as an in-house multi-echo GRE (QSM) subject. Overall, the results showed that SSDM-MRI outperformed other methods in terms of numerical metrics (e.g., PSNR and SSIM), error maps, fine image details, and the latent susceptibility information hidden in MRI phase images. In addition, SSDM-MRI reconstructs a 320×320 brain slice in only 0.45 seconds, comparable to a simple U-Net, making it a highly effective solution for MRI reconstruction tasks.
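A sketch of what a one-step ("shortcut") reverse sampling pass can look like with a conditional diffusion model: a single network call predicts the noise at a chosen timestep, and the clean image is recovered in closed form under the standard DDPM epsilon-prediction parameterization. The model interface, the choice of starting the pass from a diffused zero-filled reconstruction, and the omission of the distillation stage are assumptions for illustration.

```python
import torch

@torch.no_grad()
def one_step_reconstruction(model, zero_filled, alphas_cumprod, t):
    """model(x_t, t, cond) -> predicted noise; zero_filled: (B, 1, H, W) conditioning
    image from the undersampled k-space; alphas_cumprod: (T,) DDPM schedule tensor."""
    a_bar = alphas_cumprod[t]
    noise = torch.randn_like(zero_filled)
    # Diffuse the conditioning image to timestep t (one possible choice of start point).
    x_t = a_bar.sqrt() * zero_filled + (1.0 - a_bar).sqrt() * noise
    t_batch = torch.full((zero_filled.shape[0],), t, dtype=torch.long)
    eps = model(x_t, t_batch, zero_filled)
    # Closed-form x0 estimate under the epsilon-prediction parameterization.
    x0_hat = (x_t - (1.0 - a_bar).sqrt() * eps) / a_bar.sqrt()
    return x0_hat.clamp(-1.0, 1.0)
```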
{"title":"Highly Undersampled MRI Reconstruction via a Single Posterior Sampling of Diffusion Models.","authors":"Jin Liu, Qing Lin, Zhuang Xiong, Shanshan Shan, Chunyi Liu, Min Li, Feng Liu, G Bruce Pike, Hongfu Sun, Yang Gao","doi":"10.1109/TMI.2026.3654585","DOIUrl":"https://doi.org/10.1109/TMI.2026.3654585","url":null,"abstract":"<p><p>Incoherent k-space undersampling and deep learning-based reconstruction methods have shown great success in accelerating MRI. However, the performance of most previous methods will degrade dramatically under high acceleration factors, e.g., 8× or higher. Recently, denoising diffusion models (DM) have demonstrated promising results in solving this issue; however, one major drawback of the DM methods is the long inference time due to a dramatic number of iterative reverse posterior sampling steps. In this work, a Single Step Diffusion Model-based reconstruction framework, namely SSDM-MRI, is proposed for restoring MRI images from highly undersampled k-space. The proposed method achieves one-step reconstruction by first training a conditional DM and then iteratively distilling this model four times using an iterative selective distillation algorithm, which works synergistically with a shortcut reverse sampling strategy for model inference. Comprehensive experiments were carried out on both publicly available fastMRI brain and knee images, as well as an in-house multi-echo GRE (QSM) subject. Overall, the results showed that SSDM-MRI outperformed other methods in terms of numerical metrics (e.g., PSNR and SSIM), error maps, image fine details, and latent susceptibility information hidden in MRI phase images. In addition, the reconstruction time for a 320×320 brain slice of SSDM-MRI is only 0.45 second, which is only comparable to that of a simple U-net, making it a highly effective solution for MRI reconstruction tasks.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145991983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Regression is all you need for medical image translation
Pub Date : 2026-01-01 DOI: 10.1109/TMI.2025.3650412
Sebastian Rassmann, David Kugler, Christian Ewert, Martin Reuter
While Generative Adversarial Nets (GANs) and Diffusion Models (DMs) have achieved impressive results in natural image synthesis, their core strengths - creativity and realism - can be detrimental in medical applications, where accuracy and fidelity are paramount. These models instead risk introducing hallucinations and replication of unwanted acquisition noise. Here, we propose YODA (You Only Denoise once - or Average), a 2.5D diffusion-based framework for medical image translation (MIT). Consistent with DM theory, we find that conventional diffusion sampling stochastically replicates noise. To mitigate this, we draw and average multiple samples, akin to physical signal averaging. As this effectively approximates the DM's expected value, we term this Expectation-Approximation (ExpA) sampling. We additionally propose regression sampling YODA, which retains the initial DM prediction and omits iterative refinement to produce noise-free images in a single step. Across five diverse multi-modal datasets - including multi-contrast brain MRI and pelvic MRI-CT - we demonstrate that regression sampling is not only substantially more efficient but also matches or exceeds image quality of full diffusion sampling even with ExpA. Our results reveal that iterative refinement solely enhances perceptual realism without benefiting information translation, which we confirm in relevant downstream tasks. YODA outperforms eight state-of-the-art DMs and GANs and challenges the presumed superiority of DMs and GANs over computationally cheap regression models for high-quality MIT. Furthermore, we show that YODA-translated images are interchangeable with, or even superior to, physical acquisitions for several medical applications.
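A short sketch contrasting the two sampling modes described above: ExpA sampling averages several full diffusion samples to approximate the model's expected value, while regression sampling keeps the first clean-image prediction and skips iterative refinement. `run_full_diffusion` and `predict_x0_once` are assumed interfaces to a trained conditional diffusion model, not the authors' actual API.

```python
import torch

def expa_sample(run_full_diffusion, source_image, n_samples=8):
    """Expectation-Approximation: average multiple stochastic translations."""
    samples = [run_full_diffusion(source_image) for _ in range(n_samples)]
    return torch.stack(samples, dim=0).mean(dim=0)

def regression_sample(predict_x0_once, source_image):
    """Regression sampling: a single deterministic prediction, no iterative refinement."""
    return predict_x0_once(source_image)
```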
{"title":"Regression is all you need for medical image translation.","authors":"Sebastian Rassmann, David Kugler, Christian Ewert, Martin Reuter","doi":"10.1109/TMI.2025.3650412","DOIUrl":"https://doi.org/10.1109/TMI.2025.3650412","url":null,"abstract":"<p><p>While Generative Adversarial Nets (GANs) and Diffusion Models (DMs) have achieved impressive results in natural image synthesis, their core strengths - creativity and realism - can be detrimental in medical applications, where accuracy and fidelity are paramount. These models instead risk introducing hallucinations and replication of unwanted acquisition noise. Here, we propose YODA (You Only Denoise once - or Average), a 2.5D diffusion-based framework for medical image translation (MIT). Consistent with DM theory, we find that conventional diffusion sampling stochastically replicates noise. To mitigate this, we draw and average multiple samples, akin to physical signal averaging. As this effectively approximates the DM's expected value, we term this Expectation-Approximation (ExpA) sampling. We additionally propose regression sampling YODA, which retains the initial DM prediction and omits iterative refinement to produce noise-free images in a single step. Across five diverse multi-modal datasets - including multi-contrast brain MRI and pelvic MRI-CT - we demonstrate that regression sampling is not only substantially more efficient but also matches or exceeds image quality of full diffusion sampling even with ExpA. Our results reveal that iterative refinement solely enhances perceptual realism without benefiting information translation, which we confirm in relevant downstream tasks. YODA outperforms eight state-of-the-art DMs and GANs and challenges the presumed superiority of DMs and GANs over computationally cheap regression models for high-quality MIT. Furthermore, we show that YODA-translated images are interchangeable with, or even superior to, physical acquisitions for several medical applications.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145890774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Benchmark of Segmentation Techniques for Pelvic Fracture in CT and X-Ray: Summary of the PENGWIN 2024 Challenge
Pub Date : 2026-01-01 DOI: 10.1109/TMI.2025.3650126
Yudi Sang, Yanzhen Liu, Sutuke Yibulayimu, Yunning Wang, Benjamin D Killeen, Mingxu Liu, Ping-Cheng Ku, Ole Johannsen, Karol Gotkowski, Maximilian Zenk, Klaus Maier-Hein, Fabian Isensee, Peiyan Yue, Yi Wang, Haidong Yu, Zhaohong Pan, Yutong He, Xiaokun Liang, Daiqi Liu, Fuxin Fan, Artur Jurgas, Andrzej Skalski, Yuxi Ma, Jing Yang, Szymon Plotka, Rafal Litka, Gang Zhu, Yingchun Song, Mathias Unberath, Mehran Armand, Dan Ruan, S Kevin Zhou, Qiyong Cao, Chunpeng Zhao, Xinbao Wu, Yu Wang
The segmentation of pelvic fracture fragments in CT and X-ray images is crucial for trauma diagnosis, surgical planning, and intraoperative guidance. However, accurately and efficiently delineating the bone fragments remains a significant challenge due to complex anatomy and imaging limitations. The PENGWIN challenge, organized as a MICCAI 2024 satellite event, aimed to advance automated fracture segmentation by benchmarking state-of-the-art algorithms on these complex tasks. A diverse dataset of 150 CT scans was collected from multiple clinical centers, and a large set of simulated X-ray images was generated using the DeepDRR method. Final submissions from 16 teams worldwide were evaluated under a rigorous multi-metric testing scheme. The top-performing CT algorithm achieved an average fragment-wise intersection over union (IoU) of 0.930, demonstrating satisfactory accuracy. However, in the X-ray task, the best algorithm achieved an IoU of 0.774, which is promising but not yet sufficient for intra-operative decision-making, reflecting the inherent challenges of fragment overlap in projection imaging. Beyond the quantitative evaluation, the challenge revealed methodological diversity in algorithm design. Variations in instance representation, such as primary-secondary classification versus boundary-core separation, led to differing segmentation strategies. Despite promising results, the challenge also exposed inherent uncertainties in fragment definition, particularly in cases of incomplete fractures. These findings suggest that interactive segmentation approaches, integrating human decision-making with task-relevant information, may be essential for improving model reliability and clinical applicability.
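For readers unfamiliar with instance-level scoring, the following is one plausible way to compute a fragment-wise IoU: match predicted fragments to ground-truth fragments with the Hungarian algorithm on pairwise IoU and average the matched scores. The challenge's official metric definition may differ (for example, in how unmatched fragments are penalized), so this is an illustrative convention only.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def fragment_wise_iou(pred_labels, gt_labels):
    """pred_labels, gt_labels: integer instance maps of equal shape (0 = background)."""
    pred_ids = [i for i in np.unique(pred_labels) if i != 0]
    gt_ids = [i for i in np.unique(gt_labels) if i != 0]
    iou = np.zeros((len(gt_ids), len(pred_ids)))
    for a, g in enumerate(gt_ids):
        g_mask = gt_labels == g
        for b, p in enumerate(pred_ids):
            p_mask = pred_labels == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            iou[a, b] = inter / union if union > 0 else 0.0
    rows, cols = linear_sum_assignment(-iou)   # maximize total IoU over the matching
    matched = iou[rows, cols]
    # Unmatched ground-truth or predicted fragments contribute zero under this convention.
    n = max(len(gt_ids), len(pred_ids))
    return matched.sum() / n if n > 0 else 1.0
```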
{"title":"Benchmark of Segmentation Techniques for Pelvic Fracture in CT and X-Ray: Summary of the PENGWIN 2024 Challenge.","authors":"Yudi Sang, Yanzhen Liu, Sutuke Yibulayimu, Yunning Wang, Benjamin D Killeen, Mingxu Liu, Ping-Cheng Ku, Ole Johannsen, Karol Gotkowski, Maximilian Zenk, Klaus Maier-Hein, Fabian Isensee, Peiyan Yue, Yi Wang, Haidong Yu, Zhaohong Pan, Yutong He, Xiaokun Liang, Daiqi Liu, Fuxin Fan, Artur Jurgas, Andrzej Skalski, Yuxi Ma, Jing Yang, Szymon Plotka, Rafal Litka, Gang Zhu, Yingchun Song, Mathias Unberath, Mehran Armand, Dan Ruan, S Kevin Zhou, Qiyong Cao, Chunpeng Zhao, Xinbao Wu, Yu Wang","doi":"10.1109/TMI.2025.3650126","DOIUrl":"https://doi.org/10.1109/TMI.2025.3650126","url":null,"abstract":"<p><p>The segmentation of pelvic fracture fragments in CT and X-ray images is crucial for trauma diagnosis, surgical planning, and intraoperative guidance. However, accurately and efficiently delineating the bone fragments remains a significant challenge due to complex anatomy and imaging limitations. The PENGWIN challenge, organized as a MICCAI 2024 satellite event, aimed to advance automated fracture segmentation by benchmarking state-of-the-art algorithms on these complex tasks. A diverse dataset of 150 CT scans was collected from multiple clinical centers, and a large set of simulated X-ray images was generated using the DeepDRR method. Final submissions from 16 teams worldwide were evaluated under a rigorous multi-metric testing scheme. The top-performing CT algorithm achieved an average fragment-wise intersection over union (IoU) of 0.930, demonstrating satisfactory accuracy. However, in the X-ray task, the best algorithm achieved an IoU of 0.774, which is promising but not yet sufficient for intra-operative decision-making, reflecting the inherent challenges of fragment overlap in projection imaging. Beyond the quantitative evaluation, the challenge revealed methodological diversity in algorithm design. Variations in instance representation, such as primary-secondary classification versus boundary-core separation, led to differing segmentation strategies. Despite promising results, the challenge also exposed inherent uncertainties in fragment definition, particularly in cases of incomplete fractures. These findings suggest that interactive segmentation approaches, integrating human decision-making with task-relevant information, may be essential for improving model reliability and clinical applicability.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145890767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NeeCo: Image Synthesis of Novel Instrument States Based on Dynamic and Deformable 3D Gaussian Reconstruction
Pub Date : 2025-12-30 DOI: 10.1109/TMI.2025.3648299
Tianle Zeng, Junlei Hu, Gerardo Loza Galindo, Sharib Ali, Duygu Sarikaya, Pietro Valdastri, Dominic Jones
Computer vision-based technologies significantly enhance surgical automation by advancing tool tracking, detection, and localization. However, current data-driven approaches are data-hungry, requiring large, high-quality labeled image datasets. Our work introduces a novel dynamic Gaussian Splatting technique to address the data scarcity in surgical image datasets. We propose a dynamic Gaussian model to represent dynamic surgical scenes, enabling the rendering of surgical instruments from unseen viewpoints and deformations with real tissue backgrounds. We use a dynamic training adjustment strategy to address the challenges posed by poorly calibrated camera poses in real-world scenarios, and we automatically generate annotations for our synthetic data. For evaluation, we constructed a new dataset featuring seven scenes with 14,000 frames of tool and camera motion and tool jaw articulation, against the background of an ex vivo porcine model. Using this dataset, we synthetically replicate the scene deformation from the ground-truth data, allowing direct comparison of synthetic image quality. Experimental results show that our method generates photo-realistic labeled image datasets with the highest PSNR (29.87). We further evaluate the performance of medical-specific neural networks trained on real and synthetic images using an unseen real-world image dataset. Models trained on synthetic images generated by the proposed method outperform those trained with state-of-the-art standard data augmentation by 10%, leading to an overall improvement in model performance of nearly 15%.
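A minimal sketch of the "dynamic Gaussian" idea: a small time-conditioned MLP predicts per-Gaussian offsets that deform a canonical 3D Gaussian set to a given frame. The network layout and the choice of deformed attributes are assumptions; the actual NeeCo pipeline (rendering, pose adjustment, label generation) is not reproduced here.

```python
import torch
import torch.nn as nn

class GaussianDeformation(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3 + 4),   # position offset + rotation (quaternion)
        )

    def forward(self, means, t):
        """means: (N, 3) canonical Gaussian centres; t: scalar time in [0, 1].
        Returns deformed centres and a unit quaternion per Gaussian to be composed
        with the canonical rotation."""
        t_col = torch.full((means.shape[0], 1), float(t), device=means.device)
        out = self.mlp(torch.cat([means, t_col], dim=1))
        d_xyz, rot = out[:, :3], out[:, 3:]
        return means + d_xyz, torch.nn.functional.normalize(rot, dim=1)
```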
{"title":"NeeCo: Image Synthesis of Novel Instrument States Based on Dynamic and Deformable 3D Gaussian Reconstruction.","authors":"Tianle Zeng, Junlei Hu, Gerardo Loza Galindo, Sharib Ali, Duygu Sarikaya, Pietro Valdastri, Dominic Jones","doi":"10.1109/TMI.2025.3648299","DOIUrl":"https://doi.org/10.1109/TMI.2025.3648299","url":null,"abstract":"<p><p>Computer vision-based technologies significantly enhance surgical automation by advancing tool tracking, detection, and localization. However, Current data-driven approaches are data-voracious, requiring large, high-quality labeled image datasets. Our Work introduces a novel dynamic Gaussian Splatting technique to address the data scarcity in surgical image datasets. We propose a dynamic Gaussian model to represent dynamic surgical scenes, enabling the rendering of surgical instruments from unseen viewpoints and deformations with real tissue backgrounds. We utilize a dynamic training adjustment strategy to address challenges posed by poorly calibrated camera poses from real-world scenarios. Additionally, automatically generate annotations for our synthetic data. For evaluation, we constructed a new dataset featuring seven scenes with 14,000 frames of tool and camera motion and tool jaw articulation, with a background of an exvivo porcine model. Using this dataset, we synthetically replicate the scene deformation from the ground truth data, allowing direct comparisons of synthetic image quality. Experimental results illustrate that our method generates photo-realistic labeled image datasets with the highest PSNR (29.87). We further evaluate the performance of medical-specific neural networks trained on real and synthetic images using an unseen real-world image dataset. Our results show that the performance of models trained on synthetic images generated by the proposed method outperforms those trained with state-of-the-art standard data augmentation by 10%, leading to an overall improvement in model performances by nearly 15%.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145866783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anatomy-aware Sketch-guided Latent Diffusion Model for Orbital Tumor Multi-Parametric MRI Missing Modalities Synthesis
Pub Date : 2025-12-29 DOI: 10.1109/TMI.2025.3648852
Langtao Zhou, Xiaoxia Qu, Tianyu Fu, Jiaoyang Wu, Hong Song, Jingfan Fan, Danni Ai, Deqiang Xiao, Junfang Xian, Jian Yang
Synthesizing missing modalities in multi-parametric MRI (mpMRI) is vital for accurate tumor diagnosis, yet remains challenging due to incomplete acquisitions and modality heterogeneity. Diffusion models have shown strong generative capability, but conventional approaches typically operate in the image domain with high memory costs and often rely solely on noise-space supervision, which limits anatomical fidelity. Latent diffusion models (LDMs) improve efficiency by performing denoising in latent space, but standard LDMs lack explicit structural priors and struggle to integrate multiple modalities effectively. To address these limitations, we propose the anatomy-aware sketch-guided latent diffusion model (ASLDM), a novel LDM-based framework designed for flexible and structure-preserving MRI synthesis. ASLDM incorporates an anatomy-aware feature fusion module, which encodes tumor region masks and edge-based anatomical sketches via cross-attention to guide the denoising process with explicit structure priors. A modality synergistic reconstruction strategy enables the joint modeling of available and missing modalities, enhancing cross-modal consistency and supporting arbitrary missing scenarios. Additionally, we introduce image-level losses for pixel-space supervision using L1 and SSIM losses, overcoming the limitations of pure noise-based loss training and improving the anatomical accuracy of synthesized outputs. Extensive experiments on a five-modality orbital tumor mpMRI private dataset and a four-modality public BraTS2024 dataset demonstrate that ASLDM outperforms state-of-the-art methods in both synthesis quality and structural consistency, showing strong potential for clinically reliable multi-modal MRI completion. Our code is publicly available at: https://github.com/zltshadow/ASLDM.git.
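A sketch of the hybrid training objective described above: the usual noise-prediction loss in latent space plus pixel-space L1 and SSIM terms computed on a decoded estimate of the clean image. The decoder interface, the closed-form x0 estimate, the uniform-window SSIM, and the loss weights are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2, win=7):
    """Simple uniform-window SSIM for images scaled to roughly [0, 1]."""
    mu_x = F.avg_pool2d(x, win, 1, win // 2)
    mu_y = F.avg_pool2d(y, win, 1, win // 2)
    var_x = F.avg_pool2d(x * x, win, 1, win // 2) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, win, 1, win // 2) - mu_y ** 2
    cov = F.avg_pool2d(x * y, win, 1, win // 2) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return (num / den).mean()

def asldm_style_loss(eps_pred, eps_true, z_t, a_bar_t, decoder, target_img,
                     w_l1=1.0, w_ssim=1.0):
    """Noise-space MSE plus image-level L1/SSIM supervision on the decoded x0 estimate.
    a_bar_t: cumulative alpha at the sampled timestep (tensor)."""
    loss_noise = F.mse_loss(eps_pred, eps_true)
    z0_hat = (z_t - (1.0 - a_bar_t).sqrt() * eps_pred) / a_bar_t.sqrt()
    x0_hat = decoder(z0_hat)
    loss_l1 = F.l1_loss(x0_hat, target_img)
    loss_ssim = 1.0 - ssim(x0_hat, target_img)
    return loss_noise + w_l1 * loss_l1 + w_ssim * loss_ssim
```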
{"title":"Anatomy-aware Sketch-guided Latent Diffusion Model for Orbital Tumor Multi-Parametric MRI Missing Modalities Synthesis.","authors":"Langtao Zhou, Xiaoxia Qu, Tianyu Fu, Jiaoyang Wu, Hong Song, Jingfan Fan, Danni Ai, Deqiang Xiao, Junfang Xian, Jian Yang","doi":"10.1109/TMI.2025.3648852","DOIUrl":"https://doi.org/10.1109/TMI.2025.3648852","url":null,"abstract":"<p><p>Synthesizing missing modalities in multi-parametric MRI (mpMRI) is vital for accurate tumor diagnosis, yet remains challenging due to incomplete acquisitions and modality heterogeneity. Diffusion models have shown strong generative capability, but conventional approaches typically operate in the image domain with high memory costs and often rely solely on noise-space supervision, which limits anatomical fidelity. Latent diffusion models (LDMs) improve efficiency by performing denoising in latent space, but standard LDMs lack explicit structural priors and struggle to integrate multiple modalities effectively. To address these limitations, we propose the anatomy-aware sketch-guided latent diffusion model (ASLDM), a novel LDM-based framework designed for flexible and structure-preserving MRI synthesis. ASLDM incorporates an anatomy-aware feature fusion module, which encodes tumor region masks and edge-based anatomical sketches via cross-attention to guide the denoising process with explicit structure priors. A modality synergistic reconstruction strategy enables the joint modeling of available and missing modalities, enhancing cross-modal consistency and supporting arbitrary missing scenarios. Additionally, we introduce image-level losses for pixel-space supervision using L1 and SSIM losses, overcoming the limitations of pure noise-based loss training and improving the anatomical accuracy of synthesized outputs. Extensive experiments on a five-modality orbital tumor mpMRI private dataset and a four-modality public BraTS2024 dataset demonstrate that ASLDM outperforms state-of-the-art methods in both synthesis quality and structural consistency, showing strong potential for clinically reliable multi-modal MRI completion. Our code is publicly available at: https://github.com/zltshadow/ASLDM.git.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145859643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Contrastive Graph Modeling for Cross-Domain Few-Shot Medical Image Segmentation
Pub Date : 2025-12-29 DOI: 10.1109/TMI.2025.3649239
Yuntian Bo, Tao Zhou, Zechao Li, Haofeng Zhang, Ling Shao
Cross-domain few-shot medical image segmentation (CD-FSMIS) offers a promising and data-efficient solution for medical applications where annotations are severely scarce and multimodal analysis is required. However, existing methods typically filter out domain-specific information to improve generalization, which inadvertently limits cross-domain performance and degrades source-domain accuracy. To address this, we present Contrastive Graph Modeling (C-Graph), a framework that leverages the structural consistency of medical images as a reliable domain-transferable prior. We represent image features as graphs, with pixels as nodes and semantic affinities as edges. A Structural Prior Graph (SPG) layer is proposed to capture and transfer target-category node dependencies and enable global structure modeling through explicit node interactions. Building upon SPG layers, we introduce a Subgraph Matching Decoding (SMD) mechanism that exploits semantic relations among nodes to guide prediction. Furthermore, we design a Confusion-minimizing Node Contrast (CNC) loss to mitigate node ambiguity and sub-graph heterogeneity by contrastively enhancing node discriminability in the graph space. Our method significantly outperforms prior CD-FSMIS approaches across multiple cross-domain benchmarks, achieving state-of-the-art performance while simultaneously preserving strong segmentation accuracy on the source domain. Our code is available at https://github.com/primebo1/C-Graph.
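A toy sketch of graph-style modeling over pixel features: nodes are flattened pixel embeddings, edges are softmax-normalized semantic affinities, and one round of message passing propagates information between nodes. This only illustrates the general idea of an affinity-graph layer; the actual SPG layer, subgraph matching decoder, and CNC loss in C-Graph are more involved.

```python
import torch
import torch.nn as nn

class AffinityGraphLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.proj_q = nn.Linear(dim, dim)
        self.proj_k = nn.Linear(dim, dim)
        self.update = nn.Linear(dim, dim)

    def forward(self, feat):
        """feat: (B, C, H, W) pixel features -> (B, C, H, W) after node interactions."""
        b, c, h, w = feat.shape
        nodes = feat.flatten(2).transpose(1, 2)                              # (B, HW, C)
        affinity = self.proj_q(nodes) @ self.proj_k(nodes).transpose(1, 2) / c ** 0.5
        adj = affinity.softmax(dim=-1)                                       # edge weights
        nodes = nodes + self.update(adj @ nodes)                             # message passing
        return nodes.transpose(1, 2).reshape(b, c, h, w)
```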
{"title":"Contrastive Graph Modeling for Cross-Domain Few-Shot Medical Image Segmentation.","authors":"Yuntian Bo, Tao Zhou, Zechao Li, Haofeng Zhang, Ling Shao","doi":"10.1109/TMI.2025.3649239","DOIUrl":"https://doi.org/10.1109/TMI.2025.3649239","url":null,"abstract":"<p><p>Cross-domain few-shot medical image segmentation (CD-FSMIS) offers a promising and data-efficient solution for medical applications where annotations are severely scarce and multimodal analysis is required. However, existing methods typically filter out domain-specific information to improve generalization, which inadvertently limits cross-domain performance and degrades source-domain accuracy. To address this, we present Contrastive Graph Modeling (C-Graph), a framework that leverages the structural consistency of medical images as a reliable domain-transferable prior. We represent image features as graphs, with pixels as nodes and semantic affinities as edges. A Structural Prior Graph (SPG) layer is proposed to capture and transfer target-category node dependencies and enable global structure modeling through explicit node interactions. Building upon SPG layers, we introduce a Subgraph Matching Decoding (SMD) mechanism that exploits semantic relations among nodes to guide prediction. Furthermore, we design a Confusion-minimizing Node Contrast (CNC) loss to mitigate node ambiguity and sub-graph heterogeneity by contrastively enhancing node discriminability in the graph space. Our method significantly outperforms prior CD-FSMIS approaches across multiple cross-domain benchmarks, achieving state-of-the-art performance while simultaneously preserving strong segmentation accuracy on the source domain. Our code is available at https://github.com/primebo1/C-Graph.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145859678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}