
Latest Publications: IEEE Transactions on Medical Imaging

Deep Few-View High-Resolution Photon-Counting CT at Halved Dose for Extremity Imaging
Pub Date: 2025-10-10 DOI: 10.1109/TMI.2025.3618754
Mengzhou Li;Chuang Niu;Ge Wang;Maya R. Amma;Krishna M. Chapagain;Stefan Gabrielson;Andrew Li;Kevin Jonker;Niels de Ruiter;Jennifer A. Clark;Phil Butler;Anthony Butler;Hengyong Yu
X-ray photon-counting computed tomography (PCCT) for extremity imaging allows multi-energy high-resolution (HR) imaging, but its radiation dose can be further reduced. Despite the great potential of deep learning techniques, their application in HR volumetric PCCT reconstruction has been challenged by the large memory burden, training data scarcity, and domain gap issues. In this paper, we propose a deep learning-based approach for PCCT image reconstruction at halved dose and doubled speed, validated in a New Zealand clinical trial. Specifically, we design a patch-based volumetric refinement network to alleviate the GPU memory limitation, train the network with synthetic data, and use model-based iterative refinement to bridge the gap between synthetic and clinical data. Our results in a reader study of 8 patients from the clinical trial demonstrate great potential to cut the radiation dose to half of the clinical PCCT standard without compromising image quality or diagnostic value.
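The paper's code is not reproduced here, but the memory-saving pattern the abstract describes (patch-based volumetric inference) is a standard one. A minimal sketch, assuming a hypothetical refine_fn standing in for the trained network and illustrative patch/stride sizes:

```python
import numpy as np

def _starts(dim, patch, stride):
    """Patch start offsets covering [0, dim), including a final flush patch."""
    s = list(range(0, max(dim - patch, 0) + 1, stride))
    if s[-1] != max(dim - patch, 0):
        s.append(max(dim - patch, 0))
    return s

def patchwise_refine(volume, refine_fn, patch=64, stride=48):
    """Run refine_fn on overlapping 3D patches and average the overlaps,
    so only one patch (not the full HR volume) is resident at a time."""
    out = np.zeros_like(volume, dtype=np.float32)
    weight = np.zeros_like(volume, dtype=np.float32)
    for z in _starts(volume.shape[0], patch, stride):
        for y in _starts(volume.shape[1], patch, stride):
            for x in _starts(volume.shape[2], patch, stride):
                sl = np.s_[z:z + patch, y:y + patch, x:x + patch]
                out[sl] += refine_fn(volume[sl])
                weight[sl] += 1.0
    return out / np.maximum(weight, 1.0)

# Toy check with an identity "network":
vol = np.random.rand(100, 100, 100).astype(np.float32)
assert np.allclose(patchwise_refine(vol, lambda p: p), vol, atol=1e-5)
```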
Citations: 0
Source-Free Active Domain Adaptation via Influential-Points-Guided Progressive Teacher for Medical Image Segmentation
Pub Date: 2025-10-09 DOI: 10.1109/TMI.2025.3619837
Yong Chen;Xiangde Luo;Renyi Chen;Yiyue Li;Han Zhang;He Lyu;Huan Song;Kang Li
Domain adaptation in medical image segmentation enables pre-trained models to generalize to new target domains. Given limited annotated data and privacy constraints, Source-Free Active Domain Adaptation (SFADA) methods provide promising solutions by selecting a few target samples for labeling without accessing source samples. However, in a fully source-free setting, existing works have not fully explored how to select these target samples in a class-balanced manner and how to conduct robust model adaptation using both labeled and unlabeled samples. In this study, we discover that boundary samples with source-like semantics but sharp predictive discrepancies are beneficial for SFADA. We define these samples as the most influential points and propose a slice-wise framework that uses influential-points learning to exploit them. Specifically, we detect source-like samples to retain source-specific knowledge. For each target sample, an adaptive K-nearest neighbor algorithm based on local density is introduced to construct neighborhoods of source-like samples for knowledge transfer. We then propose a class-balanced Kullback-Leibler divergence over these neighborhoods and compute it to rank samples by influence score. A diverse subset of the highest-ranked target samples (the influential points) is manually annotated. Furthermore, we design a progressive teacher model to facilitate SFADA for medical image segmentation. Guided by the influential points, this model independently generates and utilizes pseudo-labels to mitigate error accumulation. To further suppress noise, curriculum learning is incorporated into the model to progressively leverage reliable supervision signals from pseudo-labels. Experiments on multiple benchmarks demonstrate that our method outperforms state-of-the-art methods even with only 2.5% of the labeling budget.
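As a rough illustration of how a class-balanced KL divergence could rank target samples, here is a schematic sketch; the variable names and the exact inverse-frequency weighting are assumptions for illustration, not the paper's formulation:

```python
import numpy as np

def influence_score(p_target, p_neighbors, class_freq, eps=1e-8):
    """Schematic class-balanced KL influence score (assumed form): compare a
    target sample's prediction with the consensus of its source-like
    neighbors, up-weighting rare classes.

    p_target:    (C,)   softmax prediction for one target sample
    p_neighbors: (K, C) predictions of its source-like neighbors
    class_freq:  (C,)   estimated class frequencies in the target pool
    """
    q = p_neighbors.mean(axis=0)          # neighborhood consensus
    w = 1.0 / (class_freq + eps)
    w /= w.sum()                          # inverse-frequency class weights
    return float(np.sum(w * q * np.log((q + eps) / (p_target + eps))))

# Rank all target samples by this score, then send a diverse top-k subset
# (the "influential points") for manual annotation.
```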
Citations: 0
Detailed Delineation of the Fetal Brain in Diffusion MRI via Multi-Task Learning
Pub Date: 2025-10-09 DOI: 10.1109/TMI.2025.3619809
Davood Karimi;Camilo Calixto;Haykel Snoussi;Bo Li;Maria Camila Cortes-Albornoz;Clemente Velasco-Annis;Caitlin Rollins;Lana Pierotich;Camilo Jaimes;Ali Gholipour;Simon K. Warfield
Diffusion-weighted MRI (dMRI) is increasingly used to study the normal and abnormal development of the fetal brain in utero. It offers invaluable insights into the neurodevelopmental processes of the fetal stage. However, reliable analysis of fetal dMRI data requires dedicated computational methods that are currently unavailable. The lack of automated methods for fast, accurate, and reproducible data analysis has seriously limited our ability to tap the potential of fetal brain dMRI for medical and scientific applications. In this work, we developed and validated a unified computational framework to: 1) segment the brain tissue into white matter, cortical/subcortical gray matter, and cerebrospinal fluid; 2) segment 31 distinct white matter tracts; and 3) parcellate the brain's cortex, deep gray nuclei, and white matter structures into 96 anatomically meaningful regions. We utilized a set of manual, semi-automatic, and automatic approaches to annotate 97 fetal brains. Using these labels, we developed and validated a multi-task deep learning method to perform the three computations. Evaluations show that the new method can accurately carry out all three tasks, achieving a mean Dice similarity coefficient of 0.865 on tissue segmentation, 0.825 on white matter tract segmentation, and 0.819 on parcellation. Further validation on independent external data shows the generalizability of the proposed method. The new method can help advance the field of fetal neuroimaging, as it can lead to substantial improvements in fetal brain tractography, tract-specific analysis, and structural connectivity assessment.
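The reported numbers are mean Dice similarity coefficients; for reference, the standard DSC between binary masks is computed as follows (toy example included):

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    """DSC = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return (2.0 * (pred & gt).sum() + eps) / (pred.sum() + gt.sum() + eps)

a = np.zeros((4, 4), int); a[1:3, 1:3] = 1   # 4 foreground voxels
b = np.zeros((4, 4), int); b[1:3, 1:4] = 1   # 6 foreground voxels, 4 shared
print(dice(a, b))  # 2*4 / (4 + 6) = 0.8
```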
Citations: 0
MACE Risk Prediction in ARVC Patients via CMR: A Three-Tier Spatiotemporal Transformer With Pericardial Adipose Tissue Embedding
Pub Date: 2025-10-07 DOI: 10.1109/TMI.2025.3618711
Xiaoyu Wang;Jinyu Zheng;Chaolu Feng;Lian-Ming Wu
Major adverse cardiac events (MACE) pose a serious, life-threatening risk to patients with arrhythmogenic right ventricular cardiomyopathy (ARVC). Cardiac magnetic resonance (CMR) has been proven to reflect the risk of MACE, but two challenges remain: limited dataset size due to the rarity of ARVC, and overlapping image distributions between non-MACE and MACE patients. To address these challenges by fully leveraging the dynamic and spatial information in the limited CMR dataset, a deep learning-based risk prediction model named Three-Tier Spatiotemporal Transformer (TTST) is proposed in this paper. It uses three transformer-based tiers to sequentially extract and fuse features from three domains: the 2D spatial domain of each slice, the temporal dimension of the slice sequence, and the inter-slice depth dimension. In TTST, a pericardial adipose tissue (PAT) embedding unit incorporates the dynamic and positional information of PAT, a key biomarker whose thickening and reduced motion distinguish MACE from non-MACE patients, as prior knowledge to reduce reliance on large-scale datasets. Additionally, a patch voting unit, guided by the PAT embedding information, picks out local features that highlight the more indicative regions of the heart. Experimental results demonstrate that TTST outperforms existing classification methods in MACE prediction (internal: AUC = 0.89, ACC = 84.02%; external: AUC = 0.87, ACC = 86.21%). Clinically, TTST achieves effective risk prediction either independently (C-index = 0.744) or in combination with the existing 5-year risk score model (increasing the C-index from 0.686 to 0.777). Code and dataset are accessible at https://github.com/DFLAG-NEU
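A minimal sketch of what PAT-guided "patch voting" could look like, with the PAT embedding acting as an attention query over patch features; all names and shapes are illustrative assumptions rather than the authors' implementation:

```python
import torch
import torch.nn.functional as F

def pat_guided_patch_scores(patch_tokens, pat_embed):
    """Schematic "patch voting" (assumed form): use a pericardial adipose
    tissue (PAT) embedding as an attention query over patch features, so
    patches consistent with the PAT prior receive higher weight.

    patch_tokens: (N, D) local patch features
    pat_embed:    (D,)   PAT prior embedding
    """
    logits = patch_tokens @ pat_embed / patch_tokens.shape[-1] ** 0.5
    return F.softmax(logits, dim=0)          # (N,) importance per patch

tokens, pat = torch.randn(16, 64), torch.randn(64)
topk = torch.topk(pat_guided_patch_scores(tokens, pat), k=4).indices
```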
Citations: 0
Q-Space Guided Multi-Modal Translation Network for Diffusion-Weighted Image Synthesis
Pub Date: 2025-10-07 DOI: 10.1109/TMI.2025.3618683
Pengli Zhu;Yingji Fu;Nanguang Chen;Anqi Qiu
Diffusion-weighted imaging (DWI) enables non-invasive characterization of tissue microstructure, yet acquiring densely sampled q-space data remains time-consuming and impractical in many clinical settings. Existing deep learning methods are typically constrained by fixed q-space sampling, limiting their adaptability to variable sampling scenarios. In this paper, we propose a Q-space Guided Multi-Modal Translation Network (Q-MMTN) for synthesizing multi-shell, high-angular-resolution DWI (MS-HARDI) from flexible q-space sampling, leveraging commonly acquired structural data (e.g., T1- and T2-weighted MRI). Q-MMTN integrates a hybrid encoder and a multi-modal attention fusion mechanism to effectively extract both local and global complementary information from multiple modalities. This design enhances feature representation and, together with a flexible q-space-aware embedding, enables dynamic modulation of internal features without relying on fixed sampling schemes. Additionally, we introduce a set of task-specific constraints, including adversarial, reconstruction, and anatomical consistency losses, which jointly enforce anatomical fidelity and signal realism. These constraints guide Q-MMTN to accurately capture the intrinsic and nonlinear relationships between directional DWI signals and q-space information. Extensive experiments across four lifespan datasets covering children, adolescents, young adults, and older adults demonstrate that Q-MMTN outperforms existing methods, including 1D-qDL, 2D-qDL, MESC-SD, and Q-GAN, in estimating parameter maps and fiber tracts with fine-grained anatomical details. Notably, its ability to accommodate flexible q-space sampling highlights its potential as a promising toolkit for clinical and research applications. Our code is available at https://github.com/Idea89560041/Q-MMTN
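One common way to realize a flexible q-space-aware embedding is FiLM-style conditioning on the b-value and gradient direction; the sketch below is an assumed form for illustration, not necessarily Q-MMTN's actual mechanism:

```python
import torch
import torch.nn as nn

class QSpaceFiLM(nn.Module):
    """Schematic q-space-aware conditioning (assumed FiLM-style form): embed
    (b-value, gradient direction) and modulate image features channel-wise,
    so one network handles arbitrary q-space samples."""

    def __init__(self, channels):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(4, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * channels))

    def forward(self, feats, bval, bvec):
        # feats: (B, C, H, W); bval: (B, 1), pre-scaled; bvec: (B, 3) unit vectors
        scale, shift = self.mlp(torch.cat([bval, bvec], dim=1)).chunk(2, dim=1)
        return feats * (1 + scale[..., None, None]) + shift[..., None, None]

film = QSpaceFiLM(32)
bvec = torch.randn(2, 3)
bvec = bvec / bvec.norm(dim=1, keepdim=True)
bval = torch.tensor([[1.0], [2.0]])      # e.g. b / 1000 s/mm^2, rescaled
out = film(torch.randn(2, 32, 8, 8), bval, bvec)
```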
Citations: 0
Conditional Virtual Imaging for Few-Shot Vascular Image Segmentation
Pub Date: 2025-09-25 DOI: 10.1109/TMI.2025.3608467
Yanglong He;Rongjun Ge;Hui Tang;Yuxin Liu;Mengqing Su;Jean-Louis Coatrieux;Huazhong Shu;Yang Chen;Yuting He
In the field of medical image processing, vascular image segmentation plays a crucial role in clinical diagnosis, treatment planning, prognosis, and medical decision-making. Accurate and automated segmentation of vascular images can assist clinicians in understanding the vascular network structure, leading to more informed medical decisions. However, manual annotation of vascular images is time-consuming and challenging due to the fine and low-contrast vascular branches, especially in the medical imaging domain where annotation requires specialized knowledge and clinical expertise. Data-driven deep learning models struggle to achieve good performance when only a small number of annotated vascular images are available. To address this issue, this paper proposes a novel Conditional Virtual Imaging (CVI) framework for few-shot vascular image segmentation. The framework combines limited annotated data with extensive unlabeled data to generate high-quality images, effectively improving the accuracy and robustness of segmentation learning. Our approach includes two main innovations: first, aligned image-mask pair generation, which leverages the powerful image generation capabilities of large pre-trained models to produce high-quality vascular images with complex structures using only a few training images; and second, a Dual-Consistency Learning (DCL) strategy that simultaneously trains the generator and the segmentation model, allowing them to learn from each other and maximizing the utilization of limited data. Experimental results demonstrate that our CVI framework can generate high-quality medical images and effectively enhance the performance of segmentation models in few-shot scenarios. Our code will be made publicly available online.
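A minimal sketch of what a dual-consistency objective could look like when a generator and a segmenter train each other; gen and seg are hypothetical callables, and the loss terms are assumptions in the spirit of the abstract rather than the paper's exact DCL losses:

```python
import torch
import torch.nn.functional as F

def dual_consistency_losses(gen, seg, mask, unlabeled_img):
    """Schematic co-training objective (assumed form): the segmenter should
    recover the mask that conditioned the generator, and the generator
    should reproduce an unlabeled image from the segmenter's pseudo-mask.

    gen:  mask -> image; seg: image -> mask logits
    mask: float tensor in {0, 1}
    """
    fake_img = gen(mask)                                   # mask -> image
    loss_seg = F.binary_cross_entropy_with_logits(seg(fake_img), mask)
    pseudo = torch.sigmoid(seg(unlabeled_img)).detach()    # image -> mask
    loss_gen = F.l1_loss(gen(pseudo), unlabeled_img)
    return loss_seg + loss_gen       # backpropagates into both networks
```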
Citations: 0
MultiASNet: Multimodal Label Noise Robust Framework for the Classification of Aortic Stenosis in Echocardiography
Pub Date: 2025-09-12 DOI: 10.1109/TMI.2025.3609319
Victoria Wu;Andrea Fung;Bahar Khodabakhshian;Baraa Abdelsamad;Hooman Vaseli;Neda Ahmadi;Jamie A. D. Goco;Michael Y. Tsang;Christina Luong;Purang Abolmaesumi;Teresa S. M. Tsang
Aortic stenosis (AS), a prevalent and serious heart valve disorder, requires early detection but remains difficult to diagnose in routine practice. Although echocardiography with Doppler imaging is the clinical standard, these assessments are typically limited to trained specialists. Point-of-care ultrasound (POCUS) offers an accessible alternative for AS screening but is restricted to basic 2D B-mode imaging, often lacking the analysis Doppler provides. Our project introduces MultiASNet, a multimodal machine learning framework designed to enhance AS screening with POCUS by combining 2D B-mode videos with structured data from echocardiography reports, including Doppler parameters. Using contrastive learning, MultiASNet aligns video features with report features in tabular form from the same patient to improve interpretive quality. To address the misalignment that arises when a single report corresponds to multiple video views, some of which are irrelevant to AS diagnosis, we use cross-attention in a transformer-based video and tabular network to assign less importance to irrelevant report data. The model integrates structured data only during training, enabling independent use with B-mode videos during inference for broader accessibility. MultiASNet also incorporates sample selection to counteract label noise from observer variability, yielding improved accuracy on two datasets. We achieved balanced accuracy scores of 93.0% on a private dataset and 83.9% on the public TMED-2 dataset for AS detection. For severity classification, balanced accuracy scores were 80.4% and 59.4% on the private and public datasets, respectively. This model facilitates reliable AS screening in non-specialist settings, bridging the gap left by the absence of Doppler data while reducing noise-related errors. Our code is publicly available at github.com/DeepRCL/MultiASNet
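Aligning video features with tabular report features from the same patient is commonly done with a symmetric InfoNCE (CLIP-style) objective; the sketch below shows that generic form, which may differ in detail from MultiASNet's loss:

```python
import torch
import torch.nn.functional as F

def clip_style_alignment(video_emb, report_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch: the i-th video should match the i-th
    report (same patient) and mismatch every other report, and vice versa.

    video_emb, report_emb: (B, D) embeddings from the two encoders.
    """
    v = F.normalize(video_emb, dim=1)
    r = F.normalize(report_emb, dim=1)
    logits = v @ r.t() / temperature            # (B, B) cosine similarities
    targets = torch.arange(v.shape[0], device=v.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

loss = clip_style_alignment(torch.randn(8, 128), torch.randn(8, 128))
```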
Citations: 0
Label-Efficient Deep Color Deconvolution of Brightfield Multiplex IHC Images
Pub Date: 2025-09-11 DOI: 10.1109/TMI.2025.3609245
Shahira Abousamra;Danielle Fassler;Rajarsi Gupta;Tahsin Kurc;Luisa F. Escobar-Hoyos;Dimitris Samaras;Kenneth R. Shroyer;Joel Saltz;Chao Chen
Brightfield Multiplex Immunohistochemistry (mIHC) provides simultaneous labeling of multiple protein biomarkers in the same tissue section. It enables the exploration of spatial relationships between the inflammatory microenvironment and tumor cells, and the discovery of how tumor cell morphology relates to cancer biomarker expression. Color deconvolution is required to analyze and quantify the different cell phenotype populations indicated by the biomarkers. However, this becomes a challenging task as the number of multiplexed stains increases. In this work, we present self-supervised and semi-supervised approaches to mIHC color deconvolution. Our proposed methods are based on deep convolutional autoencoders and learn using innovative reconstruction losses inspired by physics. We show how we can integrate weak annotations and the abundant unlabeled data available to train a model to reliably unmix the multiplexed stains and generate stain segmentation maps. We demonstrate the effectiveness of our proposed methods through experiments on an mIHC dataset of 7-plex IHC images.
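The physics behind classical color deconvolution, which physics-inspired reconstruction losses build on, is the Beer-Lambert law: stain attenuations that multiply in transmitted RGB intensity add linearly in optical density. A minimal sketch (the stain matrix and shapes are illustrative):

```python
import numpy as np

def rgb_to_od(rgb, background=255.0, eps=1e-6):
    """Beer-Lambert transform OD = -log(I / I0): stain contributions that
    multiply in transmitted intensity become additive in OD space."""
    return -np.log(np.clip(rgb.astype(np.float64), eps, None) / background)

def unmix(od_pixels, stain_od):
    """Least-squares stain concentrations for a (3, S) stain OD matrix.
    With S > 3 multiplexed stains the system is underdetermined, which is
    one motivation for the learned deconvolution in the paper.
    od_pixels: (N, 3); returns (N, S)."""
    conc, *_ = np.linalg.lstsq(stain_od, od_pixels.T, rcond=None)
    return conc.T

# Illustrative 2-stain matrix (unit-norm OD vectors, hematoxylin/DAB-like)
stains = np.array([[0.65, 0.27], [0.70, 0.57], [0.29, 0.78]])
pixels = np.array([[120, 80, 150], [200, 180, 190]], float)
concentrations = unmix(rgb_to_od(pixels), stains)
```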
Citations: 0
Quantifying Tumor Microvasculature With Optical Coherence Angiography and Intravoxel Incoherent Motion Diffusion MRI
Pub Date: 2025-09-10 DOI: 10.1109/TMI.2025.3607752
W. Jeffrey Zabel;Héctor Contreras-Sánchez;Warren Foltz;Costel Flueraru;Edward Taylor;Alex Vitkin
Intravoxel Incoherent Motion (IVIM) MRI is a contrast-agent-free microvascular imaging method finding increasing use in biomedicine. However, there is uncertainty in the ability of IVIM-MRI to quantify tissue microvasculature given MRI's limited spatial resolution (mm scale). Nine NRG mice were subcutaneously inoculated with human pancreatic cancer BxPC-3 cells transfected with DsRed, and MR-compatible plastic window chambers were surgically installed in the dorsal skinfold. Mice were imaged with speckle variance optical coherence tomography (OCT) and colour Doppler OCT, providing high-resolution 3D measurements of the vascular volume density (VVD) and average Doppler phase shift ($\overline{\Delta\phi}$), respectively. IVIM imaging was performed on a 7T preclinical MRI scanner to generate maps of the perfusion fraction $f$, the extravascular diffusion coefficient $D_{\mathrm{slow}}$, and the intravascular diffusion coefficient $D_{\mathrm{fast}}$. The IVIM parameter maps were coregistered with the optical datasets to enable direct spatial correlation. A significant positive correlation was noted between OCT's VVD and MR's $f$ (Pearson correlation coefficient $r = 0.34$, $p < 0.0001$). Surprisingly, no significant correlation was found between $\overline{\Delta\phi}$ and $D_{\mathrm{fast}}$. This may be due to larger errors in the determined $D_{\mathrm{fast}}$ values compared to $f$, as confirmed by Monte Carlo simulations. Several other inter- and intra-modality correlations were also quantified. Direct same-animal correlation of clinically applicable IVIM imaging with preclinical OCT microvascular imaging supports the biomedical relevance of IVIM-MRI metrics, for example through $f$'s relationship to the VVD.
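The three IVIM parameters come from the standard biexponential signal model; a minimal voxelwise fit might look like the following (the b-values and noise level are illustrative, and the fit's sensitivity to noise in $D_{\mathrm{fast}}$ echoes the Monte Carlo finding above):

```python
import numpy as np
from scipy.optimize import curve_fit

def ivim(b, f, d_fast, d_slow):
    """Standard IVIM biexponential model, normalized so S(0) = 1:
    S(b) = f * exp(-b * D_fast) + (1 - f) * exp(-b * D_slow)."""
    return f * np.exp(-b * d_fast) + (1 - f) * np.exp(-b * d_slow)

b = np.array([0, 10, 20, 50, 100, 200, 400, 800], float)  # s/mm^2, illustrative
rng = np.random.default_rng(0)
signal = ivim(b, 0.10, 0.02, 0.001) + rng.normal(0, 0.005, b.size)

popt, pcov = curve_fit(ivim, b, signal, p0=(0.1, 0.01, 0.001),
                       bounds=([0, 1e-4, 1e-5], [0.6, 0.5, 0.01]))
f_fit, d_fast_fit, d_slow_fit = popt   # D_fast is typically the noisiest
```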
Citations: 0
CASHNet: Context-Aware Semantics-Driven Hierarchical Network for Hybrid Diffeomorphic CT-CBCT Image Registration
Pub Date: 2025-09-09 DOI: 10.1109/TMI.2025.3607700
Xiaoru Gao;Housheng Xie;Donghua Hang;Guoyan Zheng
Computed Tomography (CT) to Cone-Beam Computed Tomography (CBCT) image registration is crucial for image-guided radiotherapy and surgical procedures. However, achieving accurate CT-CBCT registration remains challenging due to various factors such as inconsistent intensities, low contrast resolution, and imaging artifacts. In this study, we propose a Context-Aware Semantics-driven Hierarchical Network (referred to as CASHNet), which hierarchically integrates context-aware semantics-encoded features into a coarse-to-fine registration scheme to explicitly enhance semantic structural perception during progressive alignment. Moreover, it leverages diffeomorphisms to integrate rigid and non-rigid registration within a single end-to-end trainable network, enabling anatomically plausible deformations and preserving topological consistency. CASHNet comprises a Siamese Mamba-based multi-scale feature encoder and a coarse-to-fine registration decoder, which integrates a Rigid Registration (RR) module with multiple Semantics-guided Velocity Estimation and Feature Alignment (SVEFA) modules operating at different resolutions. Each SVEFA module comprises three carefully designed components: i) a cross-resolution feature aggregation (CFA) component that synthesizes enhanced global contextual representations, ii) a semantics perception and encoding (SPE) component that captures and encodes local semantic information, and iii) an incremental velocity estimation and feature alignment (IVEFA) component that leverages contextual and semantic features to update velocity fields and align features. These modules work synergistically to boost the overall registration performance. Extensive experiments on three typical yet challenging CT-CBCT datasets of both soft and hard tissues demonstrate the superiority of our proposed method over other state-of-the-art methods. The code will be publicly available at https://github.com/xiaorugao999/CASHNet
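The diffeomorphisms mentioned here are conventionally obtained by integrating a stationary velocity field via scaling and squaring; the sketch below shows that generic 2D construction (CASHNet's exact formulation may differ):

```python
import torch
import torch.nn.functional as F

def integrate_velocity(v, steps=7):
    """Scaling-and-squaring integration of a stationary velocity field into a
    diffeomorphic displacement, the standard construction in learning-based
    registration. v: (B, 2, H, W) 2D velocities in voxel units."""
    B, _, H, W = v.shape
    disp = v / (2 ** steps)                        # phi at time 1/2^steps
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32),
                            indexing="ij")
    base = torch.stack([xs, ys]).unsqueeze(0).expand(B, -1, -1, -1)
    for _ in range(steps):                         # phi_{2t} = phi_t o phi_t
        coords = base + disp
        grid = torch.stack([2 * coords[:, 0] / (W - 1) - 1,   # x in [-1, 1]
                            2 * coords[:, 1] / (H - 1) - 1], dim=-1)
        disp = disp + F.grid_sample(disp, grid, align_corners=True)
    return disp                                    # warp maps x -> x + disp(x)

phi = integrate_velocity(torch.randn(1, 2, 64, 64))
```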
Citations: 0