
Latest Publications in Computerized Medical Imaging and Graphics

Efficient frequency-decomposed transformer via large vision model guidance for surgical image desmoking
IF 4.9 · Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-11-03 · DOI: 10.1016/j.compmedimag.2025.102660
Jiaao Li, Diandian Guo, Youyu Wang, Yanhui Wan, Long Ma, Jialun Pei
Surgical image restoration plays a vital clinical role in improving visual quality during surgery, particularly in minimally invasive procedures where the operating field is frequently obscured by surgical smoke. However, progress on surgical image desmoking remains limited in both algorithm development and customized learning strategies. In this regard, this work focuses on the desmoking task from both theoretical and practical perspectives. First, we analyze the intrinsic characteristics of surgical smoke degradation: (1) spatial localization and dynamics, (2) distinguishable frequency-domain patterns, and (3) the entangled representation of anatomical content and degradative artifacts. These observations motivate us to propose an efficient frequency-aware Transformer framework, namely SmoRestor, which aims to separate and restore true anatomical structures from complex degradations. Specifically, we introduce a high-order Fourier-embedded neighborhood attention transformer that enhances the model’s ability to capture structured degradation patterns across both spatial and frequency domains. In addition, we utilize the semantic priors encoded by large vision models to disambiguate content from degradation through targeted guidance. Moreover, we propose an innovative transfer learning paradigm that injects knowledge from large models into the main network, enabling it to effectively distinguish meaningful content from ambiguous corruption. Experimental results on both public and in-house datasets demonstrate substantial improvements in quantitative performance and visual quality. The source code will be made available.
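To make the frequency-decomposition idea concrete, below is a minimal NumPy sketch that splits an image into low- and high-frequency components with a circular FFT mask, the kind of signal separation the abstract attributes to smoke haze versus anatomical detail. The cutoff radius and the assignment of smoke to the low-frequency band are illustrative assumptions; the authors' SmoRestor code is not yet released.

```python
# Illustrative frequency split of a grayscale frame; not the SmoRestor implementation.
import numpy as np

def frequency_split(image: np.ndarray, cutoff: float = 0.1):
    """Split a grayscale image into low- and high-frequency parts via a circular FFT mask."""
    f = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.hypot(yy - h / 2, xx - w / 2) / (0.5 * np.hypot(h, w))
    low_mask = radius <= cutoff                           # slowly varying content (e.g., haze)
    low = np.fft.ifft2(np.fft.ifftshift(f * low_mask)).real
    high = image - low                                    # residual edges / anatomical detail
    return low, high

demo = np.random.rand(64, 64)
low, high = frequency_split(demo)
print(low.shape, high.shape)
```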
{"title":"Efficient frequency-decomposed transformer via large vision model guidance for surgical image desmoking","authors":"Jiaao Li ,&nbsp;Diandian Guo ,&nbsp;Youyu Wang ,&nbsp;Yanhui Wan ,&nbsp;Long Ma ,&nbsp;Jialun Pei","doi":"10.1016/j.compmedimag.2025.102660","DOIUrl":"10.1016/j.compmedimag.2025.102660","url":null,"abstract":"<div><div>Surgical image restoration plays a vital clinical role in improving visual quality during surgery, particularly in minimally invasive procedures where the operating field is frequently obscured by surgical smoke. However, surgical image desmoking still has limited progress in algorithm development and customized learning strategies. In this regard, this work focuses on the task of desmoking from both theoretical and practical perspectives. First, we analyze the intrinsic characteristics of surgical smoke degradation: (1) spatial localization and dynamics, (2) distinguishable frequency-domain patterns, and (3) the entangled representation of anatomical content and degradative artifacts. These observations motivated us to propose an efficient frequency-aware Transformer framework, namely SmoRestor, which aims to separate and restore true anatomical structures from complex degradations. Specifically, we introduce a high-order Fourier-embedded neighborhood attention transformer that enhances the model’s ability to capture structured degradation patterns across both spatial and frequency domains. Besides, we utilize the semantic priors encoded by large vision models to disambiguate content from degradation through targeted guidance. Moreover, we propose an innovative transfer learning paradigm that injects knowledge from large models to the main network, enabling it to effectively distinguish meaningful content from ambiguous corruption. Experimental results on both public and in-house datasets demonstrate substantial improvements in quantitative performance and visual quality. The source code will be available.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102660"},"PeriodicalIF":4.9,"publicationDate":"2025-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145466996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Med-SCoT: Structured chain-of-thought reasoning and evaluation for enhancing interpretability in medical visual question answering
IF 4.9 · Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-11-01 · DOI: 10.1016/j.compmedimag.2025.102659
Jinhao Qiao, Sihan Li, Jiang Liu, Heng Yu, Yi Xiao, Hongshan Yu, Yan Zheng
Most existing medical visual question answering (Med-VQA) methods emphasize answer accuracy while neglecting the reasoning process, limiting interpretability and reliability in clinical settings. To address this issue, we introduce Med-SCoT, a vision-language model that performs structured chain-of-thought (SCoT) reasoning by explicitly decomposing inference into four stages: Summary, Caption, Reasoning, and Conclusion. To facilitate training, we propose a multi-model collaborative correction (CoCo) annotation pipeline and construct three Med-VQA datasets with structured reasoning chains. We further develop SCoTEval, a comprehensive evaluation framework combining metric-based scores and large language model (LLM) assessments to enable fine-grained analysis of reasoning quality. Experimental results demonstrate that Med-SCoT achieves advanced answer accuracy while generating structured, clinically aligned and logically coherent reasoning chains. Moreover, SCoTEval exhibits high agreement with expert judgments, validating its reliability for structured reasoning assessment. The code, data, and models are available at: https://github.com/qiaodongxing/Med-SCoT.
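As a rough illustration of the four-stage structure named in the abstract, the sketch below defines a record type for Summary/Caption/Reasoning/Conclusion chains and serializes it into a training prompt. The field names, prompt template, and example content are assumptions, not the authors' actual annotation schema.

```python
# Hypothetical record type for a structured chain-of-thought (SCoT) training sample.
from dataclasses import dataclass

@dataclass
class SCoTRecord:
    question: str
    summary: str      # restate the clinical question
    caption: str      # describe the relevant image findings
    reasoning: str    # link findings to the question
    conclusion: str   # final answer

    def to_prompt(self) -> str:
        return (
            f"Question: {self.question}\n"
            f"[Summary] {self.summary}\n"
            f"[Caption] {self.caption}\n"
            f"[Reasoning] {self.reasoning}\n"
            f"[Conclusion] {self.conclusion}"
        )

rec = SCoTRecord(
    question="Is there a pleural effusion?",
    summary="The question asks about fluid in the pleural space.",
    caption="Blunting of the left costophrenic angle is visible.",
    reasoning="Costophrenic blunting is a typical sign of effusion.",
    conclusion="Yes, a small left pleural effusion is likely.",
)
print(rec.to_prompt())
```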
{"title":"Med-SCoT: Structured chain-of-thought reasoning and evaluation for enhancing interpretability in medical visual question answering","authors":"Jinhao Qiao ,&nbsp;Sihan Li ,&nbsp;Jiang Liu ,&nbsp;Heng Yu ,&nbsp;Yi Xiao ,&nbsp;Hongshan Yu ,&nbsp;Yan Zheng","doi":"10.1016/j.compmedimag.2025.102659","DOIUrl":"10.1016/j.compmedimag.2025.102659","url":null,"abstract":"<div><div>Most existing medical visual question answering (Med-VQA) methods emphasize answer accuracy while neglecting the reasoning process, limiting interpretability and reliability in clinical settings. To address this issue, we introduce Med-SCoT, a vision-language model that performs structured chain-of-thought (SCoT) reasoning by explicitly decomposing inference into four stages: Summary, Caption, Reasoning, and Conclusion. To facilitate training, we propose a multi-model collaborative correction (CoCo) annotation pipeline and construct three Med-VQA datasets with structured reasoning chains. We further develop SCoTEval, a comprehensive evaluation framework combining metric-based scores and large language model (LLM) assessments to enable fine-grained analysis of reasoning quality. Experimental results demonstrate that Med-SCoT achieves advanced answer accuracy while generating structured, clinically aligned and logically coherent reasoning chains. Moreover, SCoTEval exhibits high agreement with expert judgments, validating its reliability for structured reasoning assessment. The code, data, and models are available at: <span><span>https://github.com/qiaodongxing/Med-SCoT</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102659"},"PeriodicalIF":4.9,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145466997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multistain multicompartment automatic segmentation in renal biopsies with thrombotic microangiopathies and other vasculopathies
IF 4.9 · Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-22 · DOI: 10.1016/j.compmedimag.2025.102658
Nicola Altini, Michela Prunella, Surya V. Seshan, Savino Sciascia, Antonella Barreca, Alessandro Del Gobbo, Stefan Porubsky, Hien Van Nguyen, Claudia Delprete, Berardino Prencipe, Deján Dobi, Daan P.C. van Doorn, Sjoerd A.M.E.G. Timmermans, Pieter van Paassen, Vitoantonio Bevilacqua, Jan Ulrich Becker
Automatic tissue segmentation is a necessary step for the bulk analysis of whole slide images (WSIs) from paraffin histology sections in kidney biopsies. However, existing models often fail to generalize across the main nephropathological staining methods and to capture the severe morphological distortions in arteries, arterioles, and glomeruli common in thrombotic microangiopathy (TMA) or other vasculopathies. Therefore, we developed an automatic multi-staining segmentation pipeline covering six key compartments: Artery, Arteriole, Glomerulus, Cortex, Medulla, and Capsule/Other. This framework enables downstream tasks such as counting and labeling at the instance, WSI, or biopsy level. Biopsies (n = 158) from seven centers (Cologne, Turin, Milan, Weill-Cornell, Mainz, Maastricht, and Budapest) were classified by expert nephropathologists into TMA (n = 87) or Mimickers (n = 71). Ground truth expert segmentation masks were provided for all compartments, together with expert binary TMA classification labels for the Glomerulus, Artery, and Arteriole compartments. The biopsies were divided into training (n = 79), validation (n = 26), and test (n = 53) subsets. We benchmarked six deep learning models for semantic segmentation (U-Net, FPN, DeepLabV3+, Mask2Former, SegFormer, SegNeXt) and five models for classification (ResNet-34, DenseNet-121, EfficientNet-v2-S, ConvNeXt-Small, Swin-v2-B). We obtained robust segmentation results across all compartments. On the test set, the best models achieved Dice coefficients of 0.903 (Cortex), 0.834 (Medulla), 0.816 (Capsule/Other), 0.922 (Glomerulus), 0.822 (Artery), and 0.553 (Arteriole). The best classification models achieved accuracies of 0.724 and 0.841 for the Glomerulus and the Artery plus Arteriole compartments, respectively. Furthermore, we release NePathTK (NephroPathology Toolkit), a powerful open-source end-to-end pipeline integrated with QuPath, enabling accurate segmentation for decision support in nephropathology and large-scale analysis of kidney biopsies.
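The Dice coefficients reported above follow the standard overlap definition; a minimal sketch of computing Dice for a binary mask pair is given below. This is the generic formula, not the evaluation code shipped with NePathTK.

```python
# Generic Dice coefficient for binary segmentation masks.
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + truth.sum() + eps))

pred = np.zeros((128, 128), dtype=np.uint8)
truth = np.zeros((128, 128), dtype=np.uint8)
pred[30:90, 30:90] = 1      # predicted compartment mask (toy example)
truth[40:100, 40:100] = 1   # expert ground-truth mask (toy example)
print(f"Dice: {dice_coefficient(pred, truth):.3f}")
```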
{"title":"Multistain multicompartment automatic segmentation in renal biopsies with thrombotic microangiopathies and other vasculopathies","authors":"Nicola Altini ,&nbsp;Michela Prunella ,&nbsp;Surya V. Seshan ,&nbsp;Savino Sciascia ,&nbsp;Antonella Barreca ,&nbsp;Alessandro Del Gobbo ,&nbsp;Stefan Porubsky ,&nbsp;Hien Van Nguyen ,&nbsp;Claudia Delprete ,&nbsp;Berardino Prencipe ,&nbsp;Deján Dobi ,&nbsp;Daan P.C. van Doorn ,&nbsp;Sjoerd A.M.E.G. Timmermans ,&nbsp;Pieter van Paassen ,&nbsp;Vitoantonio Bevilacqua ,&nbsp;Jan Ulrich Becker","doi":"10.1016/j.compmedimag.2025.102658","DOIUrl":"10.1016/j.compmedimag.2025.102658","url":null,"abstract":"<div><div>Automatic tissue segmentation is a necessary step for the bulk analysis of whole slide images (WSIs) from paraffin histology sections in kidney biopsies. However, existing models often fail to generalize across the main nephropathological staining methods and to capture the severe morphological distortions in arteries, arterioles, and glomeruli common in thrombotic microangiopathy (TMA) or other vasculopathies. Therefore, we developed an automatic multi-staining segmentation pipeline covering six key compartments: Artery, Arteriole, Glomerulus, Cortex, Medulla, and Capsule/Other. This framework enables downstream tasks such as counting and labeling at instance-, WSI- or biopsy-level. Biopsies (n = 158) from seven centers: Cologne, Turin, Milan, Weill-Cornell, Mainz, Maastricht, Budapest, were classified by expert nephropathologists into TMA (n = 87) or Mimickers (n = 71). Ground truth expert segmentation masks were provided for all compartments, and expert binary TMA classification labels for Glomerulus, Artery, Arteriole. The biopsies were divided into training (n = 79), validation (n = 26), and test (n = 53) subsets. We benchmarked six deep learning models for semantic segmentation (U-Net, FPN, DeepLabV3+, Mask2Former, SegFormer, SegNeXt) and five models for classification (ResNet-34, DenseNet-121, EfficientNet-v2-S, ConvNeXt-Small, Swin-v2-B). We obtained robust segmentation results across all compartments. On the test set, the best models achieved Dice coefficients of 0.903 (Cortex), 0.834 (Medulla), 0.816 (Capsule/Other), 0.922 (Glomerulus), 0.822 (Artery), and 0.553 (Arteriole). The best classification models achieved Accuracy of 0.724 and 0.841 for Glomerulus and Artery plus Arteriole compartments, respectively. Furthermore, we release NePathTK (NephroPathology Toolkit), a powerful open-source end-to-end pipeline integrated with QuPath, enabling accurate segmentation for decision support in nephropathology and large-scale analysis of kidney biopsies.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102658"},"PeriodicalIF":4.9,"publicationDate":"2025-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145410649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A CNN-Transformer fusion network for Diabetic retinopathy image classification
IF 4.9 · Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-21 · DOI: 10.1016/j.compmedimag.2025.102655
Xuan Huang, Zhuang Ai, Chongyang She, Qi Li, Qihao Wei, Sha Xu, Yaping Lu, Fanxin Zeng
Diabetic retinopathy (DR) is a leading cause of blindness worldwide, yet current diagnosis relies on labor-intensive and subjective fundus image interpretation. Here we present a convolutional neural network-transformer fusion model (DR-CTFN) that integrates ConvNeXt and Swin Transformer algorithms with a lightweight attention block (LAB) to enhance feature extraction. To address dataset imbalance, we applied standardized preprocessing and extensive image augmentation. On the Kaggle EyePACS dataset, DR-CTFN outperformed ConvNeXt and Swin Transformer in accuracy by 3.14% and 8.39%, respectively, and improved the area under the curve (AUC) by 1% and 26.08%. External validation on APTOS 2019 Blindness Detection and a clinical DR dataset yielded accuracies of 84.45% and 85.31%, with AUC values of 95.22% and 95.79%, respectively. These results demonstrate that DR-CTFN enables rapid, robust, and precise DR detection, offering a scalable approach for early diagnosis and prevention of vision loss, thereby enhancing the quality of life for DR patients.
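A minimal PyTorch sketch of the general mechanism, fusing a CNN feature map with a Transformer feature map through a lightweight channel-attention gate, is shown below. The layer sizes, reduction ratio, and additive fusion rule are assumptions and are not taken from the DR-CTFN paper.

```python
# Illustrative lightweight channel-attention fusion of two feature branches.
import torch
import torch.nn as nn

class LightweightAttentionFusion(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                        # squeeze spatial dimensions
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                   # per-channel weights in [0, 1]
        )

    def forward(self, cnn_feat: torch.Tensor, vit_feat: torch.Tensor) -> torch.Tensor:
        fused = cnn_feat + vit_feat                         # assume both branches share shape
        return fused * self.gate(fused)                     # reweight channels

fusion = LightweightAttentionFusion(channels=64)
a = torch.randn(2, 64, 32, 32)                              # stand-in CNN features
b = torch.randn(2, 64, 32, 32)                              # stand-in Transformer features
print(fusion(a, b).shape)                                   # torch.Size([2, 64, 32, 32])
```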
{"title":"A CNN-Transformer fusion network for Diabetic retinopathy image classification","authors":"Xuan Huang ,&nbsp;Zhuang Ai ,&nbsp;Chongyang She ,&nbsp;Qi Li ,&nbsp;Qihao Wei ,&nbsp;Sha Xu ,&nbsp;Yaping Lu ,&nbsp;Fanxin Zeng","doi":"10.1016/j.compmedimag.2025.102655","DOIUrl":"10.1016/j.compmedimag.2025.102655","url":null,"abstract":"<div><div>Diabetic retinopathy (DR) is a leading cause of blindness worldwide, yet current diagnosis relies on labor-intensive and subjective fundus image interpretation. Here we present a convolutional neural network-transformer fusion model (DR-CTFN) that integrates ConvNeXt and Swin Transformer algorithms with a lightweight attention block (LAB) to enhance feature extraction. To address dataset imbalance, we applied standardized preprocessing and extensive image augmentation. On the Kaggle EyePACS dataset, DR-CTFN outperformed ConvNeXt and Swin Transformer in accuracy by 3.14% and 8.39%, while also achieving a superior area under the curve (AUC) by 1% and 26.08%. External validation on APTOS 2019 Blindness Detection and a clinical DR dataset yielded accuracies of 84.45% and 85.31%, with AUC values of 95.22% and 95.79%, respectively. These results demonstrate that DR-CTFN enables rapid, robust, and precise DR detection, offering a scalable approach for early diagnosis and prevention of vision loss, thereby enhancing the quality of life for DR patients.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102655"},"PeriodicalIF":4.9,"publicationDate":"2025-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145394925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Path and bone-contour regularized unpaired MRI-to-CT translation
IF 4.9 · Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-13 · DOI: 10.1016/j.compmedimag.2025.102656
Teng Zhou, Jax Luo, Yuping Sun, Yiheng Tan, Shun Yao, Nazim Haouchine, Scott Raymond
Accurate MRI-to-CT translation promises the integration of complementary imaging information without the need for additional imaging sessions. Given the practical challenges associated with acquiring paired MRI and CT scans, the development of robust methods capable of leveraging unpaired datasets is essential for advancing the MRI-to-CT translation. Current unpaired MRI-to-CT translation methods, which predominantly rely on cycle consistency and contrastive learning frameworks, frequently encounter challenges in accurately translating anatomical features that are highly discernible on CT but less distinguishable on MRI, such as bone structures. This limitation renders these approaches less suitable for applications in radiation therapy, where precise bone representation is essential for accurate treatment planning. To address this challenge, we propose a path- and bone-contour regularized approach for unpaired MRI-to-CT translation. In our method, MRI and CT images are projected to a shared latent space, where the MRI-to-CT mapping is modeled as a continuous flow governed by neural ordinary differential equations. The optimal mapping is obtained by minimizing the transition path length of the flow. To enhance the accuracy of translated bone structures, we introduce a trainable neural network to generate bone contours from MRI and implement mechanisms to directly and indirectly encourage the model to focus on bone contours and their adjacent regions. Evaluations conducted on three datasets demonstrate that our method outperforms existing unpaired MRI-to-CT translation approaches, achieving lower overall error rates. Moreover, in a downstream bone segmentation task, our approach exhibits superior performance in preserving the fidelity of bone structures. Our code is available at: https://github.com/kennysyp/PaBoT.
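The sketch below illustrates, under simplifying assumptions, the core idea of treating the latent MRI-to-CT mapping as a continuous flow and penalizing its transition path length: a small velocity network is integrated with Euler steps, and the accumulated step norms serve as the path-length term. The network, integrator, and loss weighting are illustrative, not the authors' implementation.

```python
# Toy latent flow with a path-length penalty; data terms and encoders are omitted.
import torch
import torch.nn as nn

class VelocityField(nn.Module):
    def __init__(self, dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, z: torch.Tensor, t: float) -> torch.Tensor:
        t_col = torch.full((z.shape[0], 1), t, device=z.device)   # condition on time
        return self.net(torch.cat([z, t_col], dim=1))

def integrate(field: VelocityField, z0: torch.Tensor, steps: int = 10):
    """Euler integration from t=0 to t=1; returns final state and accumulated path length."""
    z, dt, path_len = z0, 1.0 / steps, z0.new_zeros(())
    for k in range(steps):
        v = field(z, k * dt)
        path_len = path_len + v.norm(dim=1).mean() * dt            # trajectory length term
        z = z + v * dt
    return z, path_len

field = VelocityField()
z_mri = torch.randn(4, 32)                                          # latent codes of MRI slices
z_ct_pred, length = integrate(field, z_mri)
loss = length                                                       # add data/contour terms in practice
print(z_ct_pred.shape, float(length))
```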
{"title":"Path and bone-contour regularized unpaired MRI-to-CT translation","authors":"Teng Zhou ,&nbsp;Jax Luo ,&nbsp;Yuping Sun ,&nbsp;Yiheng Tan ,&nbsp;Shun Yao ,&nbsp;Nazim Haouchine ,&nbsp;Scott Raymond","doi":"10.1016/j.compmedimag.2025.102656","DOIUrl":"10.1016/j.compmedimag.2025.102656","url":null,"abstract":"<div><div>Accurate MRI-to-CT translation promises the integration of complementary imaging information without the need for additional imaging sessions. Given the practical challenges associated with acquiring paired MRI and CT scans, the development of robust methods capable of leveraging unpaired datasets is essential for advancing the MRI-to-CT translation. Current unpaired MRI-to-CT translation methods, which predominantly rely on cycle consistency and contrastive learning frameworks, frequently encounter challenges in accurately translating anatomical features that are highly discernible on CT but less distinguishable on MRI, such as bone structures. This limitation renders these approaches less suitable for applications in radiation therapy, where precise bone representation is essential for accurate treatment planning. To address this challenge, we propose a path- and bone-contour regularized approach for unpaired MRI-to-CT translation. In our method, MRI and CT images are projected to a shared latent space, where the MRI-to-CT mapping is modeled as a continuous flow governed by neural ordinary differential equations. The optimal mapping is obtained by minimizing the transition path length of the flow. To enhance the accuracy of translated bone structures, we introduce a trainable neural network to generate bone contours from MRI and implement mechanisms to directly and indirectly encourage the model to focus on bone contours and their adjacent regions. Evaluations conducted on three datasets demonstrate that our method outperforms existing unpaired MRI-to-CT translation approaches, achieving lower overall error rates. Moreover, in a downstream bone segmentation task, our approach exhibits superior performance in preserving the fidelity of bone structures. Our code is available at: <span><span>https://github.com/kennysyp/PaBoT</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102656"},"PeriodicalIF":4.9,"publicationDate":"2025-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145290008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
ESAM2-BLS: Enhanced segment anything model 2 for efficient breast lesion segmentation in ultrasound imaging
IF 4.9 · Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-10 · DOI: 10.1016/j.compmedimag.2025.102654
Lishuang Guo, Haonan Zhang, Chenbin Ma
Ultrasound imaging, as an economical, efficient, and non-invasive diagnostic tool, is widely used for breast lesion screening and diagnosis. However, the segmentation of lesion regions remains a significant challenge due to factors such as noise interference and variability in image quality. To address this issue, we propose a novel deep learning model named enhanced segment anything model 2 (SAM2) for breast lesion segmentation (ESAM2-BLS). This model is an optimized version of the SAM2 architecture. ESAM2-BLS customizes and fine-tunes the pre-trained SAM2 model by introducing an adapter module specifically designed to accommodate the unique characteristics of breast ultrasound images. The adapter module directly addresses ultrasound-specific challenges, including speckle noise, low-contrast boundaries, shadowing artifacts, and anisotropic resolution, through targeted architectural elements such as channel attention mechanisms, specialized convolution kernels, and optimized skip connections. This optimization significantly improves segmentation accuracy, particularly for low-contrast and small lesion regions. Compared to traditional methods, ESAM2-BLS fully leverages the generalization capabilities of large models while incorporating multi-scale feature fusion and axial dilated depthwise convolution to effectively capture multi-level information from complex lesions. During the decoding process, the model enhances the identification of fine boundaries and small lesions through depthwise separable convolutions and skip connections, while maintaining a low computational cost. Visualization of the segmentation results and interpretability analysis demonstrate that ESAM2-BLS achieves average Dice scores of 0.9077 and 0.8633 in five-fold cross-validation across two datasets comprising over 1600 patients. These results represent a substantial improvement in segmentation accuracy and robustness. This model provides an efficient, reliable, and specialized automated solution for early breast cancer screening and diagnosis.
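Below is a minimal sketch of an adapter block of the kind the abstract describes: a bottleneck with a depthwise convolution and a residual connection that could be attached to a frozen encoder feature map. The dimensions, activation, and placement are assumptions; this is not SAM2 or ESAM2-BLS code.

```python
# Illustrative residual adapter with a depthwise convolution for ultrasound features.
import torch
import torch.nn as nn

class UltrasoundAdapter(nn.Module):
    def __init__(self, channels: int, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Conv2d(channels, bottleneck, kernel_size=1)
        self.depthwise = nn.Conv2d(bottleneck, bottleneck, kernel_size=3,
                                   padding=1, groups=bottleneck)    # depthwise convolution
        self.up = nn.Conv2d(bottleneck, channels, kernel_size=1)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # residual adapter: only this small branch would be trained, the backbone stays frozen
        return x + self.up(self.act(self.depthwise(self.down(x))))

frozen_feat = torch.randn(1, 256, 64, 64)    # stand-in for a frozen encoder feature map
adapter = UltrasoundAdapter(channels=256)
print(adapter(frozen_feat).shape)            # torch.Size([1, 256, 64, 64])
```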
{"title":"ESAM2-BLS: Enhanced segment anything model 2 for efficient breast lesion segmentation in ultrasound imaging","authors":"Lishuang Guo ,&nbsp;Haonan Zhang ,&nbsp;Chenbin Ma","doi":"10.1016/j.compmedimag.2025.102654","DOIUrl":"10.1016/j.compmedimag.2025.102654","url":null,"abstract":"<div><div>Ultrasound imaging, as an economical, efficient, and non-invasive diagnostic tool, is widely used for breast lesion screening and diagnosis. However, the segmentation of lesion regions remains a significant challenge due to factors such as noise interference and the variability in image quality. To address this issue, we propose a novel deep learning model named enhanced segment anything model 2 (SAM2) for breast lesion segmentation (ESAM2-BLS). This model is an optimized version of the SAM2 architecture. ESAM2-BLS customizes and fine-tunes the pre-trained SAM2 model by introducing an adapter module, specifically designed to accommodate the unique characteristics of breast ultrasound images. The adapter module directly addresses ultrasound-specific challenges including speckle noise, low contrast boundaries, shadowing artifacts, and anisotropic resolution through targeted architectural elements such as channel attention mechanisms, specialized convolution kernels, and optimized skip connections. This optimization significantly improves segmentation accuracy, particularly for low-contrast and small lesion regions. Compared to traditional methods, ESAM2-BLS fully leverages the generalization capabilities of large models while incorporating multi-scale feature fusion and axial dilated depthwise convolution to effectively capture multi-level information from complex lesions. During the decoding process, the model enhances the identification of fine boundaries and small lesions through depthwise separable convolutions and skip connections, while maintaining a low computational cost. Visualization of the segmentation results and interpretability analysis demonstrate that ESAM2-BLS achieves an average Dice score of 0.9077 and 0.8633 in five-fold cross-validation across two datasets with over 1600 patients. These results significantly improve segmentation accuracy and robustness. This model provides an efficient, reliable, and specialized automated solution for early breast cancer screening and diagnosis.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102654"},"PeriodicalIF":4.9,"publicationDate":"2025-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145356747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Trends and applications of variational autoencoders in medical imaging analysis
IF 4.9 · Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-09 · DOI: 10.1016/j.compmedimag.2025.102647
Pauline Shan Qing Yeoh, Khairunnisa Hasikin, Xiang Wu, Siew Li Goh, Khin Wee Lai
Automated medical imaging analysis plays a crucial role in modern healthcare, with deep learning emerging as a widely adopted solution. However, traditional supervised learning methods often struggle to achieve optimal performance due to increasing challenges such as data scarcity and variability. In response, generative artificial intelligence has gained significant attention, particularly Variational Autoencoders (VAEs), which have been extensively utilized to address various challenges in medical imaging. This review analyzed 118 articles published in the Web of Science database between 2018 and 2024. Bibliometric analysis was conducted to map research trends, while a curated compilation of datasets and evaluation metrics were extracted to underscore the importance of standardization in deep learning workflows. VAEs have been applied across multiple healthcare applications, including anomaly detection, segmentation, classification, synthesis, registration, harmonization, and clustering. Findings suggest that VAE-based models are increasingly applied in medical imaging, with Magnetic Resonance Imaging emerging as the dominant modality and image synthesis as a primary application. The growing interest in this field highlights the potential of VAEs to enhance medical imaging analysis by overcoming existing limitations in data-driven healthcare solutions. This review serves as a valuable resource for researchers looking to integrate VAE models into healthcare applications, offering an overview of current advancements.
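For readers unfamiliar with the model family under review, a minimal VAE sketch (encoder, reparameterization trick, ELBO-style loss) is shown below. The layer sizes are arbitrary; the medical-imaging VAEs surveyed are typically convolutional and far larger.

```python
# Minimal fully connected VAE with the standard reconstruction + KL objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    def __init__(self, in_dim: int = 784, latent: int = 16):
        super().__init__()
        self.enc = nn.Linear(in_dim, 128)
        self.mu, self.logvar = nn.Linear(128, latent), nn.Linear(128, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 128), nn.ReLU(), nn.Linear(128, in_dim))

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)    # reparameterization trick
        return self.dec(z), mu, logvar

def elbo_loss(recon, x, mu, logvar):
    rec = F.mse_loss(recon, x, reduction="sum")                     # reconstruction term
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())   # KL divergence term
    return rec + kld

x = torch.rand(8, 784)          # stand-in for flattened image patches
model = TinyVAE()
recon, mu, logvar = model(x)
print(float(elbo_loss(recon, x, mu, logvar)))
```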
{"title":"Trends and applications of variational autoencoders in medical imaging analysis","authors":"Pauline Shan Qing Yeoh ,&nbsp;Khairunnisa Hasikin ,&nbsp;Xiang Wu ,&nbsp;Siew Li Goh ,&nbsp;Khin Wee Lai","doi":"10.1016/j.compmedimag.2025.102647","DOIUrl":"10.1016/j.compmedimag.2025.102647","url":null,"abstract":"<div><div>Automated medical imaging analysis plays a crucial role in modern healthcare, with deep learning emerging as a widely adopted solution. However, traditional supervised learning methods often struggle to achieve optimal performance due to increasing challenges such as data scarcity and variability. In response, generative artificial intelligence has gained significant attention, particularly Variational Autoencoders (VAEs), which have been extensively utilized to address various challenges in medical imaging. This review analyzed 118 articles published in the Web of Science database between 2018 and 2024. Bibliometric analysis was conducted to map research trends, while a curated compilation of datasets and evaluation metrics were extracted to underscore the importance of standardization in deep learning workflows. VAEs have been applied across multiple healthcare applications, including anomaly detection, segmentation, classification, synthesis, registration, harmonization, and clustering. Findings suggest that VAE-based models are increasingly applied in medical imaging, with Magnetic Resonance Imaging emerging as the dominant modality and image synthesis as a primary application. The growing interest in this field highlights the potential of VAEs to enhance medical imaging analysis by overcoming existing limitations in data-driven healthcare solutions. This review serves as a valuable resource for researchers looking to integrate VAE models into healthcare applications, offering an overview of current advancements.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102647"},"PeriodicalIF":4.9,"publicationDate":"2025-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145290006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Twin-ViMReg: DXR driven synthetic dynamic Standing-CBCTs through Twin Vision Mamba-based 2D/3D registration
IF 4.9 · Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102648
Jiashun Wang, Hao Tang, Zhan Wu, Yikun Zhang, Yan Xi, Yang Chen, Chunfeng Yang, Yixin Zhou, Hui Tang
Medical imaging of the knee joint under physiological weight bearing is crucial for diagnosing and analyzing knee lesions. Existing modalities have limitations: Standing Cone-Beam Computed Tomography (Standing-CBCT) provides high-resolution 3D data but requires a long acquisition time and captures only a single static view, while Dynamic X-ray Imaging (DXR) captures continuous motion but lacks 3D structural information. These limitations motivate the need for dynamic 3D knee generation through 2D/3D registration of Standing-CBCT and DXR. Anatomically, although the femur, patella, and tibia–fibula undergo rigid motion, the joint as a whole exhibits non-rigid behavior. Consequently, existing rigid or non-rigid 2D/3D registration methods fail to fully address this scenario. We propose Twin-ViMReg, a twin-stream 2D/3D registration framework for multiple correlated objects in the knee joint. It extends the conventional 2D/3D registration paradigm by establishing a pair of twinned sub-tasks. By introducing a Multi-Objective Spatial Transformation (MOST) module, it models inter-object correlations and enhances registration robustness. The Vision Mamba-based encoder further strengthens the representation capacity of the method. We used 1,500 simulated data pairs from 10 patients for training and 56 real data pairs from 3 patients for testing. Quantitative evaluation shows that the mean TRE reached 3.36 mm and the RSR was 8.93% higher than that of the SOTA methods. With an average computation time of 1.22 s per X-ray image, Twin-ViMReg enables efficient 2D/3D knee joint registration within seconds, making it a practical and promising solution.
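A minimal sketch of the multi-object rigid-motion idea is given below: each bone receives its own rigid transform, and the knee as a whole is the union of those per-bone motions. The Euler-angle parameterization and the example pose values are assumptions for illustration only, not the Twin-ViMReg formulation.

```python
# Per-bone rigid transforms applied to toy point sets for femur, patella, and tibia-fibula.
import numpy as np

def rigid_transform(points: np.ndarray, angles_deg, translation) -> np.ndarray:
    """Rotate an (N, 3) point set with R = Rz @ Ry @ Rx, then translate it."""
    ax, ay, az = np.deg2rad(angles_deg)
    rx = np.array([[1, 0, 0], [0, np.cos(ax), -np.sin(ax)], [0, np.sin(ax), np.cos(ax)]])
    ry = np.array([[np.cos(ay), 0, np.sin(ay)], [0, 1, 0], [-np.sin(ay), 0, np.cos(ay)]])
    rz = np.array([[np.cos(az), -np.sin(az), 0], [np.sin(az), np.cos(az), 0], [0, 0, 1]])
    return points @ (rz @ ry @ rx).T + np.asarray(translation)

bones = {
    "femur": np.random.rand(100, 3),
    "patella": np.random.rand(50, 3),
    "tibia_fibula": np.random.rand(120, 3),
}
poses = {                                   # per-bone rigid parameters (assumed values)
    "femur": ((5.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    "patella": ((2.0, 1.0, 0.0), (0.5, 0.0, 0.5)),
    "tibia_fibula": ((-8.0, 0.0, 0.0), (0.0, 0.0, -1.0)),
}
moved = {name: rigid_transform(pts, *poses[name]) for name, pts in bones.items()}
print({k: v.shape for k, v in moved.items()})
```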
{"title":"Twin-ViMReg: DXR driven synthetic dynamic Standing-CBCTs through Twin Vision Mamba-based 2D/3D registration","authors":"Jiashun Wang ,&nbsp;Hao Tang ,&nbsp;Zhan Wu ,&nbsp;Yikun Zhang ,&nbsp;Yan Xi ,&nbsp;Yang Chen ,&nbsp;Chunfeng Yang ,&nbsp;Yixin Zhou ,&nbsp;Hui Tang","doi":"10.1016/j.compmedimag.2025.102648","DOIUrl":"10.1016/j.compmedimag.2025.102648","url":null,"abstract":"<div><div>Medical imaging of the knee joint under physiological weight bearing is crucial for diagnosing and analyzing knee lesions. Existing modalities have limitations: Standing Cone-Beam Computed Tomography (Standing-CBCT) provides high-resolution 3D data but with long acquisition time and only a single static view, while Dynamic X-ray Imaging (DXR) captures continuous motion but lacks 3D structural information. These limitations motivate the need for dynamic 3D knee generation through 2D/3D registration of Standing-CBCT and DXR. Anatomically, although the femur, patella, and tibia–fibula undergo rigid motion, the joint as a whole exhibits non-rigid behavior. Consequently, existing rigid or non-rigid 2D/3D registration methods fail to fully address this scenario. We propose Twin-ViMReg, a twin-stream 2D/3D registration framework for multiple correlated objects in the knee joint. It extends conventional 2D/3D registration paradigm by establishing a pair of twined sub-tasks. By introducing a Multi-Objective Spatial Transformation (MOST) module, it models inter-object correlations and enhances registration robustness. The Vision Mamba-based encoder also strengthens the representation capacity of the method. We used 1,500 simulated data pairs from 10 patients for training and 56 real data pairs from 3 patients for testing. Quantitative evaluation shows that the mean TRE reached 3.36 mm, the RSR was 8.93% higher than the SOTA methods. With an average computation time of 1.22 s per X-ray image, Twin-ViMReg enables efficient 2D/3D knee joint registration within seconds, making it a practical and promising solution.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"125 ","pages":"Article 102648"},"PeriodicalIF":4.9,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145214148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Collect vascular specimens in one cabinet: A hierarchical prompt-guided universal model for 3D vascular segmentation
IF 4.9 · Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102650
Yinuo Wang, Cai Meng, Zhe Xu
Accurate segmentation of vascular structures in volumetric medical images is critical for disease diagnosis and surgical planning. While deep neural networks have shown remarkable effectiveness, existing methods often rely on separate models tailored to specific modalities and anatomical regions, resulting in redundant parameters and limited generalization. Recent universal models address broader segmentation tasks but struggle with the unique challenges of vascular structures. To overcome these limitations, we first present VasBench, a new comprehensive vascular segmentation benchmark comprising nine sub-datasets spanning diverse modalities and anatomical regions. Building on this foundation, we introduce VasCab, a novel prompt-guided universal model for volumetric vascular segmentation, designed to “collect vascular specimens in one cabinet”. Specifically, VasCab is equipped with learnable domain and topology prompts to capture shared and unique vascular characteristics across diverse data domains, complemented by morphology perceptual loss to address complex morphological variations. Experimental results demonstrate that VasCab surpasses individual models and state-of-the-art medical foundation models across all test datasets, showcasing exceptional cross-domain integration and precise modeling of vascular morphological variations. Moreover, VasCab exhibits robust performance in downstream tasks, underscoring its versatility and potential for unified vascular analysis. This study marks a significant step toward universal vascular segmentation, offering a promising solution for unified vascular analysis across heterogeneous datasets. Code and dataset are available at https://github.com/mileswyn/VasCab.
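The sketch below illustrates one plausible form of the learnable domain prompts mentioned in the abstract: an embedding table indexed by sub-dataset ID whose vectors are added to shared backbone features. The lookup scheme, tensor shapes, and injection point are assumptions, not the released VasCab design.

```python
# Illustrative learnable per-domain prompt embeddings conditioning a shared 3D backbone.
import torch
import torch.nn as nn

class DomainPromptBank(nn.Module):
    def __init__(self, num_domains: int, channels: int):
        super().__init__()
        # one learnable prompt vector per data domain (e.g., coronary CTA, brain MRA, ...)
        self.prompts = nn.Embedding(num_domains, channels)

    def forward(self, feat: torch.Tensor, domain_id: torch.Tensor) -> torch.Tensor:
        p = self.prompts(domain_id)                        # (B, C)
        return feat + p.view(p.shape[0], -1, 1, 1, 1)      # broadcast over the 3D volume

bank = DomainPromptBank(num_domains=9, channels=32)        # nine sub-datasets, as in VasBench
feat = torch.randn(2, 32, 16, 16, 16)                      # shared 3D backbone features
dom = torch.tensor([0, 5])                                 # sub-dataset indices for the batch
print(bank(feat, dom).shape)                               # torch.Size([2, 32, 16, 16, 16])
```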
{"title":"Collect vascular specimens in one cabinet: A hierarchical prompt-guided universal model for 3D vascular segmentation","authors":"Yinuo Wang ,&nbsp;Cai Meng ,&nbsp;Zhe Xu","doi":"10.1016/j.compmedimag.2025.102650","DOIUrl":"10.1016/j.compmedimag.2025.102650","url":null,"abstract":"<div><div>Accurate segmentation of vascular structures in volumetric medical images is critical for disease diagnosis and surgical planning. While deep neural networks have shown remarkable effectiveness, existing methods often rely on separate models tailored to specific modalities and anatomical regions, resulting in redundant parameters and limited generalization. Recent universal models address broader segmentation tasks but struggle with the unique challenges of vascular structures. To overcome these limitations, we first present <strong>VasBench</strong>, a new comprehensive vascular segmentation benchmark comprising nine sub-datasets spanning diverse modalities and anatomical regions. Building on this foundation, we introduce <strong>VasCab</strong>, a novel prompt-guided universal model for volumetric vascular segmentation, designed to “collect vascular specimens in one cabinet”. Specifically, VasCab is equipped with learnable domain and topology prompts to capture shared and unique vascular characteristics across diverse data domains, complemented by morphology perceptual loss to address complex morphological variations. Experimental results demonstrate that VasCab surpasses individual models and state-of-the-art medical foundation models across all test datasets, showcasing exceptional cross-domain integration and precise modeling of vascular morphological variations. Moreover, VasCab exhibits robust performance in downstream tasks, underscoring its versatility and potential for unified vascular analysis. This study marks a significant step toward universal vascular segmentation, offering a promising solution for unified vascular analysis across heterogeneous datasets. Code and dataset are available at <span><span>https://github.com/mileswyn/VasCab</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"125 ","pages":"Article 102650"},"PeriodicalIF":4.9,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145201977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Enhancing intracranial vessel segmentation using diffusion models without manual annotation for 3D Time-of-Flight Magnetic Resonance Angiography
IF 4.9 · Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102651
Jonghun Kim, Inye Na, Jiwon Chung, Ha-Na Song, Kyungseo Kim, Seongvin Ju, Mi-Yeon Eun, Woo-Keun Seo, Hyunjin Park
Intracranial vessel segmentation is essential for managing brain disorders, facilitating early detection and precise intervention for stroke and aneurysm. Time-of-Flight Magnetic Resonance Angiography (TOF-MRA) is a commonly used vascular imaging technique for segmenting brain vessels. Traditional rule-based MRA segmentation methods are efficient but suffer from instability and poor performance. Deep learning models, including diffusion models, have recently gained attention in medical image segmentation. However, they require ground truth for training, which is labor-intensive and time-consuming to obtain. We propose a novel segmentation method that combines the strengths of rule-based and diffusion models to improve segmentation without relying on explicit labels. Our model adopts a Frangi filter to help with vessel detection and modifies the diffusion models to exclude memory-intensive attention modules, improving efficiency. Our condition network concatenates the feature maps to further enhance the segmentation process. Quantitative and qualitative evaluations on two datasets demonstrate that our approach not only maintains the integrity of the vascular regions but also substantially reduces noise, offering a robust solution for segmenting intracranial vessels. Our results suggest a basis for improved patient care in disorders involving brain vessels. Our code is available at github.com/jongdory/Vessel-Diffusion.
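As a small illustration of the rule-based component, the sketch below runs the Frangi vesselness filter (as implemented in scikit-image) on a 2D slice and thresholds it into a rough vessel mask that could serve as weak guidance. The threshold value and the 2D usage are assumptions; the paper works with 3D TOF-MRA volumes and a modified diffusion model.

```python
# Frangi vesselness on a toy 2D slice, thresholded into a crude pseudo-label.
import numpy as np
from skimage.filters import frangi

slice_2d = np.random.rand(128, 128).astype(np.float32)    # stand-in for a TOF-MRA slice
vesselness = frangi(slice_2d, sigmas=range(1, 4), black_ridges=False)
pseudo_label = vesselness > 0.05                           # crude vessel mask for weak guidance
print(vesselness.shape, int(pseudo_label.sum()))
```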
{"title":"Enhancing intracranial vessel segmentation using diffusion models without manual annotation for 3D Time-of-Flight Magnetic Resonance Angiography","authors":"Jonghun Kim ,&nbsp;Inye Na ,&nbsp;Jiwon Chung ,&nbsp;Ha-Na Song ,&nbsp;Kyungseo Kim ,&nbsp;Seongvin Ju ,&nbsp;Mi-Yeon Eun ,&nbsp;Woo-Keun Seo ,&nbsp;Hyunjin Park","doi":"10.1016/j.compmedimag.2025.102651","DOIUrl":"10.1016/j.compmedimag.2025.102651","url":null,"abstract":"<div><div>Intracranial vessel segmentation is essential for managing brain disorders, facilitating early detection and precise intervention of stroke and aneurysm. Time-of-Flight Magnetic Resonance Angiography (TOF-MRA) is a commonly used vascular imaging technique for segmenting brain vessels. Traditional rule-based MRA segmentation methods were efficient, but suffered from instability and poor performance. Deep learning models, including diffusion models, have recently gained attention in medical image segmentation. However, they require ground truth for training, which is labor-intensive and time-consuming to obtain. We propose a novel segmentation method that combines the strengths of rule-based and diffusion models to improve segmentation without relying on explicit labels. Our model adopts a Frangi filter to help with vessel detection and modifies the diffusion models to exclude memory-intensive attention modules to improve efficiency. Our condition network concatenates the feature maps to further enhance the segmentation process. Quantitative and qualitative evaluations on two datasets demonstrate that our approach not only maintains the integrity of the vascular regions but also substantially reduces noise, offering a robust solution for segmenting intracranial vessels. Our results suggest a basis for improved patient care in disorders involving brain vessels. Our code is available at <span><span>github.com/jongdory/Vessel-Diffusion</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"125 ","pages":"Article 102651"},"PeriodicalIF":4.9,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145259815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0