
Latest articles in Computerized Medical Imaging and Graphics

Coronary artery calcification segmentation with sparse annotations in intravascular OCT: Leveraging self-supervised learning and consistency regularization
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102653
Chao Li , Zhifeng Qin , Zhenfei Tang , Yidan Wang , Bo Zhang , Jinwei Tian , Zhao Wang
Assessing coronary artery calcification (CAC) is crucial in evaluating the progression of atherosclerosis and planning percutaneous coronary intervention (PCI). Intravascular Optical Coherence Tomography (OCT) is a commonly used imaging tool for evaluating CAC at the micrometer scale and in three dimensions for optimizing PCI. While existing deep learning methods have proven effective in OCT image analysis, they are hindered by the lack of large-scale, high-quality labels needed to train deep neural networks that can reach human-level performance in practice. In this work, we propose an annotation-efficient approach for segmenting CAC in intravascular OCT images, leveraging self-supervised learning and consistency regularization. We employ a transformer encoder paired with a simple linear projection layer for self-supervised pre-training on unlabeled OCT data. Subsequently, a transformer-based segmentation model is fine-tuned on sparsely annotated OCT pullbacks with a contrast loss, using a combination of unlabeled and labeled data. We collected 2,549,073 unlabeled OCT images from 7,108 OCT pullbacks for pre-training, and 1,106,347 sparsely annotated OCT images from 3,025 OCT pullbacks for model training and testing. The proposed approach consistently outperformed existing sparsely supervised methods on both internal and external datasets. In addition, extensive comparisons under full, partial, and sparse annotation schemes substantiated its high annotation efficiency. With an 80% reduction in image labeling effort, our method has the potential to expedite the development of deep learning models for processing large-scale medical image data.
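The abstract combines supervision on sparsely annotated pixels with a consistency term between two augmented views. A minimal sketch of such a training signal is shown below; the function names, the masked cross-entropy form, and the simple mean-squared consistency term are illustrative assumptions, not the authors' exact formulation.

```python
# Sketch: sparse supervision + consistency regularization (assumed forms).
import math

def sparse_supervised_loss(pred, label, mask):
    """Binary cross-entropy averaged only over annotated pixels (mask == 1)."""
    total, count = 0.0, 0
    for p, y, m in zip(pred, label, mask):
        if m:
            p = min(max(p, 1e-7), 1 - 1e-7)  # clamp for numerical safety
            total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            count += 1
    return total / max(count, 1)

def consistency_loss(pred_a, pred_b):
    """Mean squared difference between predictions on two augmented views."""
    return sum((a - b) ** 2 for a, b in zip(pred_a, pred_b)) / len(pred_a)

def total_loss(pred_a, pred_b, label, mask, lam=0.5):
    """Supervised term on labeled pixels plus weighted consistency term."""
    return sparse_supervised_loss(pred_a, label, mask) + lam * consistency_loss(pred_a, pred_b)

# Toy example: 4 pixels, only the first 2 carry annotations.
pred_view1 = [0.9, 0.2, 0.8, 0.4]
pred_view2 = [0.8, 0.3, 0.7, 0.5]
labels     = [1,   0,   1,   0]
annotated  = [1,   1,   0,   0]
loss = total_loss(pred_view1, pred_view2, labels, annotated)
```

Note how unannotated pixels still contribute through the consistency term, which is the mechanism that lets unlabeled data shape the decision boundary.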
Citations: 0
SA2Net: Scale-adaptive structure-affinity transformation for spine segmentation from ultrasound volume projection imaging
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102649
Hao Xie , Zixun Huang , Yushen Zuo , Yakun Ju , Frank H.F. Leung , N.F. Law , Kin-Man Lam , Yong-Ping Zheng , Sai Ho Ling
Spine segmentation, based on ultrasound volume projection imaging (VPI), plays a vital role in intelligent scoliosis diagnosis in clinical applications. However, this task faces several significant challenges. Firstly, the global contextual knowledge of spines may not be well learned if we neglect the high spatial correlation of different bone features. Secondly, the spine bones contain rich structural knowledge regarding their shapes and positions, which deserves to be encoded into the segmentation process. To address these challenges, we propose a novel scale-adaptive structure-aware network (SA2Net) for effective spine segmentation. First, we propose a scale-adaptive complementary strategy to learn the cross-dimensional long-distance correlation features for spinal images. Second, motivated by the consistency between multi-head self-attention in Transformers and semantic-level affinity, we propose a structure-affinity transformation that transforms semantic features with class-specific affinity and combines it with a Transformer decoder for structure-aware reasoning. In addition, we adopt a feature mixing loss aggregation method to enhance model training. This method improves the robustness and accuracy of the segmentation process. The experimental results demonstrate that our SA2Net achieves superior segmentation performance compared to other state-of-the-art methods. Moreover, the adaptability of SA2Net to various backbones enhances its potential as a promising tool for advanced scoliosis diagnosis using intelligent spinal image analysis.
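The abstract notes that a single self-attention head already computes a pairwise affinity matrix, which is the form an affinity-based feature transform takes. The sketch below shows that shared form: features re-aggregated through a softmax-normalized pairwise dot-product affinity. The shapes and the plain dot-product affinity are assumptions for illustration, not SA2Net's exact module.

```python
# Sketch: affinity-weighted feature transform (single attention-head form).
import math

def softmax(row):
    m = max(row)                      # subtract max for numerical stability
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]

def affinity_transform(feats):
    """feats: list of N d-dim vectors -> affinity-aggregated features."""
    d = len(feats[0])
    scale = math.sqrt(d)
    # Pairwise scaled dot-product affinities, row-softmaxed (as in attention).
    aff = [softmax([sum(a * b for a, b in zip(fi, fj)) / scale for fj in feats])
           for fi in feats]
    # Each output vector is a convex (affinity-weighted) mix of all inputs.
    return [[sum(aff[i][j] * feats[j][k] for j in range(len(feats)))
             for k in range(d)] for i in range(len(feats))]

out = affinity_transform([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
```

Because each row of the affinity matrix sums to 1, every output feature stays a convex combination of the inputs, so mutually similar regions reinforce each other without inventing new feature values.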
Citations: 0
Deep learning for automatic vertebra analysis: A methodological survey of recent advances
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102652
Zhuofan Xie , Zishan Lin , Enlong Sun , Fengyi Ding , Jie Qi , Shen Zhao
Automated vertebra analysis (AVA), encompassing vertebra detection and segmentation, plays a critical role in computer-aided diagnosis, surgical planning, and postoperative evaluation in spine-related clinical workflows. Despite notable progress, AVA continues to face key challenges, including variations in the field of view (FOV), complex vertebral morphology, limited availability of high-quality annotated data, and performance degradation under domain shifts. Over the past decade, numerous studies have employed deep learning (DL) to tackle these issues, introducing advanced network architectures and innovative learning paradigms. However, the rapid evolution of these methods has not been comprehensively captured by existing surveys, resulting in a knowledge gap regarding the current state of the field. To address this, this paper presents an up-to-date review that systematically summarizes recent advances. The review begins by consolidating publicly available datasets and evaluation metrics to support standardized benchmarking. Recent DL-based AVA approaches are then analyzed from two methodological perspectives: network architecture improvement and learning strategy design. Finally, an examination of persistent technical barriers and emerging clinical needs that are shaping future research directions is provided. These include multimodal learning, domain generalization, and the integration of foundation models. As the most current survey in the field, this review provides a comprehensive and structured synthesis aimed at guiding future research toward the development of robust, generalizable, and clinically deployable AVA systems in the era of intelligent medical imaging.
Citations: 0
SGRRG: Leveraging radiology scene graphs for improved and abnormality-aware radiology report generation
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-09-15 · DOI: 10.1016/j.compmedimag.2025.102644
Jun Wang , Lixing Zhu , Abhir Bhalerao , Yulan He
Radiology report generation (RRG) methods often lack sufficient medical knowledge to produce clinically accurate reports. A scene graph provides comprehensive information for describing objects within an image. However, automatically generated radiology scene graphs (RSG) may contain noisy annotations and highly overlapping regions, posing challenges in utilizing an RSG to enhance RRG. To this end, we propose Scene Graph aided RRG (SGRRG), a framework that leverages an automatically generated RSG and copes with noisy supervision problems in the RSG with a transformer-based module, effectively distilling medical knowledge in an end-to-end manner. SGRRG is composed of a dedicated scene graph encoder responsible for translating the radiograph into an RSG, and a scene graph-aided decoder that takes advantage of both patch-level and region-level visual information and mitigates the noisy annotation problem in the RSG. The incorporation of both patch-level and region-level features, alongside the integration of the essential RSG construction modules, enhances our framework's flexibility and robustness, enabling it to readily exploit prior advanced RRG techniques. A fine-grained, sentence-level attention method is designed to better distill the RSG information. Additionally, we introduce two proxy tasks to enhance the model's ability to produce clinically accurate reports. Extensive experiments demonstrate that SGRRG outperforms previous state-of-the-art methods in report generation and can better capture abnormal findings. Code is available at https://github.com/Markin-Wang/SGRRG.
Citations: 0
Unveiling hidden risks: A Holistically-Driven Weak Supervision framework for ultra-short-term ACS prediction using CCTA
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-09-15 · DOI: 10.1016/j.compmedimag.2025.102636
Zhen Liu , Bangkang Fu , Jiahui Mao , Junjie He , Jiangyue Xiang , Hongjin Li , Yunsong Peng , Bangguo Li , Rongpin Wang
This paper proposes MH-STR, a novel end-to-end framework for predicting the three-month risk of Acute Coronary Syndrome (ACS) from Coronary CT Angiography (CCTA) images. The model combines hybrid attention mechanisms with convolutional networks to capture subtle and irregular lesion patterns that are difficult to detect visually. A stage-wise transfer learning strategy helps distill general features and transfer vascular-specific knowledge. To reconcile feature scale mismatches in the dual-branch architecture, we introduce a wavelet-based multi-scale fusion module for effective integration across scales. Experiments show that MH-STR achieves an AUC of 0.834, an F1 score of 0.82, and a precision of 0.92, outperforming existing methods and highlighting its potential for improving ACS risk prediction.
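The abstract's wavelet-based multi-scale fusion module addresses a length mismatch between branches. The sketch below shows the underlying idea with a one-level 1D Haar transform, which halves a feature sequence into low/high-frequency bands so a high-resolution branch can be fused with a half-resolution one. The Haar step is standard; using it as a stand-in for the paper's fusion module, and the simple averaging fusion, are assumptions.

```python
# Sketch: Haar-based scale alignment before dual-branch fusion (assumed form).
import math

SQRT2 = math.sqrt(2.0)

def haar_1d(x):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    approx = [(x[2 * i] + x[2 * i + 1]) / SQRT2 for i in range(len(x) // 2)]
    detail = [(x[2 * i] - x[2 * i + 1]) / SQRT2 for i in range(len(x) // 2)]
    return approx, detail

def fuse(high_res_feat, low_res_feat):
    """Halve the high-res branch via the Haar approximation, then average."""
    approx, _ = haar_1d(high_res_feat)
    # Undo the sqrt(2) gain of the Haar approximation before averaging.
    return [(a / SQRT2 + b) / 2 for a, b in zip(approx, low_res_feat)]

fused = fuse([1.0, 1.0, 2.0, 2.0], [1.0, 2.0])
```

The detail band, discarded here for brevity, is what a fuller fusion module would use to inject high-frequency lesion cues back into the merged representation.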
Citations: 0
A segmentation-based hierarchical feature interaction attention model for gene mutation status identification in colorectal cancer
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-09-13 · DOI: 10.1016/j.compmedimag.2025.102646
Yu Miao , Sijie Song , Lin Zhao , Jun Zhao , Yingsen Wang , Ran Gong , Yan Qiang , Hua Zhang , Juanjuan Zhao
Precise identification of Kirsten Rat Sarcoma (KRAS) gene mutation status is critical for both qualitative analysis of colorectal cancer and formulation of personalized therapeutic regimens. In this paper, we propose a Segmentation-based Hierarchical feature Interaction Attention Model (SHIAM) that synergizes multi-task learning with hierarchical feature integration, aiming to achieve accurate prediction of the KRAS gene mutation status. Specifically, we integrate segmentation and classification tasks, sharing feature representations between them. To fully focus on the lesion areas at different levels and their potential associations, we design a multi-level synergistic attention block that enables adaptive fusion of lesion characteristics of varying granularity with their contextual associations. To transcend the constraints of conventional methodologies in modeling long-range relationships, we design a global collaborative interaction attention module, an efficient improved long-range perception Transformer. As the core component of this module, the long-range perception block provides robust support for mining feature integrity with its excellent perception ability. Furthermore, we introduce a hybrid feature engineering strategy that integrates hand-crafted features encoded as statistical information entropy with automatically learned deep representations, thereby establishing a complementary feature space. Our SHIAM has been rigorously trained and validated on the colorectal cancer dataset provided by Shanxi Cancer Hospital. The results show that it achieves an accuracy of 89.42% and an AUC value of 95.89% in KRAS gene mutation status prediction, with comprehensive performance superior to all current non-invasive assays. In clinical practice, our model possesses the capability to enable computer-aided diagnosis, effectively assisting physicians in formulating suitable personalized treatment plans for patients.
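The hybrid feature strategy above pairs a hand-crafted statistic (information entropy) with a learned embedding. A minimal sketch of that pairing is shown below; the 8-bin intensity histogram and plain concatenation are illustrative assumptions, not the paper's exact encoding.

```python
# Sketch: hand-crafted entropy statistic + learned embedding (assumed forms).
import math

def shannon_entropy(pixels, bins=8, lo=0.0, hi=1.0):
    """Shannon entropy (bits) of the intensity histogram of a flat pixel list."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for p in pixels:
        idx = min(int((p - lo) / width), bins - 1)  # clamp top edge into last bin
        counts[idx] += 1
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)

def hybrid_feature(pixels, deep_embedding):
    """Concatenate the entropy statistic with a learned feature vector."""
    return [shannon_entropy(pixels)] + list(deep_embedding)

# Toy lesion patch: two intensity modes -> entropy of exactly 1 bit.
feat = hybrid_feature([0.1, 0.9, 0.1, 0.9], [0.5, -0.2])
```

The appeal of the design is complementarity: entropy summarizes texture heterogeneity a network may under-weight, while the embedding carries task-specific cues no fixed statistic captures.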
Citations: 0
A self-attention model for robust rigid slice-to-volume registration of functional MRI
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-09-13 · DOI: 10.1016/j.compmedimag.2025.102643
Samah Khawaled , Onur Afacan , Simon K. Warfield , Moti Freiman
Functional Magnetic Resonance Imaging (fMRI) is vital in neuroscience, enabling investigations into brain disorders, treatment monitoring, and brain function mapping. However, head motion during fMRI scans, occurring between shots of slice acquisition, can result in distortion, biased analyses, and increased costs due to the need for scan repetitions. Therefore, retrospective slice-level motion correction through slice-to-volume registration (SVR) is crucial. Previous studies have utilized deep learning (DL) based models to address the SVR task; however, they overlooked the uncertainty stemming from the input stack of slices and did not assign weighting or scoring to each slice. Treating all slices equally ignores the variability in their relevance, leading to suboptimal predictions. In this work, we introduce an end-to-end SVR model for aligning 2D fMRI slices with a 3D reference volume, incorporating a self-attention mechanism to enhance robustness against input data variations and uncertainties. Our SVR model utilizes independent slice and volume encoders and a self-attention module to assign pixel-wise scores for each slice. We used the publicly available Healthy Brain Network (HBN) dataset. We split the volumes into training (64%), validation (16%), and test (20%) sets. To conduct the simulated motion study, we synthesized rigid transformations across a wide range of parameters and applied them to the reference volumes. Slices were then sampled according to the acquisition protocol to generate 2,000, 500, and 200 3D volume–2D slice pairs for the training, validation, and test sets, respectively. Our experimental results demonstrate that our model achieves competitive performance in terms of alignment accuracy compared to state-of-the-art deep learning-based methods (Euclidean distance of 0.93 mm vs. 1.86 mm; paired t-test, p < 0.03). Furthermore, our approach exhibits faster registration speed compared to conventional iterative methods (0.096 s vs. 1.17 s). Our end-to-end SVR model facilitates real-time head motion tracking during fMRI acquisition, ensuring reliability and robustness against uncertainties in the inputs.
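The simulated-motion evaluation above synthesizes rigid transformations and scores alignment by Euclidean distance. The sketch below illustrates that pipeline in 2D: apply a rigid transform (rotation + translation) under ground-truth and predicted parameters, then take the mean Euclidean distance between the two transformed point sets. Scoring on slice corner points is a common convention assumed here, not taken from the paper.

```python
# Sketch: rigid 2D transform and Euclidean registration error (assumed protocol).
import math

def rigid_2d(points, theta, tx, ty):
    """Rotate points by theta (radians) and translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def mean_euclidean_error(points, gt, pred):
    """Mean distance (same units as points, e.g. mm) between the two warps."""
    pa = rigid_2d(points, *gt)
    pb = rigid_2d(points, *pred)
    return sum(math.dist(p, q) for p, q in zip(pa, pb)) / len(points)

# Corner points of a hypothetical 100 mm slice; parameters are (theta, tx, ty).
corners = [(0.0, 0.0), (0.0, 100.0), (100.0, 0.0), (100.0, 100.0)]
err = mean_euclidean_error(corners, gt=(0.05, 2.0, -1.0), pred=(0.05, 2.0, -1.0))
```

Because rotation error grows with distance from the center, corner-point distances penalize angular misprediction more than a center-only metric would.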
A self-attention model for robust rigid slice-to-volume registration of functional MRI
IF 4.9 · CAS Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-09-13 · DOI: 10.1016/j.compmedimag.2025.102643
Samah Khawaled, Onur Afacan, Simon K. Warfield, Moti Freiman
Citations: 0
Mamba-based context-aware local feature network for vessel detail enhancement
IF 4.9 · CAS Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-09-12 · DOI: 10.1016/j.compmedimag.2025.102645
Keyi Han, Anqi Xiao, Jie Tian, Zhenhua Hu

Objective

Blood vessel analysis is essential in various clinical fields. Detailed vascular imaging enables clinicians to assess abnormalities and make timely, effective interventions. Near-infrared-II (NIR-II, 1000–1700 nm) fluorescence imaging offers superior resolution, sensitivity, and deeper tissue visualization, making it highly promising for vascular imaging. However, deep vessels exhibit relatively low contrast, making differentiation challenging, and accurate vessel segmentation remains a difficult task.

Methods

We propose CALFNet, a context-aware local feature network based on the Mamba module, which can segment finer vascular detail in low-contrast regions. CALFNet follows a UNet-like architecture overall, with a ResNet-based encoder for extracting local features and a Mamba-based context-aware module in the latent space for awareness of the global context. By incorporating the global vessel contextual information, the network can enhance segmentation performance in locally low-contrast areas, capturing finer vessel structures more effectively. Furthermore, a feature-enhance module between the encoder and decoder is designed to preserve critical local features from the encoder and use them to further refine the vascular details in the decoder's feature representations.
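The feature-enhance idea, as described, re-injects local encoder detail into the decoder path. The paper's module is learned; the fixed sigmoid gate below is only a hypothetical numpy sketch of the general pattern (encoder features weighting themselves into the decoder stream), with all shapes and the gating rule assumed for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feature_enhance(encoder_feat, decoder_feat):
    """Hypothetical sketch: gate local-detail encoder features pixel-wise and
    add them to the decoder features, refining fine vascular structure.
    A learned module would replace this fixed gate with trained parameters."""
    gate = sigmoid(encoder_feat)              # per-pixel weighting from local features
    return decoder_feat + gate * encoder_feat

rng = np.random.default_rng(0)
enc = rng.normal(size=(1, 16, 32, 32))        # (batch, channels, H, W)
dec = rng.normal(size=(1, 16, 32, 32))
out = feature_enhance(enc, dec)
print(out.shape)
```

With zero encoder features the decoder stream passes through unchanged, which is the behavior one would want from a skip-style refinement: it can only add detail where the encoder found some.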

Results

We conducted experiments on two types of clinical datasets, including an NIR-II fluorescent vascular imaging dataset and retinal vessel datasets captured under visible light. The results show that CALFNet outperforms the comparison methods, demonstrating superior robustness and achieving more accurate vessel segmentation, particularly in low-contrast regions.

Conclusion and Significance

CALFNet is an effective vessel segmentation network showing better performance in accurately segmenting vessels within low-contrast regions. It can enhance the capability of NIR-II fluorescence imaging for vascular analysis, providing valuable support for clinical diagnosis and medical intervention.
Citations: 0
Accurate and fast monocular endoscopic depth estimation of structure-content integrated diffusion
IF 4.9 · CAS Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-09-06 · DOI: 10.1016/j.compmedimag.2025.102640
Min Tan, Yushun Tao, Boyun Zheng, Gaosheng Xie, Zeyang Xia, Jing Xiong
Endoscopic depth estimation is crucial for video understanding, robotic navigation, and 3D reconstruction in minimally invasive surgeries. However, existing methods for monocular depth estimation often struggle with the challenging conditions of endoscopic imagery, such as complex illumination, narrow luminal spaces, and low-contrast surfaces, resulting in inaccurate depth predictions. To address these challenges, we propose Structure-Content Integrated Diffusion Estimation (SCIDE) for accurate and fast endoscopic depth estimation. Specifically, we introduce the Structure Content Extractor (SC-Extractor), a module specifically designed to extract structure and content priors to guide the depth estimation process in endoscopic environments. Additionally, we propose the Fast Optimized Diffusion Sampler (FODS) to meet the real-time needs of endoscopic surgery scenarios. FODS is a general sampling mechanism that optimizes the selection of time steps in diffusion models. Our method (SCIDE) shows remarkable performance, with an RMSE of 0.0875 and a 74.2% reduction in inference time when using FODS. These results demonstrate that our SCIDE framework achieves state-of-the-art accuracy in endoscopic depth estimation, making real-time application feasible in endoscopic surgeries. https://misrobotx.github.io/scide/
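Optimizing the selection of diffusion time steps is what makes samplers like FODS fast: instead of traversing all training steps, inference visits a short, well-chosen subsequence. The abstract does not specify FODS's selection rule, so the sketch below shows only standard baselines (uniform and quadratic spacing, as in DDIM-style striding) as a hedged illustration of the mechanism.

```python
import numpy as np

def select_timesteps(num_train_steps=1000, num_inference_steps=10, schedule="quadratic"):
    """Illustrative time-step subsampling for a diffusion sampler.
    'uniform' strides evenly through training steps; 'quadratic' spends more
    steps near t=0, where fine image detail is resolved. FODS's actual rule
    is not specified here -- these are common baselines, not the paper's method."""
    if schedule == "uniform":
        steps = np.linspace(0, num_train_steps - 1, num_inference_steps)
    elif schedule == "quadratic":
        steps = np.linspace(0, np.sqrt(num_train_steps - 1), num_inference_steps) ** 2
    else:
        raise ValueError(f"unknown schedule: {schedule}")
    # round to integer steps, deduplicate, and order from noisy (high t) to clean (t=0)
    return np.unique(steps.round().astype(int))[::-1]

ts = select_timesteps(1000, 10, "quadratic")
print(ts)
```

Cutting 1000 denoising steps to 10 reduces the network-evaluation count by two orders of magnitude, which is the same lever behind the reported 74.2% inference-time reduction (the exact step counts there are the paper's, not this sketch's).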
Citations: 0
Towards Generic Abdominal Multi-Organ Segmentation with multiple partially labeled datasets
IF 4.9 · CAS Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-09-01 · DOI: 10.1016/j.compmedimag.2025.102642
Xiang Li, Faming Fang, Liyan Ma, Tieyong Zeng, Guixu Zhang, Ming Xu
An increasing number of publicly available datasets have facilitated the exploration of building universal medical segmentation models. Existing approaches address the partial-labeling problem of each dataset by harmonizing labels across datasets and independently focusing on the labeled foreground regions. However, significant challenges persist, particularly in the form of cross-site domain shifts and the limited utilization of partially labeled datasets. In this paper, we propose a GAMOS (Generic Abdominal Multi-Organ Segmentation) framework. Specifically, GAMOS integrates a self-guidance strategy to adapt diffusion models to the partial-labeling issue, while employing a self-distillation mechanism to effectively leverage unlabeled data. A sparse semantic memory is introduced to mitigate domain shifts by ensuring consistent representations in the latent space. To further enhance performance, we design a sparse similarity loss to align multi-view memory representations and enhance the discriminability and compactness of the memory vectors. Extensive experiments on real-world medical datasets demonstrate the superiority and generalization ability of GAMOS. It achieves a mean Dice Similarity Coefficient (DSC) of 91.33% and a mean 95th percentile Hausdorff Distance (HD95) of 1.83 on labeled foreground regions. For unlabeled foreground regions, GAMOS obtains a mean DSC of 86.88% and a mean HD95 of 3.85, outperforming existing state-of-the-art methods.
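The two metrics quoted here, DSC and HD95, are standard segmentation measures and worth pinning down: DSC rewards volume overlap, while HD95 penalizes boundary outliers (at the 95th percentile, so a few stray voxels do not dominate). Below is a brute-force numpy sketch in voxel units; published evaluations typically use surface points and physical spacing (e.g., via MedPy or SimpleITK), and the toy masks are assumptions for illustration.

```python
import numpy as np

def dice(pred, gt):
    """Dice Similarity Coefficient for binary masks."""
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

def hd95(pred, gt):
    """95th-percentile symmetric Hausdorff distance between foreground voxels.
    Brute-force over all foreground pairs, in voxel units (no spacing applied)."""
    p = np.argwhere(pred)
    g = np.argwhere(gt)
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)
    forward = d.min(axis=1)    # each pred voxel -> nearest gt voxel
    backward = d.min(axis=0)   # each gt voxel -> nearest pred voxel
    return np.percentile(np.concatenate([forward, backward]), 95)

# Toy 2D masks: a square organ and a slightly shifted prediction
gt = np.zeros((32, 32), bool);   gt[8:24, 8:24] = True
pred = np.zeros((32, 32), bool); pred[9:24, 8:25] = True
print(f"DSC = {dice(pred, gt):.4f}, HD95 = {hd95(pred, gt):.2f}")
```

A perfect mask scores DSC = 1 and HD95 = 0, which is why the reported 91.33% DSC / 1.83 HD95 on labeled regions versus 86.88% / 3.85 on unlabeled ones reads as a graceful, not catastrophic, degradation.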
Citations: 0