
Latest Articles in Computerized Medical Imaging and Graphics

C3MT: Confidence-Calibrated Contrastive Mean Teacher for semi-supervised medical image segmentation
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-02-01 DOI: 10.1016/j.compmedimag.2026.102721
Xianmin Wang , Mingfeng Lin , Jing Li
Semi-supervised learning is crucial for medical image segmentation due to the scarcity of labeled data. However, existing methods that combine consistency regularization and pseudo-labeling often suffer from inadequate feature representation, suboptimal subnetwork disagreement, and noisy pseudo-labels. To address these limitations, this paper proposes a novel Confidence-Calibrated Contrastive Mean Teacher (C3MT) framework. First, C3MT introduces a Contrastive Learning-based co-training strategy, where an adaptive disagreement adjustment mechanism dynamically regulates the divergence between student models. This not only preserves representation diversity but also stabilizes the training process. Second, C3MT introduces a Confidence-Calibrated and Category-Aligned uncertainty-guided region mixing strategy. The confidence-calibrated mechanism filters out unreliable pseudo-labels, whereas the category-aligned design restricts region swapping to patches of the same semantic category, preserving anatomical coherence and preventing semantic inconsistency in the mixed samples. Together, these components significantly enhance feature representation, training stability, and segmentation quality, especially in challenging low-annotation scenarios. Extensive experiments on the ACDC, Synapse, and LA datasets show that C3MT consistently outperforms recent state-of-the-art methods. For example, on the ACDC dataset with 20% labeled data, C3MT achieves up to a 4.3% improvement in average Dice score and a reduction in HD95 of more than 1.0 mm compared with strong baselines. The implementation is publicly available at https://github.com/l1654485/C3MT.
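The linked repository holds the authors' implementation; purely as a hedged illustration of the confidence-calibrated filtering idea, the sketch below masks out pixels whose teacher confidence falls under a threshold before applying the pseudo-label loss. The threshold `tau` and all function names are assumptions, not the authors' code.

```python
# Hedged sketch of confidence-based pseudo-label filtering; threshold and
# names are assumptions, not the authors' released implementation.
import torch
import torch.nn.functional as F

def filtered_pseudo_label_loss(student_logits, teacher_logits, tau=0.9):
    """Cross-entropy against teacher pseudo-labels, keeping only pixels whose
    teacher confidence exceeds the (assumed) calibration threshold tau."""
    with torch.no_grad():
        probs = F.softmax(teacher_logits, dim=1)        # (B, C, H, W)
        conf, pseudo = probs.max(dim=1)                 # per-pixel confidence, label
        mask = (conf >= tau).float()                    # 1 = trusted pixel
    loss = F.cross_entropy(student_logits, pseudo, reduction="none")  # (B, H, W)
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```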
{"title":"C3MT: Confidence-Calibrated Contrastive Mean Teacher for semi-supervised medical image segmentation","authors":"Xianmin Wang ,&nbsp;Mingfeng Lin ,&nbsp;Jing Li","doi":"10.1016/j.compmedimag.2026.102721","DOIUrl":"10.1016/j.compmedimag.2026.102721","url":null,"abstract":"<div><div>Semi-supervised learning is crucial for medical image segmentation due to the scarcity of labeled data. However, existing methods that combine consistency regularization and pseudo-labeling often suffer from inadequate feature representation, suboptimal subnetwork disagreement, and noisy pseudo-labels. To address these limitations, this paper proposed a novel <strong>C</strong>onfidence-<strong>C</strong>alibrated <strong>C</strong>ontrastive <strong>M</strong>ean <strong>T</strong>eacher (C3MT) framework. First, C3MT introduces a Contrastive Learning-based co-training strategy, where an adaptive disagreement adjustment mechanism dynamically regulates the divergence between student models. This not only preserves representation diversity but also stabilizes the training process. Second, C3MT introduces a Confidence-Calibrated and Category-Aligned uncertainty-guided region mixing strategy. The confidence-calibrated mechanism filters out unreliable pseudo-labels, whereas the category-aligned design restricts region swapping to patches of the same semantic category, preserving anatomical coherence and preventing semantic inconsistency in the mixed samples. Together, these components significantly enhance feature representation, training stability, and segmentation quality, especially in challenging low-annotation scenarios. Extensive experiments on ACDC, Synapse, and LA datasets show that C3MT consistently outperforms recent state-of-the-art methods. For example, on the ACDC dataset with 20% labeled data, C3MT achieves up to a 4.3% improvement in average Dice score and a reduction in HD95 of more than 1.0 mm compared with strong baselines. The implementation is publicly available at <span><span>https://github.com/l1654485/C3MT</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102721"},"PeriodicalIF":4.9,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146138058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Unified model with random penalty entropy loss for robust nasogastric tube placement analysis in X-ray
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-02-01 DOI: 10.1016/j.compmedimag.2026.102715
GwiSeong Moon , Kyoung Min Moon , Inseo Park , Kanghee Lee , Doohee Lee , Woo Jin Kim , Yoon Kim , Ji Young Hong , Hyun-Soo Choi

Background and objective:

Accurate assessment of nasogastric (NG) tube placement is essential to prevent serious complications. However, manual chest X-ray verification is prone to human error and variability. We propose a unified deep learning model that jointly performs segmentation and classification to improve the generalization and reliability of automated NG tube placement assessment.

Methods:

We developed a unified architecture based on nnUNet, which was optimized simultaneously for segmentation and classification. To enhance robustness and reduce overconfidence, we introduce Random Penalty Entropy Loss, which dynamically scales entropy penalties during training. The model was evaluated on internal datasets (5674 chest X-rays from three South Korean hospitals) and an external dataset from MIMIC-CXR.
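The paper defines the exact loss; the sketch below is a minimal reading of the stated idea, assuming a confidence-penalty form (cross-entropy minus an entropy bonus) whose weight is re-drawn uniformly at random each training step. Names and the scaling scheme are assumptions.

```python
# Minimal sketch only: assumes the entropy penalty weight is re-drawn
# uniformly per step; the paper's exact scaling rule may differ.
import torch
import torch.nn.functional as F

def random_penalty_entropy_loss(logits, targets, max_penalty=0.5):
    """logits: (B, C) or (B, C, H, W); targets: matching class indices.
    Cross-entropy minus a randomly weighted entropy bonus, discouraging
    persistently overconfident predictions."""
    ce = F.cross_entropy(logits, targets)
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    lam = torch.empty(1).uniform_(0, max_penalty).item()   # random penalty scale
    return ce - lam * entropy   # rewarding entropy penalizes overconfidence
```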

Results:

On the internal test set, the proposed model outperformed the Wang 2-Stage method (F1: 93.94% vs. 87.39%), particularly in ambiguous cases. Baseline models using Focal Loss or Label Smoothing performed well internally but showed substantial performance drops and miscalibration externally. In contrast, our model with Random Penalty Entropy Loss achieved the highest external classification accuracy (F1: 66.34%, AUROC: 84.82%) and superior calibration (MCE: 0.429, ECE: 0.274).

Conclusion:

The proposed unified model surpasses existing two-stage approaches in classification and calibration. Incorporating Random Penalty Entropy Loss improves robustness and generalization across diverse clinical settings. These results highlight the model’s potential to reduce diagnostic errors and enhance patient safety in NG tube placement assessment.
{"title":"Unified model with random penalty entropy loss for robust nasogastric tube placement analysis in X-ray","authors":"GwiSeong Moon ,&nbsp;Kyoung Min Moon ,&nbsp;Inseo Park ,&nbsp;Kanghee Lee ,&nbsp;Doohee Lee ,&nbsp;Woo Jin Kim ,&nbsp;Yoon Kim ,&nbsp;Ji Young Hong ,&nbsp;Hyun-Soo Choi","doi":"10.1016/j.compmedimag.2026.102715","DOIUrl":"10.1016/j.compmedimag.2026.102715","url":null,"abstract":"<div><h3>Background and objective:</h3><div>An accurate nasogastric (NG) tube placement assessment is essential to prevent serious complications. However, manual chest X-ray verification is prone to human error and variability. We propose a unified deep learning model that jointly performs segmentation and classification to improve the generalization and reliability of automated NG tube placement assessment.</div></div><div><h3>Methods:</h3><div>We developed a unified architecture based on nnUNet, which was optimized simultaneously for segmentation and classification. To enhance robustness and reduce overconfidence, we introduce Random Penalty Entropy Loss, which dynamically scales entropy penalties during training. The model was evaluated on internal datasets (5674 chest X-rays from three South Korean hospitals) and an external dataset from MIMIC-CXR.</div></div><div><h3>Results:</h3><div>On the internal test set, the proposed model outperformed the Wang 2-Stage method (F1: 93.94% vs. 87.39%), particularly in ambiguous cases. Baseline models using Focal Loss or Label Smoothing performed well internally but showed substantial performance drops and miscalibration externally. In contrast, our model with Random Penalty Entropy Loss achieved the highest external classification accuracy (F1: 66.34%, AUROC: 84.82%) and superior calibration (MCE: 0.429, ECE: 0.274).</div></div><div><h3>Conclusion:</h3><div>The proposed unified model surpasses existing two-stage approaches in classification and calibration. Incorporating Random Penalty Entropy Loss improves robustness and generalization across diverse clinical settings. These results highlight the model’s potential to reduce diagnostic errors and enhance patient safety in NG tube placement assessment.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102715"},"PeriodicalIF":4.9,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Initial evaluation of a mixed-reality system for image-guided navigation during percutaneous liver tumor ablation
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-02-01 DOI: 10.1016/j.compmedimag.2026.102714
Dominik Spinczyk , Grzegorz Rosiak , Jarosław Żyłkowski , Krzysztof Milczarek , Dariusz Konecki , Karol Zaczkowski , Agata Tomaszewska , Łukasz Przepióra , Anna Wolińska-Sołtys , Piotr Sperka , Dawid Hajda , Ewa Piętka
Minimally invasive ablation remains a challenge for contemporary interventional radiology. This study aimed to investigate the feasibility of using a mixed-reality system for this type of treatment. A HoloLens mixed-reality and optical tracking system, which supports diagnosis, planning, and procedure implementation, was used for percutaneous liver tumor ablation. The system differentiated pathological liver changes at the diagnostic stage, allowing the entry point and target to be selected during planning. During the procedure, it provided real-time fusion of intraoperative ultrasound images with a pre-operative hologram. Additionally, the collision detection module detected collisions between the ablative needle and anatomical structures using the actual needle trajectory. The system was evaluated in 11 patients with cancerous liver lesions. The mean registration accuracy for target points selected at the planning stage was 2.8 mm during the procedure-support stage. Operator depth perception improved, the effective needle trajectory was shortened, and the radiation dose was reduced for both the patient and the operator owing to improved visibility of the needle within the patient's body. Compared with the classical two-dimensional view, operators reported a generally improved understanding of the spatial relationships between anatomical structures, along with improved depth perception of the operating field. An additional advantage indicated by the operators was the real-time highlighting of anatomical structures susceptible to damage along the needle trajectory, such as blood vessels, bile ducts, and the lungs, which lowers the risk of complications.
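The collision check can be pictured as a distance test between the planned needle path and points sampled from segmented anatomical structures; the geometry below is a hedged sketch under that assumption, not the system's actual implementation, and the safety margin is a placeholder.

```python
# Hedged sketch of the collision-check idea: flag anatomical structures that
# come within a safety margin of the planned needle path. Names and the
# margin are assumptions, not the system's actual implementation.
import numpy as np

def min_distance_to_segment(points, entry, target):
    """Minimum distance from each 3-D point to the needle segment entry->target."""
    d = target - entry                                    # segment direction
    t = np.clip((points - entry) @ d / (d @ d), 0.0, 1.0)
    closest = entry + t[:, None] * d                      # closest point on segment
    return np.linalg.norm(points - closest, axis=1)

def check_collisions(structure_points, entry, target, margin_mm=5.0):
    """True if any structure point lies within margin_mm of the path."""
    return bool((min_distance_to_segment(structure_points, entry, target) < margin_mm).any())

# Example: a vessel point 3 mm off a 100 mm trajectory triggers a warning.
entry, target = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 100.0])
vessel = np.array([[3.0, 0.0, 50.0]])
print(check_collisions(vessel, entry, target))            # True
```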
{"title":"Initial evaluation of a mixed-reality system for image-guided navigation during percutaneous liver tumor ablation","authors":"Dominik Spinczyk ,&nbsp;Grzegorz Rosiak ,&nbsp;Jarosław Żyłkowski ,&nbsp;Krzysztof Milczarek ,&nbsp;Dariusz Konecki ,&nbsp;Karol Zaczkowski ,&nbsp;Agata Tomaszewska ,&nbsp;Łukasz Przepióra ,&nbsp;Anna Wolińska-Sołtys ,&nbsp;Piotr Sperka ,&nbsp;Dawid Hajda ,&nbsp;Ewa Piętka","doi":"10.1016/j.compmedimag.2026.102714","DOIUrl":"10.1016/j.compmedimag.2026.102714","url":null,"abstract":"<div><div>Minimally invasive ablation is a challenge for contemporary interventional radiology. This study aimed to investigate the feasibility of utilizing a mixed-reality system for this type of treatment. A HoloLens mixed reality and optical tracking system, which supports diagnosis, planning, and procedure implementation, was used for percutaneous liver tumor ablation. The system differentiated pathological liver changes at the diagnostic stage, allowing for the selection of the entry point and target during planning. Meanwhile, it provided a real-time fusion of intraoperative ultrasound images with a pre-operative hologram during the procedure. Additionally, the collision detection module enabled the detection of collisions between the ablative needle and anatomical structures, utilizing the actual needle trajectory. The system was evaluated in 11 patients with cancerous liver lesions. The mean accuracy of target point registration, selected at the planning stage, was 2.8 mm during the procedure's supporting stage. Additionally, operator depth perception improved, the effective needle trajectory was shortened, and the radiation dose was reduced for both the patient and the operator due to improved visibility of the needle within the patient’s body. A generally improved understanding of the mutual spatial relationship between anatomical structures was observed compared to the classical two-dimensional view, along with improved depth perception of the operating field. An additional advantage indicated by the operators was the real-time highlighting of anatomical structures susceptible to damage by the needle trajectory, such as blood vessels, bile ducts, and the lungs, which lowers the risk of complications.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102714"},"PeriodicalIF":4.9,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146078445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An interpretable machine learning framework with data-informed imaging biomarkers for diagnosis and prediction of Alzheimer's disease
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-02-01 DOI: 10.1016/j.compmedimag.2026.102722
Wenjie Kang , Bo Li , Lize C. Jiskoot , Peter Paul De Deyn , Geert Jan Biessels , Huiberdina L. Koek , Jurgen A.H.R. Claassen , Huub A.M. Middelkoop , Wiesje M. van der Flier , Willemijn J. Jansen , Stefan Klein , Esther E. Bron , Alzheimer’s Disease Neuroimaging Initiative , on behalf of the Parelsnoer Neurodegenerative Diseases study group
Machine learning methods based on imaging and other clinical data have shown great potential for improving the early and accurate diagnosis of Alzheimer's disease (AD). However, for most deep learning models, especially those involving high-dimensional imaging data, the decision-making process remains largely opaque, which limits clinical applicability. Explainable Boosting Machines (EBMs) are inherently interpretable machine learning models, but are typically applied to low-dimensional data. In this study, we propose an interpretable machine learning framework that integrates data-driven feature extraction based on Convolutional Neural Networks (CNNs) with the intrinsic transparency of EBMs for AD diagnosis and prediction. The framework enables interpretation at both the group level and the individual level by identifying the imaging biomarkers contributing to predictions. We validated the framework on the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort, achieving an area under the curve (AUC) of 0.969 for AD vs. control classification and 0.750 for MCI conversion prediction. External validation was performed on an independent cohort, yielding AUCs of 0.871 for AD vs. subjective cognitive decline (SCD) classification and 0.666 for MCI conversion prediction. The proposed framework achieves performance comparable to state-of-the-art black-box models while offering transparent decision-making, a critical requirement for clinical translation. Our code is available at: https://gitlab.com/radiology/neuro/interpretable_ad_classification.
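A minimal sketch of the framework's interpretable second stage, assuming CNN-derived imaging biomarkers have already been extracted into a tabular matrix; it uses the open-source interpret package rather than the authors' released code (linked above), and the data here are random stand-ins.

```python
# Minimal sketch: an EBM fit on stand-in tabular features, not the authors'
# pipeline. Feature counts and labels below are arbitrary placeholders.
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                    # stand-in for 16 imaging biomarkers
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)     # stand-in AD/control labels

ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)

global_exp = ebm.explain_global()                 # group-level: per-feature importances
local_exp = ebm.explain_local(X[:1], y[:1])       # individual-level: additive contributions
```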
{"title":"An interpretable machine learning framework with data-informed imaging biomarkers for diagnosis and prediction of Alzheimer’s disease","authors":"Wenjie Kang ,&nbsp;Bo Li ,&nbsp;Lize C. Jiskoot ,&nbsp;Peter Paul De Deyn ,&nbsp;Geert Jan Biessels ,&nbsp;Huiberdina L. Koek ,&nbsp;Jurgen A.H.R. Claassen ,&nbsp;Huub A.M. Middelkoop ,&nbsp;Wiesje M. van der Flier ,&nbsp;Willemijn J. Jansen ,&nbsp;Stefan Klein ,&nbsp;Esther E. Bron ,&nbsp;Alzheimer’s Disease Neuroimaging Initiative ,&nbsp;on behalf of the Parelsnoer Neurodegenerative Diseases study group","doi":"10.1016/j.compmedimag.2026.102722","DOIUrl":"10.1016/j.compmedimag.2026.102722","url":null,"abstract":"<div><div>Machine learning methods based on imaging and other clinical data have shown great potential for improving the early and accurate diagnosis of Alzheimer’s disease (AD). However, for most deep learning models, especially those including high-dimensional imaging data, the decision-making process remains largely opaque which limits clinical applicability. Explainable Boosting Machines (EBMs) are inherently interpretable machine learning models, but are typically applied to low-dimensional data. In this study, we propose an interpretable machine learning framework that integrates data-driven feature extraction based on Convolutional Neural Networks (CNNs) with the intrinsic transparency of EBMs for AD diagnosis and prediction. The framework enables interpretation at both the group-level and individual-level by identifying imaging biomarkers contributing to predictions. We validated the framework on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort, achieving an area-under-the-curve (AUC) of 0.969 for AD vs. control classification and 0.750 for MCI conversion prediction. External validation was performed on an independent cohort, yielding AUCs of 0.871 for AD vs. subjective cognitive decline (SCD) classification and 0.666 for MCI conversion prediction. The proposed framework achieves performance comparable to state-of-the-art black-box models while offering transparent decision-making, a critical requirement for clinical translation. Our code is available at: <span><span>https://gitlab.com/radiology/neuro/interpretable_ad_classification</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102722"},"PeriodicalIF":4.9,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146144513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
HF-VLP: A multimodal vision-language pre-trained model for diagnosing heart failure
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-02-01 DOI: 10.1016/j.compmedimag.2026.102719
Huiting Ma , Dengao Li , Guiji Zhao , Li Liu , Jian Fu , Xiaole Fan , Zhe Zhang , Yuchen Liang
A considerable increase in the incidence of heart failure (HF) has recently posed major challenges to the medical field, underscoring the urgent need for early detection and intervention. Medical vision-and-language pretraining learns general representations from medical images and texts and shows great promise for multimodal data-based diagnosis. During model pretraining, patient images may contain multiple symptoms simultaneously, and medical research faces a considerable challenge from noisy labels owing to factors such as differences among experts and machine-extracted labels. Furthermore, parameter-efficient fine-tuning (PEFT) is important for promoting model development. To address these challenges, we developed a multimodal vision-language pretrained model for HF, called HF-VLP. In particular, a label calibration loss is adopted to solve the labeling-noise problem of multisource pretraining data. During pretraining, the data labels are calibrated in real time using the correlation between the labels and prediction confidence. Considering the efficient transferability of the pretrained model, a PEFT method called decomposed singular value weight-decomposed low-rank adaptation is developed. It can learn the downstream data distribution quickly by fine-tuning <1% of the parameters, achieving a better diagnosis rate than zero-shot inference. Simultaneously, the developed model fuses chest X-ray image features and radiology report features through a dynamic fusion graph module, enhancing the interaction and expression ability of multimodal information. The validity of the model is verified on multiple medical datasets. The average AUC of multisymptom prediction on the Open-I dataset and the hospital dataset PPL-CXR reached 83.67% and 91.28%, respectively. The developed model can accurately classify patients' symptoms, thereby assisting doctors in diagnosis.
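The abstract does not give the calibration formula; the sketch below is one assumed reading, blending noisy labels toward model predictions only where the model is confident. The threshold and momentum are placeholders, not the paper's values.

```python
# Assumed reading of confidence-based label calibration for noisy multi-label
# data; the paper's exact calibration rule is not reproduced here.
import torch

def calibrate_labels(labels, pred_probs, conf_thresh=0.95, momentum=0.8):
    """Blend noisy float 0/1 labels toward confident model predictions.

    labels:     (B, K) multi-label annotations (float), possibly noisy
    pred_probs: (B, K) sigmoid confidences from the current model
    """
    confident = (pred_probs > conf_thresh) | (pred_probs < 1 - conf_thresh)
    blended = momentum * labels + (1 - momentum) * pred_probs.round()
    return torch.where(confident, blended, labels)   # calibrate only trusted entries
```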
{"title":"HF-VLP: A multimodal vision-language pre-trained model for diagnosing heart failure","authors":"Huiting Ma ,&nbsp;Dengao Li ,&nbsp;Guiji Zhao ,&nbsp;Li Liu ,&nbsp;Jian Fu ,&nbsp;Xiaole Fan ,&nbsp;Zhe Zhang ,&nbsp;Yuchen Liang","doi":"10.1016/j.compmedimag.2026.102719","DOIUrl":"10.1016/j.compmedimag.2026.102719","url":null,"abstract":"<div><div>A considerable increase in the incidence of heart failure (HF) has recently posed major challenges to the medical field, underscoring the urgent need for early detection and intervention. Medical vision-and-language pretraining learns general representation from medical images and texts and shows great prospects in multimodal data-based diagnosis. During model pretraining, patient images may contain multiple symptoms simultaneously and medical research faces a considerable challenge from noisy labels owing to factors such as differences among experts and machine-extracted labels. Furthermore, parameter-efficient fine-tuning (PEFT) is important for promoting model development. To address these challenges, we developed a multimodal vision–language pretrained model for HF, called HF-VLP. In particular, label calibration loss is adopted to solve the labeling noise problem of multisource pretraining data. During pretraining, the data labels are calibrated in real time using the correlation between the labels and prediction confidence. Considering the efficient migratability of the pretrained model, a PEFT method called decomposed singular value weight-decomposed low-rank adaptation is developed. It can learn the downstream data distribution quickly by fine-tuning &lt;1% of the parameters to obtain a better diagnosis rate than zero-shot. Simultaneously, the developed model fuses chest X-ray image features and radiology report features through the dynamic fusion graph module, enhancing the interaction and expression ability of multimodal information. The validity of the model is verified on multiple medical datasets. The average AUC of multisymptom prediction in the Open-I dataset and the hospital dataset PPL-CXR reached 83.67% and 91.28%, respectively. The developed model can accurately classify the symptoms of patients, thereby assisting doctors in diagnosis.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102719"},"PeriodicalIF":4.9,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146144561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Fuzzy rough set loss for deep learning-based precise medical image segmentation
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-23 DOI: 10.1016/j.compmedimag.2026.102716
Mohsin Furkh Dar , Avatharam Ganivada
Accurate segmentation of medical images is crucial for diagnosis and treatment planning, yet it remains challenging due to ambiguous lesion boundaries, class imbalance, and complex anatomical structures. We propose a novel Fuzzy Rough Set-inspired (FRS) loss function that addresses these challenges by integrating pixels' fuzzy similarity relations with a boundary uncertainty model via a convex combination. The boundary uncertainty model is obtained from the fuzzy lower and upper approximations of a set of pixels together with membership weights. The FRS loss function enhances boundary sensitivity and handles prediction uncertainty through its dual components: a fuzzy similarity term that captures gradual transitions at lesion boundaries, and a boundary uncertainty model that deals with uncertainty and mitigates class imbalance. Extensive experiments across five diverse medical imaging datasets — breast ultrasound, gastrointestinal polyps, brain Magnetic Resonance Imaging (MRI), chest Computed Tomography (CT), and skin lesions — demonstrate the effectiveness of our approach. The FRS loss achieves superior segmentation performance with an average improvement of 2.1% in Dice score over the best baseline method, while demonstrating statistically significant improvements across all evaluated metrics (p < 0.001). The FRS loss is robust to moderate class imbalance while maintaining computational efficiency (mean inference time 0.075–0.12 s per image, 4.5 MB memory). These results suggest that the FRS loss function provides a robust and interpretable framework for precise medical image segmentation, particularly in cases with ambiguous boundaries and moderate imbalance. Code: https://github.com/MohsinFurkh/Fuzzy-Rough-Set-Loss.
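As a schematic reading of the stated loss structure (a convex combination of a fuzzy-similarity term and a boundary-uncertainty term), the sketch below uses simplified placeholder definitions for both components, not the paper's exact fuzzy rough set constructions; see the linked repository for the real loss.

```python
# Schematic only: both component terms are simplified placeholders standing
# in for the paper's fuzzy similarity and lower/upper approximation terms.
import torch
import torch.nn.functional as F

def frs_style_loss(pred, target, alpha=0.6, eps=1e-6):
    """pred, target: (B, 1, H, W) probabilities / float binary masks."""
    # Fuzzy similarity stand-in: soft Dice-like overlap tolerating soft boundaries.
    inter = (pred * target).sum(dim=(2, 3))
    fuzzy_sim = (2 * inter + eps) / (pred.sum(dim=(2, 3)) + target.sum(dim=(2, 3)) + eps)
    # Boundary uncertainty stand-in: weight pixels by closeness to p = 0.5.
    uncertainty = 1.0 - 2 * (pred - 0.5).abs()          # 1 at p=0.5, 0 at p in {0,1}
    bce = F.binary_cross_entropy(pred, target, reduction="none")
    boundary_term = (uncertainty * bce).mean()
    # Convex combination of the two terms, as the abstract describes.
    return alpha * (1 - fuzzy_sim.mean()) + (1 - alpha) * boundary_term
```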
{"title":"Fuzzy rough set loss for deep learning-based precise medical image segmentation","authors":"Mohsin Furkh Dar ,&nbsp;Avatharam Ganivada","doi":"10.1016/j.compmedimag.2026.102716","DOIUrl":"10.1016/j.compmedimag.2026.102716","url":null,"abstract":"<div><div>Accurate segmentation of medical images is crucial for diagnosis and treatment planning, yet it remains challenging due to ambiguous lesion boundaries, class imbalance, and complex anatomical structures. We propose a novel Fuzzy Rough Set-inspired (FRS) loss function that addresses these challenges by integrating pixels’ fuzzy similarity relations with a boundary uncertainty model in a convex combination method. To obtain the boundary uncertainty model, the fuzzy lower and upper approximations of a set of pixels and membership weights are utilized. The FRS loss function enhances boundary sensitivity and handles prediction uncertainty through its dual components: a fuzzy similarity term that captures gradual transitions at lesion boundaries, and boundary uncertainty model that deals with uncertainty and mitigates class imbalance. Extensive experiments across five diverse medical imaging datasets — breast ultrasound, gastrointestinal polyps, brain Magnetic Resonance Imaging (MRI), chest Computed Tomography (CT), and skin lesions — demonstrate the effectiveness of our approach. The FRS loss achieves superior segmentation performance with an average improvement of 2.1% in Dice score compared to the best baseline method, while demonstrating statistically significant improvements across all evaluated metrics (p <span><math><mo>&lt;</mo></math></span> 0.001). The FRS loss shows its robustness to moderate class imbalance while maintaining computational efficiency (mean inference time 0.075–0.12 s per image, 4.5 MB memory). These results suggest that the FRS loss function provides a robust and interpretable framework for precise medical image segmentation, particularly in cases with ambiguous boundaries and moderate imbalance. Code: <span><span>https://github.com/MohsinFurkh/Fuzzy-Rough-Set-Loss</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102716"},"PeriodicalIF":4.9,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146023340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Differentiable Neural Architecture Search for medical image segmentation: A systematic review and field audit
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-20 DOI: 10.1016/j.compmedimag.2026.102713
Emil Benedykciuk, Marcin Denkowski, Grzegorz M. Wójcik
Medical image segmentation is critical for diagnosis, treatment planning, and disease monitoring, yet differs from generic semantic segmentation due to volumetric data, modality-specific artifacts, costly and uncertain expert annotations, and domain shift across scanners and institutions. Neural Architecture Search (NAS) can automate model design, but many NAS paradigms become impractical for 3D segmentation because evaluating large numbers of candidate architectures is computationally prohibitive. Differentiable NAS (DNAS) alleviates this barrier by optimizing relaxed architectural choices with gradients in a weight-sharing supernet, making search feasible under realistic compute and memory budgets. However, DNAS introduces distinct methodological risks (e.g., optimization instability and the discretization gap) and raises challenges in reproducibility and clinical deployability. We conduct a PRISMA-inspired systematic review of DNAS for medical image segmentation (multi-database screening, 2018-2025), retaining 33 papers representing 31 unique methods for quantitative analysis. Across the included studies, external validation on independent-site data is rare (~10%), full code release (including search procedures) is limited (~26%), and only a minority substantively addresses search stability (~23%). Despite clear clinical relevance, multi-objective search that explicitly optimizes latency or memory is also uncommon (~23%). We position DNAS within the broader NAS landscape, introduce a segmentation-focused taxonomy, and propose a NAS Reporting Card tailored to medical segmentation to improve transparency, comparability, and reproducibility.
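For readers new to the paradigm under review, below is a minimal DARTS-style mixed operation: each candidate op is weighted by a softmax over learnable architecture parameters, making the architectural choice differentiable. The 3-D candidate set and sizes are illustrative assumptions, not any specific surveyed method.

```python
# Minimal DARTS-style mixed operation illustrating the differentiable
# relaxation the review surveys; the candidate op set is arbitrary.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv3d(channels, channels, 3, padding=1),   # 3x3x3 volumetric conv
            nn.Conv3d(channels, channels, 5, padding=2),   # 5x5x5 volumetric conv
            nn.Identity(),                                 # skip connection
        ])
        # One architecture parameter per candidate, optimized by gradient
        # alongside (or alternating with) the shared supernet weights.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)             # relaxed, differentiable choice
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# After search, only the op with the largest alpha is kept: the hardening
# step where the "discretization gap" discussed above can appear.
```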
{"title":"Differentiable Neural Architecture Search for medical image segmentation: A systematic review and field audit","authors":"Emil Benedykciuk,&nbsp;Marcin Denkowski,&nbsp;Grzegorz M. Wójcik","doi":"10.1016/j.compmedimag.2026.102713","DOIUrl":"10.1016/j.compmedimag.2026.102713","url":null,"abstract":"<div><div>Medical image segmentation is critical for diagnosis, treatment planning, and disease monitoring, yet differs from generic semantic segmentation due to volumetric data, modality-specific artifacts, costly and uncertain expert annotations, and domain shift across scanners and institutions. Neural Architecture Search (NAS) can automate model design, but many NAS paradigms become impractical for 3D segmentation because evaluating large numbers of candidate architectures is computationally prohibitive. Differentiable NAS (DNAS) alleviates this barrier by optimizing relaxed architectural choices with gradients in a weight-sharing supernet, making search feasible under realistic compute and memory budgets. However, DNAS introduces distinct methodological risks (e.g., optimization instability and discretization gap) and raises challenges in reproducibility and clinical deployability. We conduct a PRISMA-inspired systematic review of DNAS for medical image segmentation (multi-database screening, 2018-2025), retaining 33 papers representing 31 unique methods for quantitative analysis. Across the included studies, external validation on independent-site data is rare (<span><math><mo>∼</mo></math></span>10%), full code release (including search procedures) is limited (<span><math><mo>∼</mo></math></span>26%), and only a minority substantively addresses search stability (<span><math><mo>∼</mo></math></span>23%). Despite clear clinical relevance, multi-objective search that explicitly optimizes latency or memory is also uncommon (<span><math><mo>∼</mo></math></span>23%). We position DNAS within the broader NAS landscape, introduce a segmentation-focused taxonomy, and propose a NAS Reporting Card tailored to medical segmentation to improve transparency, comparability, and reproducibility.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102713"},"PeriodicalIF":4.9,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146023421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
AutoPromptSeg: Automated Decoupling of Uncertainty Prompts with SAM for semi-supervised medical image segmentation
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-17 DOI: 10.1016/j.compmedimag.2026.102708
Junan Zhu, Zhizhe Tang, Ping Ma, Zheng Liang, Chuanjian Wang
The scarcity of high-quality annotated data limits the application of supervised learning in disease diagnosis. Semi-supervised learning (SSL) offers a promising solution to this challenge, utilizing both limited labeled data and large-scale unlabeled data to significantly boost segmentation accuracy. While existing SSL methods focus on model-centric regularization strategies, the emergence of promptable foundation models like Segment Anything (SAM) presents new opportunities for paradigmatic advancement. However, SAM requires datasets with additional prompt annotations to guide the segmentation process, yet most existing medical imaging datasets do not contain them. To address this limitation, we introduce a novel semi-supervised 3D medical image segmentation method called AutoPromptSeg, which generates effective prompts with the Decoupled Uncertainty Prompt Generator (DUPG) while maintaining superior segmentation performance in data-scarce scenarios. Concurrently, we employ the Channel Alignment and Fusion Architecture (CAFA) to align features obtained from different branches, thereby bolstering the representational capacity of unlabeled data. Our proposed approach achieves state-of-the-art performance on three benchmarks: the multi-modality abdominal multi-organ segmentation challenge 2022 dataset (Amos 2022), the left atrium dataset (LA), and the brain tumor segmentation challenge 2020 dataset (BraTS 2020). AutoPromptSeg achieves Dice scores of 68.78% on Amos 2022, 90.02% on LA, and 86.63% on BraTS 2020 with only 10% labeled data, demonstrating the excellent performance of our semi-supervised learning framework with limited annotated data.
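As one assumed illustration of turning model uncertainty into SAM-style point prompts, the sketch below picks confident, low-entropy foreground pixels as positive points; DUPG's actual decoupling mechanism is not reproduced, and all names here are hypothetical.

```python
# Assumed illustration only: select confident foreground pixels of a
# probability map as positive point prompts, avoiding uncertain regions.
import torch

def uncertainty_point_prompts(prob_map, num_points=3):
    """prob_map: (H, W) foreground probabilities. Returns (num_points, 2)
    prompt coordinates in (x, y) order, SAM-style."""
    entropy = -(prob_map * torch.log(prob_map + 1e-6)
                + (1 - prob_map) * torch.log(1 - prob_map + 1e-6))
    # Score pixels by foreground probability, discounted by uncertainty.
    score = prob_map * (1.0 - entropy / entropy.max().clamp(min=1e-6))
    flat_idx = torch.topk(score.flatten(), k=num_points).indices
    w = prob_map.shape[1]
    ys = torch.div(flat_idx, w, rounding_mode="floor")
    xs = flat_idx % w
    return torch.stack([xs, ys], dim=1)
```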
{"title":"AutoPromptSeg: Automated Decoupling of Uncertainty Prompts with SAM for semi-supervised medical image segmentation","authors":"Junan Zhu,&nbsp;Zhizhe Tang,&nbsp;Ping Ma,&nbsp;Zheng Liang,&nbsp;Chuanjian Wang","doi":"10.1016/j.compmedimag.2026.102708","DOIUrl":"10.1016/j.compmedimag.2026.102708","url":null,"abstract":"<div><div>The scarcity of high-quality annotated data limits the application of supervised learning in disease diagnosis. Semi-supervised learning (SSL) offers a promising solution to this challenge, utilizing both limited labeled data and large-scale unlabeled data to significantly boost segmentation accuracy. While existing SSL methods focus on model-centric regularization strategies, the emergence of promptable foundation models like Segment Anything (SAM) presents new opportunities for paradigmatic advancement. However, SAM requires datasets with additional prompt annotations to guide the segmentation process, yet most existing medical imaging datasets do not contain them. To address this limitation, we introduce a novel semi-supervised 3D medical image segmentation method called AutoPromptSeg, which generates effective prompts with the Decoupled Uncertainty Prompt Generator (DUPG) while maintaining superior segmentation performance in data-scarce scenarios. Concurrently, we employ the Channel Alignment and Fusion Architecture (CAFA) to align features obtained from different branches, thereby bolstering the representational capacity of unlabeled data. Our proposed approach achieves state-of-the-art performance on three benchmarks: the multi-modality abdominal multi-organ segmentation challenge 2022 dataset (Amos 2022), the left atrium dataset (LA), and the brain tumor segmentation challenge 2020 dataset (BraTS 2020). AutoPromptSeg achieves Dice Score of 68.78% on Amos 2022, 90.02% on LA, and 86.63% on BraTS 2020 under only 10% labeled data setting, demonstrating the excellent performance of our semi-supervised learning framework in limited annotated data.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102708"},"PeriodicalIF":4.9,"publicationDate":"2026-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146023418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DFMFI: ultrasound breast cancer detection method based on dynamic fusion multi-scale feature interaction model
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-16 DOI: 10.1016/j.compmedimag.2026.102710
Chenbin Ma , Haonan Zhang , Lishuang Guo
Ultrasound imaging has become an important method for breast cancer screening due to its non-invasive, low-cost, and ionizing-radiation-free characteristics. However, the complexity and uncertainty of ultrasound images (such as speckle noise, morphological diversity of lesion areas, and inter-class similarity) pose challenges to traditional computer-aided diagnosis systems. To address these issues, this paper proposes a Dynamic Fusion Multi-Scale Feature Interaction Model (DFMFI), specifically designed for benign versus malignant breast cancer detection in ultrasound imaging. DFMFI enhances the model's ability to represent complex lesion features by combining dynamic feature fusion, multi-scale feature aggregation, and nonlinear dynamic interaction mechanisms. The model includes three core modules: the dynamic feature mixer uses overlapped spatial-reduction attention and dynamic depth convolution to efficiently integrate global and local information; the efficient multi-scale feature aggregator captures multi-scale lesion features through a multi-branch structure; and the dynamic gated feed-forward network enhances the adaptability of the feature flow through gating mechanisms and nonlinear reconstruction. Experimental results show that DFMFI significantly outperforms existing methods in classification accuracy, robustness, and computational efficiency, providing an efficient and robust solution for the early screening and diagnosis of breast cancer.
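A hedged sketch in the spirit of the dynamic gated feed-forward module described above; the layer sizes and the specific gating form are assumptions, not the paper's architecture.

```python
# Sketch only: a generic gated feed-forward block; DFMFI's exact gating and
# nonlinear reconstruction are not specified in the abstract.
import torch
import torch.nn as nn

class GatedFeedForward(nn.Module):
    def __init__(self, dim, expansion=4):
        super().__init__()
        hidden = dim * expansion
        self.value = nn.Linear(dim, hidden)
        self.gate = nn.Linear(dim, hidden)   # learns which features to pass
        self.out = nn.Linear(hidden, dim)
        self.act = nn.GELU()

    def forward(self, x):                    # x: (B, N, dim) token features
        # Gate branch modulates the value branch elementwise, then project back.
        return self.out(self.act(self.gate(x)) * self.value(x))
```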
{"title":"DFMFI: ultrasound breast cancer detection method based on dynamic fusion multi-scale feature interaction model","authors":"Chenbin Ma ,&nbsp;Haonan Zhang ,&nbsp;Lishuang Guo","doi":"10.1016/j.compmedimag.2026.102710","DOIUrl":"10.1016/j.compmedimag.2026.102710","url":null,"abstract":"<div><div>Ultrasound imaging has become an important method of breast cancer screening due to its non-invasive, low-cost, and ionizing radiation-free characteristics. However, the complexity and uncertainty of ultrasound images (such as speckle noise, morphological diversity of lesion areas, and inter-class similarity) pose challenges to traditional computer-aided diagnosis systems. In response to these issues, this paper proposes a Dynamic Fusion Multi-Scale Feature Interaction Model (DFMFI), specifically designed for the task of benign and malignant breast cancer detection in ultrasound imaging. DFMFI enhances the model's ability to model complex lesion features by combining dynamic feature fusion, multi-scale feature aggregation, and nonlinear dynamic interaction mechanisms. The model includes three core modules: the dynamic feature mixer uses overlapped spatial reduction attention and dynamic depth convolution to efficiently integrate global and local information; the efficient multi-scale feature aggregator captures multi-scale lesion features through a multi-branch structure; the dynamic gated feed forward network enhances the adaptability of feature flow through gating mechanisms and nonlinear reconstruction. Experimental results show that DFMFI significantly outperforms existing methods in terms of classification accuracy, robustness, and computational efficiency, providing an efficient and robust solution for the early screening and diagnosis of breast cancer.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102710"},"PeriodicalIF":4.9,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146023419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Spectral attribute reasoning for interpretable multi-modal pathological segmentation
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-14 DOI: 10.1016/j.compmedimag.2026.102707
Lixin Zhang , Qian Wang , Zhao Chen , Ying Chen
Accurate segmentation of diverse histological entities is fundamental in computational pathology and critical for clinical diagnosis. Advances in microscopic imaging provide complementary information: microscopic hyperspectral images (MHSIs) capture pathological differences through distinct spectral signatures, while RGB images offer high-resolution spatial and texture details. However, most multi-modal methods emphasize representation learning and modality alignment, offering limited insight into how the modalities interact to inform segmentation. This lack of explicit reasoning limits interpretability, and existing approaches, largely based on text prompts or spatial patterns, fail to exploit the pathology-relevant spectral signatures in MHSIs. To address these gaps, we propose Pisa-Net, a Pathology-Interpretable Spectral Attribute Learning Network for MHSI–RGB segmentation. Pisa-Net performs interpretable spectral reasoning through knowledge-driven attribute learning, incorporating pathology knowledge via pathologist-selected spectral signatures from key histological entities. These spectral attributes and the MHSI inputs are encoded through a frequency-domain representation into attribute embeddings and MHSI representations, whose similarities provide explicit pathology-grounded spectral evidence. The frequency components are further decomposed into low-, mid-, and high-frequency ranges and adaptively re-weighted via learned phase and magnitude, enabling the model to capture global semantics, structural patterns, and fine discriminative details. Guided by this spectral evidence, Pisa-Net integrates RGB and MHSI features through sparse spatial compression, ensuring that multi-modal fusion remains consistent with the underlying pathological reasoning. Experiments on public multi-modal pathology datasets demonstrate that Pisa-Net achieves superior segmentation performance on cells, glands, and tumors while improving interpretability by explicitly linking predictions to spectral evidence aligned with pathology knowledge.
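A minimal sketch of the frequency-band split described above, assuming a radial partition of the 2-D FFT spectrum into low, mid, and high bands; the cutoffs and the downstream learned re-weighting are placeholders, not Pisa-Net's actual decomposition.

```python
# Assumed sketch: decompose a feature map into low/mid/high radial frequency
# bands via the 2-D FFT; cutoff values are arbitrary placeholders.
import torch

def frequency_bands(x, cut1=0.15, cut2=0.45):
    """x: (B, C, H, W). Returns [low, mid, high] spatial-domain components."""
    B, C, H, W = x.shape
    spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    fy = torch.linspace(-0.5, 0.5, H).view(H, 1)
    fx = torch.linspace(-0.5, 0.5, W).view(1, W)
    radius = torch.sqrt(fy ** 2 + fx ** 2)        # normalized radial frequency
    masks = [radius <= cut1,
             (radius > cut1) & (radius <= cut2),
             radius > cut2]
    bands = []
    for m in masks:
        masked = spec * m.float()                 # keep one radial band
        band = torch.fft.ifft2(torch.fft.ifftshift(masked, dim=(-2, -1))).real
        bands.append(band)
    return bands   # each (B, C, H, W); re-weight with learned scalars downstream
```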
{"title":"Spectral attribute reasoning for interpretable multi-modal pathological segmentation","authors":"Lixin Zhang ,&nbsp;Qian Wang ,&nbsp;Zhao Chen ,&nbsp;Ying Chen","doi":"10.1016/j.compmedimag.2026.102707","DOIUrl":"10.1016/j.compmedimag.2026.102707","url":null,"abstract":"<div><div>Accurate segmentation of diverse histological entities is fundamental in computational pathology and critical for clinical diagnosis. Advances in microscopic imaging provide complementary information, particularly microscopic hyperspectral images (MHSIs) capture pathological differences through distinct spectral signatures, while RGB images offer high-resolution spatial and texture details. However, most multi-modal methods emphasize representation learning and modality alignment, but they offer limited insight into how the modalities interact to inform segmentation. This lack of explicit reasoning limits interpretability, and existing approaches, largely based on text prompts or spatial patterns, fail to exploit the pathology-relevant spectral signatures in MHSIs. To address these gaps, we propose Pisa-Net, a <strong>P</strong>athology-<strong>I</strong>nterpretable <strong>S</strong>pectral <strong>A</strong>ttribute Learning <strong>Net</strong>work for MHSI–RGB segmentation. Pisa-Net performs interpretable spectral reasoning through knowledge-driven attribute learning, incorporating pathology knowledge via pathologist-selected spectral signatures from key histological entities. These spectral attributes and the MHSI inputs are encoded through a frequency-domain representation into attribute embeddings and MHSI representations, whose similarities provide explicit pathology-grounded spectral evidence. The frequency components are further decomposed into low-, mid-, and high-frequency ranges and adaptively re-weighted via learned phase and magnitude, enabling the model to capture global semantics, structural patterns, and fine discriminative details. Guided by this spectral evidence, Pisa-Net integrates RGB and MHSI features through sparse spatial compression, ensuring that multi-modal fusion remains consistent with the underlying pathological reasoning. Experiments on public multi-modal pathology datasets demonstrate that Pisa-Net achieves superior segmentation performance in cells, glands, and tumors while improving interpretability by explicitly linking predictions to spectral evidence aligned with pathology knowledge.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"128 ","pages":"Article 102707"},"PeriodicalIF":4.9,"publicationDate":"2026-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0