
Computerized Medical Imaging and Graphics: Latest Publications

Spectral attribute reasoning for interpretable multi-modal pathological segmentation
IF 4.9 | CAS Zone 2 (Medicine) | Q1 (Engineering, Biomedical) | Pub Date: 2026-02-01 | Epub Date: 2026-01-14 | DOI: 10.1016/j.compmedimag.2026.102707
Lixin Zhang , Qian Wang , Zhao Chen , Ying Chen
Accurate segmentation of diverse histological entities is fundamental in computational pathology and critical for clinical diagnosis. Advances in microscopic imaging provide complementary information: microscopic hyperspectral images (MHSIs) capture pathological differences through distinct spectral signatures, while RGB images offer high-resolution spatial and texture details. However, most multi-modal methods emphasize representation learning and modality alignment while offering limited insight into how the modalities interact to inform segmentation. This lack of explicit reasoning limits interpretability, and existing approaches, largely based on text prompts or spatial patterns, fail to exploit the pathology-relevant spectral signatures in MHSIs. To address these gaps, we propose Pisa-Net, a Pathology-Interpretable Spectral Attribute Learning Network for MHSI–RGB segmentation. Pisa-Net performs interpretable spectral reasoning through knowledge-driven attribute learning, incorporating pathology knowledge via pathologist-selected spectral signatures from key histological entities. These spectral attributes and the MHSI inputs are encoded through a frequency-domain representation into attribute embeddings and MHSI representations, whose similarities provide explicit pathology-grounded spectral evidence. The frequency components are further decomposed into low-, mid-, and high-frequency ranges and adaptively re-weighted via learned phase and magnitude, enabling the model to capture global semantics, structural patterns, and fine discriminative details. Guided by this spectral evidence, Pisa-Net integrates RGB and MHSI features through sparse spatial compression, ensuring that multi-modal fusion remains consistent with the underlying pathological reasoning. Experiments on public multi-modal pathology datasets demonstrate that Pisa-Net achieves superior segmentation performance on cells, glands, and tumors while improving interpretability by explicitly linking predictions to spectral evidence aligned with pathology knowledge.
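
The band-wise re-weighting step reads naturally as a small PyTorch module: split the feature spectrum into low-, mid-, and high-frequency radial bands, then scale each band with a learned magnitude gain and phase offset. The cutoffs, parameter shapes, and module name below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class BandReweight(nn.Module):
    """Split a feature map's spectrum into low/mid/high radial bands and
    re-weight each band with a learned magnitude gain and phase offset."""

    def __init__(self, cutoffs=(0.15, 0.45)):
        super().__init__()
        self.cutoffs = cutoffs
        self.mag = nn.Parameter(torch.ones(3))     # per-band magnitude gain
        self.phase = nn.Parameter(torch.zeros(3))  # per-band phase offset

    def forward(self, x):                          # x: (B, C, H, W), real-valued
        spec = torch.fft.fft2(x, norm="ortho")
        H, W = x.shape[-2:]
        fy = torch.fft.fftfreq(H, device=x.device).view(-1, 1)
        fx = torch.fft.fftfreq(W, device=x.device).view(1, -1)
        r = torch.sqrt(fy ** 2 + fx ** 2)          # radial frequency, roughly in [0, 0.71]
        masks = [r < self.cutoffs[0],
                 (r >= self.cutoffs[0]) & (r < self.cutoffs[1]),
                 r >= self.cutoffs[1]]
        out = torch.zeros_like(spec)
        for k, m in enumerate(masks):
            gain = self.mag[k] * torch.exp(1j * self.phase[k])  # complex band weight
            out = out + spec * m.to(x.dtype) * gain
        return torch.fft.ifft2(out, norm="ortho").real
```
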
Citations: 0
TGIAlign: Text-guided dual-branch bidirectional framework for cross-modal semantic alignment in medical vision-language
IF 4.9 | CAS Zone 2 (Medicine) | Q1 (Engineering, Biomedical) | Pub Date: 2026-02-01 | Epub Date: 2026-01-13 | DOI: 10.1016/j.compmedimag.2025.102694
Wenhua Li, Lifang Wang, Min Zhao, Xingzhang Lü, Linwen Yi
Medical image–text alignment remains challenging due to subtle lesion patterns, heterogeneous vision–language semantics, and the lack of lesion-aware guidance during visual encoding. Existing methods typically introduce textual information only after visual features have been computed, leaving early and mid-level representations insufficiently conditioned on diagnostic semantics. This limits the model’s ability to capture fine-grained abnormalities and maintain stable alignment across heterogeneous chest X-ray datasets. To address these limitations, we propose TGIAlign, a text-guided dual-branch bidirectional alignment framework that applies structured, lesion-centric cues to intermediate visual representations obtained from the frozen encoder. A large language model (LLM) is used to extract normalized, attribute-based lesion descriptions, providing consistent semantic guidance across samples. These cues are incorporated through the Text-Guided Image Feature Weighting (TGIF) module, which reweights intermediate feature outputs using similarity-derived weights, enabling multi-scale semantic conditioning without modifying the frozen backbone. To capture complementary visual cues, TGIAlign integrates multi-scale text-guided features with high-level visual representations through a Dual-Branch Bidirectional Alignment (DBBA) mechanism. Experiments on six public chest X-ray datasets demonstrate that TGIAlign achieves stable top-K retrieval and reliable text-guided lesion localization, highlighting the effectiveness of early semantic conditioning combined with dual-branch alignment for improving medical vision–language correspondence within chest X-ray settings.
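
A minimal sketch of the similarity-derived reweighting that the TGIF module performs, under the assumption that each spatial feature vector is compared against a normalized text embedding of the lesion description; the function name and sigmoid gating are hypothetical choices, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def text_guided_weighting(feats, text_emb):
    """Reweight spatial feature vectors by their cosine similarity to a
    lesion-description embedding (a hypothetical stand-in for TGIF).

    feats:    (B, C, H, W) intermediate visual features
    text_emb: (B, C) text embedding of the lesion description
    """
    B, C, H, W = feats.shape
    v = F.normalize(feats.flatten(2), dim=1)       # (B, C, H*W), unit channel vectors
    t = F.normalize(text_emb, dim=1).unsqueeze(1)  # (B, 1, C)
    sim = torch.bmm(t, v).view(B, 1, H, W)         # cosine similarity map
    w = torch.sigmoid(sim)                         # similarity-derived weights
    return feats * w                               # semantically conditioned features
```
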
Citations: 0
DFMFI: ultrasound breast cancer detection method based on dynamic fusion multi-scale feature interaction model
IF 4.9 | CAS Zone 2 (Medicine) | Q1 (Engineering, Biomedical) | Pub Date: 2026-02-01 | Epub Date: 2026-01-16 | DOI: 10.1016/j.compmedimag.2026.102710
Chenbin Ma , Haonan Zhang , Lishuang Guo
Ultrasound imaging has become an important breast cancer screening modality because it is non-invasive, low-cost, and free of ionizing radiation. However, the complexity and uncertainty of ultrasound images (such as speckle noise, morphological diversity of lesion areas, and inter-class similarity) pose challenges to traditional computer-aided diagnosis systems. To address these issues, this paper proposes a Dynamic Fusion Multi-Scale Feature Interaction Model (DFMFI), designed specifically for benign versus malignant breast cancer detection in ultrasound imaging. DFMFI strengthens the model's ability to represent complex lesion features by combining dynamic feature fusion, multi-scale feature aggregation, and nonlinear dynamic interaction mechanisms. The model comprises three core modules: a dynamic feature mixer that uses overlapped spatial reduction attention and dynamic depth convolution to efficiently integrate global and local information; an efficient multi-scale feature aggregator that captures multi-scale lesion features through a multi-branch structure; and a dynamic gated feed-forward network that enhances the adaptability of feature flow through gating mechanisms and nonlinear reconstruction. Experimental results show that DFMFI significantly outperforms existing methods in classification accuracy, robustness, and computational efficiency, providing an efficient and robust solution for early breast cancer screening and diagnosis.
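
Of the three modules, the dynamic gated feed-forward network is the easiest to picture: one branch produces content while a parallel branch produces a data-dependent gate. The sketch below is a generic gated FFN under assumed token-shaped inputs, not the paper's exact design.

```python
import torch
import torch.nn as nn

class DynamicGatedFFN(nn.Module):
    """Gated feed-forward block: a content branch modulated elementwise by a
    data-dependent gate, so feature flow adapts to the input."""

    def __init__(self, dim, expansion=4):
        super().__init__()
        hidden = dim * expansion
        self.content = nn.Linear(dim, hidden)
        self.gate = nn.Linear(dim, hidden)
        self.out = nn.Linear(hidden, dim)
        self.act = nn.GELU()

    def forward(self, x):                  # x: (B, N, dim) token features
        h = self.act(self.content(x)) * torch.sigmoid(self.gate(x))
        return x + self.out(h)             # residual keeps the original signal
```
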
Citations: 0
Fuzzy rough set loss for deep learning-based precise medical image segmentation
IF 4.9 | CAS Zone 2 (Medicine) | Q1 (Engineering, Biomedical) | Pub Date: 2026-02-01 | Epub Date: 2026-01-23 | DOI: 10.1016/j.compmedimag.2026.102716
Mohsin Furkh Dar , Avatharam Ganivada
Accurate segmentation of medical images is crucial for diagnosis and treatment planning, yet it remains challenging due to ambiguous lesion boundaries, class imbalance, and complex anatomical structures. We propose a novel Fuzzy Rough Set-inspired (FRS) loss function that addresses these challenges by integrating pixels' fuzzy similarity relations with a boundary uncertainty model via a convex combination. The boundary uncertainty model is obtained from the fuzzy lower and upper approximations of a set of pixels together with membership weights. The FRS loss function enhances boundary sensitivity and handles prediction uncertainty through its dual components: a fuzzy similarity term that captures gradual transitions at lesion boundaries, and a boundary uncertainty term that handles uncertainty and mitigates class imbalance. Extensive experiments across five diverse medical imaging datasets (breast ultrasound, gastrointestinal polyps, brain Magnetic Resonance Imaging (MRI), chest Computed Tomography (CT), and skin lesions) demonstrate the effectiveness of our approach. The FRS loss achieves superior segmentation performance, with an average improvement of 2.1% in Dice score over the best baseline method, while demonstrating statistically significant improvements across all evaluated metrics (p < 0.001). The FRS loss is robust to moderate class imbalance while maintaining computational efficiency (mean inference time 0.075–0.12 s per image, 4.5 MB memory). These results suggest that the FRS loss function provides a robust and interpretable framework for precise medical image segmentation, particularly in cases with ambiguous boundaries and moderate imbalance. Code: https://github.com/MohsinFurkh/Fuzzy-Rough-Set-Loss.
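
The abstract describes the loss as a convex combination of a fuzzy similarity term and a boundary uncertainty term. The schematic below illustrates that structure with a min/max (t-norm/t-conorm) overlap term and a simple ambiguity-weighted boundary term; for the authors' actual formulation, consult the linked repository.

```python
import torch

def frs_style_loss(pred, target, alpha=0.7, eps=1e-6):
    """Schematic convex combination of a fuzzy-similarity term and a
    boundary-uncertainty term (illustrative, not the published loss).

    pred:   (B, 1, H, W) predicted probabilities
    target: (B, 1, H, W) soft or hard labels in [0, 1]
    """
    # Fuzzy similarity: min/max act as fuzzy intersection/union, giving a
    # soft Dice-style overlap that tolerates gradual boundary transitions.
    inter = torch.min(pred, target).sum(dim=(1, 2, 3))
    union = torch.max(pred, target).sum(dim=(1, 2, 3))
    fuzzy_sim = 1.0 - (inter + eps) / (union + eps)

    # Boundary uncertainty: weight errors by prediction ambiguity, which
    # peaks at pred = 0.5 where lower/upper approximations disagree.
    ambiguity = 4.0 * pred * (1.0 - pred)
    boundary = (ambiguity * (pred - target).abs()).mean(dim=(1, 2, 3))

    return (alpha * fuzzy_sim + (1.0 - alpha) * boundary).mean()
```
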
Citations: 0
GenoPath-MCA: Multimodal masked cross-attention between genomics and pathology for survival prediction
IF 4.9 | CAS Zone 2 (Medicine) | Q1 (Engineering, Biomedical) | Pub Date: 2026-02-01 | Epub Date: 2026-01-05 | DOI: 10.1016/j.compmedimag.2026.102699
Kaixuan Zhang , Shuqi Dong , Peifeng Shi , Dingcan Hu , Geng Gao , Jinlin Yang , Tao Gan , Nini Rao
Survival prediction using whole slide images (WSIs) and bulk genes is a key task in computational pathology, essential for automated risk assessment and personalized treatment planning. However, integrating WSIs with genomic features is challenging due to inconsistent modality granularity, semantic disparity, and the lack of personalized fusion. We propose GenoPath-MCA, a novel multimodal framework that models dense cross-modal interactions between histopathology and gene expression data. A masked co-attention mechanism aligns features across modalities, and the Multimodal Masked Cross-Attention Module (M2CAM) jointly captures high-order image–gene and gene–gene relationships for enhanced semantic fusion. To address patient-level heterogeneity, we develop a Dynamic Modality Weight Adjustment Strategy (DMWAS) that adaptively modulates fusion weights based on the discriminative relevance of each modality. Additionally, an importance-guided patch selection strategy effectively filters redundant visual inputs, reducing computational cost while preserving critical context. Experiments on public multimodal cancer survival datasets demonstrate that GenoPath-MCA significantly outperforms existing methods in terms of concordance index and robustness. Visualizations of multimodal attention maps validate the biological interpretability and clinical potential of our approach.
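
Masked cross-attention between gene tokens (queries) and WSI patch tokens (keys/values) reduces to a few lines; the sketch below is an illustrative simplification of the M2CAM idea, with the tensor shapes and masking semantics assumed rather than taken from the paper.

```python
import torch
import torch.nn.functional as F

def masked_cross_attention(genes, patches, mask, scale=None):
    """Cross-attention from gene tokens to image-patch tokens, with a mask
    that blocks selected gene-patch pairs (illustrative reduction of M2CAM).

    genes:   (B, G, D) gene-expression tokens (queries)
    patches: (B, P, D) WSI patch tokens (keys/values)
    mask:    (B, G, P) boolean; True = attend, False = block
             (assumes each gene token keeps at least one visible patch,
             otherwise a fully masked row would produce NaNs)
    """
    scale = scale or genes.shape[-1] ** -0.5
    attn = torch.einsum("bgd,bpd->bgp", genes, patches) * scale
    attn = attn.masked_fill(~mask, float("-inf"))
    attn = F.softmax(attn, dim=-1)
    return torch.einsum("bgp,bpd->bgd", attn, patches)  # fused gene tokens
```
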
Citations: 0
Semi-supervised medical image classification via feature-level multi-scale consistency and adversarial training
IF 4.9 | CAS Zone 2 (Medicine) | Q1 (Engineering, Biomedical) | Pub Date: 2026-02-01 | Epub Date: 2025-12-26 | DOI: 10.1016/j.compmedimag.2025.102695
Li Shiyan, Wang Shuqin, Gu Xin, Sun Debing
In recent years, semi-supervised learning (SSL) has attracted increasing attention in medical image analysis, showing great potential in scenarios with limited annotations. However, existing consistency regularization methods suffer from several limitations: overly uniform constraints at the output layer, lack of interaction within adversarial strategies, and reliance on external sample pools for sample estimation, which together lead to insufficient use of feature-level information and unstable training. To address these challenges, this paper proposes a novel semi-supervised framework, termed Feature-level multi-scale Consistency and Adversarial Training (FCAT). A multi-scale feature-level consistency mechanism is introduced to capture hierarchical structural representations through cross-level feature fusion, enabling robust feature alignment without relying on external sample pools. To overcome the limitation of unidirectional adversarial training, a bidirectional feature perturbation strategy is designed under a teacher–student collaboration scheme, where both models generate perturbations from their own gradients and enforce mutual consistency. In addition, an intrinsic evaluation mechanism based on entropy and complementary confidence is developed to rank unlabeled samples according to their information content, guiding the training process toward informative hard samples while reducing overfitting to trivial ones. Experiments on the balanced Pneumonia Chest X-ray and NCT-CRC-HE histopathology datasets, as well as the imbalanced ISIC 2019 dermoscopic skin lesion dataset, demonstrate that our FCAT achieves competitive performance and strong generalization across diverse imaging modalities and data distributions.
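
The entropy-plus-complementary-confidence ranking of unlabeled samples can be pictured as follows; the equal weighting of the two terms is an assumption, since the abstract does not specify how they are combined.

```python
import torch

def rank_unlabeled(probs):
    """Rank unlabeled samples by an informativeness score that combines
    predictive entropy with complementary confidence (1 - top-1 probability);
    a sketch of the idea, with the 1:1 weighting assumed.

    probs: (N, K) softmax outputs for N unlabeled samples.
    """
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    comp_conf = 1.0 - probs.max(dim=1).values     # low top-1 prob = harder sample
    score = entropy + comp_conf                   # higher = more informative
    return torch.argsort(score, descending=True)  # hard samples ranked first
```
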
Citations: 0
Unified model with random penalty entropy loss for robust nasogastric tube placement analysis in X-ray
IF 4.9 | CAS Zone 2 (Medicine) | Q1 (Engineering, Biomedical) | Pub Date: 2026-02-01 | Epub Date: 2026-01-24 | DOI: 10.1016/j.compmedimag.2026.102715
GwiSeong Moon , Kyoung Min Moon , Inseo Park , Kanghee Lee , Doohee Lee , Woo Jin Kim , Yoon Kim , Ji Young Hong , Hyun-Soo Choi

Background and objective:

An accurate nasogastric (NG) tube placement assessment is essential to prevent serious complications. However, manual chest X-ray verification is prone to human error and variability. We propose a unified deep learning model that jointly performs segmentation and classification to improve the generalization and reliability of automated NG tube placement assessment.

Methods:

We developed a unified architecture based on nnUNet, which was optimized simultaneously for segmentation and classification. To enhance robustness and reduce overconfidence, we introduce Random Penalty Entropy Loss, which dynamically scales entropy penalties during training. The model was evaluated on internal datasets (5674 chest X-rays from three South Korean hospitals) and an external dataset from MIMIC-CXR.

Results:

On the internal test set, the proposed model outperformed the Wang 2-Stage method (F1: 93.94% vs. 87.39%), particularly in ambiguous cases. Baseline models using Focal Loss or Label Smoothing performed well internally but showed substantial performance drops and miscalibration externally. In contrast, our model with Random Penalty Entropy Loss achieved the best external classification performance (F1: 66.34%, AUROC: 84.82%) and superior calibration (MCE: 0.429, ECE: 0.274).

Conclusion:

The proposed unified model surpasses existing two-stage approaches in classification and calibration. Incorporating Random Penalty Entropy Loss improves robustness and generalization across diverse clinical settings. These results highlight the model’s potential to reduce diagnostic errors and enhance patient safety in NG tube placement assessment.
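
A sketch of the Random Penalty Entropy Loss idea as the Methods describe it: standard cross-entropy plus an entropy-based confidence penalty whose coefficient is redrawn at random each training step. The uniform sampling range and the exact form of the penalty are assumptions, not the paper's published schedule.

```python
import torch
import torch.nn.functional as F

def random_penalty_entropy_loss(logits, target, max_coef=0.5):
    """Cross-entropy plus a randomly scaled confidence penalty: subtracting
    the predictive entropy discourages overconfident distributions, and the
    per-step random coefficient varies the penalty strength during training.
    """
    ce = F.cross_entropy(logits, target)
    p = F.softmax(logits, dim=1)
    entropy = -(p * p.clamp_min(1e-12).log()).sum(dim=1).mean()
    coef = max_coef * torch.rand((), device=logits.device)  # redrawn each step
    return ce - coef * entropy  # lower entropy (overconfidence) raises the loss
```
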
Citations: 0
A hybrid Transformer-CNN framework for uncertainty-guided semi-supervised multiclass eye disease classification with enhanced interpretability
IF 4.9 | CAS Zone 2 (Medicine) | Q1 (Engineering, Biomedical) | Pub Date: 2026-02-01 | Epub Date: 2026-01-08 | DOI: 10.1016/j.compmedimag.2026.102701
Muhammad Hammad Malik , Zishuo Wan , Yingying Ren , Da-Wei Ding
The accurate classification of eye diseases such as cataract, diabetic retinopathy (DR), and glaucoma, as well as healthy conditions, from fundus images remains a critical challenge in ophthalmology, where early diagnosis and treatment are required to prevent vision loss. Existing deep learning methods rely on large labeled datasets, make inefficient use of unlabeled data, and offer limited interpretability, restricting clinical applicability. To address these limitations, we propose a novel CNN-Transformer hybrid architecture coupled with innovative semi-supervised learning (SSL) and explainability techniques to enhance multiclass eye disease classification. Our methodology integrates a ConvNeXt backbone with Transformer modules, leveraging multi-head attention to effectively capture both spatial features and long-range dependencies. We introduce Uncertainty-Guided MixMatch (UG-MixMatch), a semi-supervised framework that leverages Monte Carlo (MC) dropout for uncertainty quantification and pseudo-label refinement, effectively utilizing both labeled and unlabeled data. For interpretability, we propose a novel Gradient-based Integrated Attention Map (GIAM), which aggregates attention maps across multiple layers and incorporates adaptive channel-wise weighting, offering more detailed insight into model predictions than traditional Grad-CAM methods. Evaluated on the Ocular Imaging Health (OIH) dataset of 4215 fundus images across four classes, our approach achieved 95.27% classification accuracy using UG-MixMatch and 95.51% when incorporating MC dropout for direct model evaluation. Cohen's kappa reached 93.70, indicating near-perfect agreement with the ground truth. Class-wise performance was exceptional, with 100% sensitivity and specificity for DR and over 95% specificity for cataract and glaucoma. Robust AUC values were observed, including 1.00 for DR and cataract and 0.99 for glaucoma and healthy cases. GIAM visualizations effectively highlighted disease-relevant regions, offering enhanced clinical interpretability and validation potential. Our framework addresses data scarcity, enhances interpretability, and delivers clinically relevant performance, a promising step towards scalable, explainable, and accurate diagnostic tools for Clinical Decision Support Systems (CDSS) and ophthalmic screening.
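
Monte Carlo dropout, the uncertainty estimator behind UG-MixMatch, is straightforward to sketch: keep dropout stochastic at inference, average the softmax outputs over several passes, and treat the variance as an uncertainty score for pseudo-label filtering. This is a generic sketch, not the paper's full pipeline.

```python
import torch

@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=20):
    """Average softmax predictions over stochastic forward passes and return
    the predictive variance as a per-sample uncertainty score."""
    model.train()  # enables dropout; note this also flips batch-norm, so real
                   # code should switch only the Dropout modules to train mode
    probs = torch.stack([torch.softmax(model(x), dim=1)
                         for _ in range(n_samples)])  # (T, B, K)
    mean = probs.mean(dim=0)                          # predictive distribution
    uncert = probs.var(dim=0).sum(dim=1)              # per-sample uncertainty
    return mean, uncert

# e.g. keep only pseudo-labels whose uncertainty falls below a threshold:
# mean, uncert = mc_dropout_predict(net, batch); mask = uncert < tau
```
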
Citations: 0
Initial evaluation of a mixed-reality system for image-guided navigation during percutaneous liver tumor ablation
IF 4.9 | CAS Zone 2 (Medicine) | Q1 (Engineering, Biomedical) | Pub Date: 2026-02-01 | Epub Date: 2026-01-28 | DOI: 10.1016/j.compmedimag.2026.102714
Dominik Spinczyk , Grzegorz Rosiak , Jarosław Żyłkowski , Krzysztof Milczarek , Dariusz Konecki , Karol Zaczkowski , Agata Tomaszewska , Łukasz Przepióra , Anna Wolińska-Sołtys , Piotr Sperka , Dawid Hajda , Ewa Piętka
Minimally invasive ablation is a challenge for contemporary interventional radiology. This study aimed to investigate the feasibility of using a mixed-reality system for this type of treatment. A HoloLens mixed-reality and optical tracking system, which supports diagnosis, planning, and procedure implementation, was used for percutaneous liver tumor ablation. At the diagnostic stage, the system differentiated pathological liver changes, allowing the entry point and target to be selected during planning. During the procedure, it provided real-time fusion of intraoperative ultrasound images with the pre-operative hologram. Additionally, the collision detection module used the actual needle trajectory to detect collisions between the ablative needle and anatomical structures. The system was evaluated in 11 patients with cancerous liver lesions. For target points selected at the planning stage, the mean registration accuracy during the procedure-support stage was 2.8 mm. Operator depth perception improved, the effective needle trajectory was shortened, and the radiation dose was reduced for both the patient and the operator owing to improved visibility of the needle within the patient's body. Compared with the classical two-dimensional view, operators reported a generally improved understanding of the mutual spatial relationships between anatomical structures, along with improved depth perception of the operating field. A further advantage indicated by the operators was the real-time highlighting of anatomical structures susceptible to damage along the needle trajectory, such as blood vessels, bile ducts, and the lungs, which lowers the risk of complications.
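
The collision-detection step reduces geometrically to a clearance check between the needle segment and point-sampled anatomical structures. The sketch below illustrates that computation only; the system's actual implementation, structure representations, and safety margins are not described in the abstract.

```python
import numpy as np

def needle_clearance(entry, tip, structure_pts):
    """Minimum distance from a straight needle path (entry -> tip) to a
    point-sampled anatomical structure; purely geometric illustration.

    entry, tip:    (3,) needle segment endpoints in mm
    structure_pts: (N, 3) points sampled on the structure surface
    """
    d = tip - entry
    # Parameter of the closest point on the segment for every structure point,
    # clamped to [0, 1] so it stays on the physical needle path.
    t = np.clip((structure_pts - entry) @ d / (d @ d), 0.0, 1.0)
    closest = entry + t[:, None] * d
    return np.linalg.norm(structure_pts - closest, axis=1).min()

# e.g. warn when a vessel comes within a 3 mm margin of the trajectory:
# if needle_clearance(entry, tip, vessel_pts) < 3.0: raise_collision_alert()
```
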
Citations: 0
An interpretable machine learning framework with data-informed imaging biomarkers for diagnosis and prediction of Alzheimer's disease
IF 4.9 | CAS Zone 2 (Medicine) | Q1 (Engineering, Biomedical) | Pub Date: 2026-02-01 | Epub Date: 2026-02-06 | DOI: 10.1016/j.compmedimag.2026.102722
Wenjie Kang , Bo Li , Lize C. Jiskoot , Peter Paul De Deyn , Geert Jan Biessels , Huiberdina L. Koek , Jurgen A.H.R. Claassen , Huub A.M. Middelkoop , Wiesje M. van der Flier , Willemijn J. Jansen , Stefan Klein , Esther E. Bron , Alzheimer’s Disease Neuroimaging Initiative , on behalf of the Parelsnoer Neurodegenerative Diseases study group
Machine learning methods based on imaging and other clinical data have shown great potential for improving the early and accurate diagnosis of Alzheimer's disease (AD). However, for most deep learning models, especially those involving high-dimensional imaging data, the decision-making process remains largely opaque, which limits clinical applicability. Explainable Boosting Machines (EBMs) are inherently interpretable machine learning models, but are typically applied to low-dimensional data. In this study, we propose an interpretable machine learning framework that integrates data-driven feature extraction based on Convolutional Neural Networks (CNNs) with the intrinsic transparency of EBMs for AD diagnosis and prediction. The framework enables interpretation at both the group level and the individual level by identifying imaging biomarkers contributing to predictions. We validated the framework on the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort, achieving an area-under-the-curve (AUC) of 0.969 for AD vs. control classification and 0.750 for MCI conversion prediction. External validation was performed on an independent cohort, yielding AUCs of 0.871 for AD vs. subjective cognitive decline (SCD) classification and 0.666 for MCI conversion prediction. The proposed framework achieves performance comparable to state-of-the-art black-box models while offering transparent decision-making, a critical requirement for clinical translation. Our code is available at: https://gitlab.com/radiology/neuro/interpretable_ad_classification.
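
The CNN-features-into-EBM pattern can be sketched with the interpret library's glassbox EBM; the feature file names and array shapes below are placeholders, not the authors' data or pipeline.

```python
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier

# Hypothetical inputs: an (n_subjects, n_features) array of CNN-derived
# imaging biomarkers and the corresponding diagnostic labels.
cnn_features = np.load("cnn_features.npy")  # placeholder path
labels = np.load("labels.npy")              # placeholder path

# The EBM learns one shape function per feature, so each biomarker's
# contribution to the prediction stays directly inspectable.
ebm = ExplainableBoostingClassifier()
ebm.fit(cnn_features, labels)

# Group-level interpretation: per-feature shape functions and importances.
global_expl = ebm.explain_global()

# Individual-level interpretation: per-subject feature contributions.
local_expl = ebm.explain_local(cnn_features[:1], labels[:1])
```
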
Citations: 0