
Medical image analysis: Latest Articles

Advances in automated fetal brain MRI segmentation and biometry: Insights from the FeTA 2024 challenge
IF 11.8 | CAS Tier 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-01 | Epub Date: 2026-01-16 | DOI: 10.1016/j.media.2026.103941
Vladyslav Zalevskyi , Thomas Sanchez , Misha Kaandorp , Margaux Roulet , Diego Fajardo-Rojas , Liu Li , Jana Hutter , Hongwei Bran Li , Matthew J. Barkovich , Hui Ji , Luca Wilhelmi , Aline Dändliker , Céline Steger , Mériam Koob , Yvan Gomez , Anton Jakovčić , Melita Klaić , Ana Adžić , Pavel Marković , Gracia Grabarić , Meritxell Bach Cuadra
Accurate fetal brain tissue segmentation and biometric measurement are essential for monitoring neurodevelopment and detecting abnormalities in utero. The Fetal Tissue Annotation (FeTA) Challenges have established robust multi-center benchmarks for evaluating state-of-the-art segmentation methods. This paper presents the results of the 2024 challenge edition, which introduced three key innovations.
First, we introduced a topology-aware metric based on the Euler characteristic difference (ED) to overcome the performance plateau observed with traditional metrics such as Dice or Hausdorff distance (HD), as the segmentation performance of the best models had surpassed inter-rater variability. While the best teams reached similar scores in Dice (0.81–0.82) and HD95 (2.1–2.3 mm), ED provided greater discriminative power: the winning method achieved an ED of 20.9, roughly a 50% improvement over the second- and third-ranked teams despite comparable Dice scores.
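As an illustration of how such a topology-aware metric can be computed, the hedged sketch below sums per-class Euler number differences using skimage; the function name, connectivity choice, and toy volumes are assumptions, and the official FeTA 2024 definition may differ in detail.

```python
# Hypothetical sketch of a per-class Euler characteristic difference (ED) metric.
# Assumes 3D integer label maps; the exact FeTA 2024 definition may differ.
import numpy as np
from skimage.measure import euler_number

def euler_characteristic_difference(pred, gt, labels, connectivity=3):
    """Sum of absolute Euler-number differences between prediction and ground truth."""
    ed = 0
    for lab in labels:
        chi_pred = euler_number(pred == lab, connectivity=connectivity)
        chi_gt = euler_number(gt == lab, connectivity=connectivity)
        ed += abs(chi_pred - chi_gt)
    return ed

# Toy example: two random 3-class volumes (lower ED means closer topology).
rng = np.random.default_rng(0)
pred = rng.integers(0, 3, size=(32, 32, 32))
gt = rng.integers(0, 3, size=(32, 32, 32))
print(euler_characteristic_difference(pred, gt, labels=[1, 2]))
```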
Second, we introduced a new 0.55T low-field MRI test set, which, when paired with high-quality super-resolution reconstruction, achieved the highest segmentation performance across all test cohorts (Dice=0.86, HD95=1.69 mm, ED=6.26). This provides the first quantitative evidence that low-cost, low-field MRI can match or surpass high-field systems in automated fetal brain segmentation.
Third, the new biometry estimation task exposed a clear performance gap: although the best model reached a mean absolute percentage error (MAPE) of 7.72%, most submissions failed to outperform a simple gestational-age-based linear regression model (MAPE=9.56%), and all remained above the inter-rater variability of 5.38% MAPE.
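The gestational-age baseline referenced here can be pictured as one linear regression per measurement, scored with MAPE; the sketch below uses synthetic numbers (not challenge data) and sklearn's LinearRegression purely for illustration.

```python
# Illustrative sketch: a gestational-age (GA) linear-regression baseline scored with
# mean absolute percentage error (MAPE). Data below are synthetic, not FeTA data.
import numpy as np
from sklearn.linear_model import LinearRegression

def mape(y_true, y_pred):
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

rng = np.random.default_rng(42)
ga_weeks = rng.uniform(20, 35, size=200).reshape(-1, 1)            # gestational age
biometry = 1.2 * ga_weeks.ravel() + rng.normal(0, 2.0, size=200)   # e.g., a diameter in mm

model = LinearRegression().fit(ga_weeks[:150], biometry[:150])
pred = model.predict(ga_weeks[150:])
print(f"baseline MAPE: {mape(biometry[150:], pred):.2f}%")
```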
Finally, by analyzing the top-performing models from FeTA 2024 alongside those from previous challenge editions, we identify ensembles of 3D nnU-Nets trained on both real and synthetic data, with both image- and anatomy-level augmentations, as the most effective approach for fetal brain segmentation. Our quantitative analysis reveals that acquisition site, super-resolution strategy, and image quality are the primary sources of domain shift, informing recommendations to enhance the robustness and generalizability of automated fetal brain analysis methods.
Citations: 0
BIASNet: A bidirectional feature alignment and semantics-guided network for weakly-supervised medical image registration
IF 11.8 | CAS Tier 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-01 | Epub Date: 2025-12-12 | DOI: 10.1016/j.media.2025.103913
Housheng Xie, Xiaoru Gao, Guoyan Zheng
Medical image registration, which establishes spatial correspondences between different medical images, serves as a fundamental process in numerous clinical applications and diagnostic workflows. Despite significant advancement in unsupervised deep learning-based registration methods, these approaches consistently yield suboptimal results compared to their weakly-supervised counterparts. Recent advancements in universal segmentation models have made it easier to obtain anatomical labels from medical images. However, existing registration methods have not fully leveraged the rich anatomical and structural prior information provided by segmentation labels. To address this limitation, we propose a BIdirectional feature Alignment and Semantics-guided Network, referred to as BIASNet, for weakly-supervised image registration. Specifically, starting from multi-scale features extracted from the pre-trained VoCo, fine-tuned using Low-Rank Adaptation (LoRA), we propose a dual-attribute learning scheme, incorporating a novel BIdirectional Alignment and Fusion (BIAF) module for extracting both semantics-wise and intensity-wise features. These two types of features are subsequently fed into a semantics-guided progressive registration framework for accurate deformation field estimation. We further propose an anatomical region deformation consistency learning to regularize the target anatomical regions deformation. Comprehensive experiments conducted on three typical yet challenging datasets demonstrate that our method achieves consistently better results than other state-of-the-art deformable registration approaches. The source code is publicly available at https://github.com/xiehousheng/BIASNet.
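As a rough illustration of the LoRA fine-tuning mentioned above (not the authors' exact configuration), a low-rank adapter keeps the pretrained weight frozen and trains only two small matrices; the rank, scaling, and layer sizes below are arbitrary assumptions.

```python
# Minimal LoRA sketch in PyTorch: the pretrained weight stays frozen and only the
# low-rank factors A and B are trained. Rank and scaling are illustrative choices.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)        # freeze pretrained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # frozen path plus trainable low-rank update
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(256, 128))
out = layer(torch.randn(2, 256))
print(out.shape)  # torch.Size([2, 128])
```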
Citations: 0
GloW-VSNet: A scribble-based weakly supervised framework for global-view vitiligo lesion segmentation
IF 11.8 | CAS Tier 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-01 | Epub Date: 2025-12-21 | DOI: 10.1016/j.media.2025.103920
Yuheng Wang , Yuhan Zheng , Chloe Yue , Thomas Zhang , Jiayue Cai , Chunqi Chang , Harvey Lui , Sunil Kalia , Z. Jane Wang , Tim K. Lee
Vitiligo lesion identification is essential for quantifying disease severity, monitoring disease progression and assessing treatment response, particularly for objective quantification. However, segmenting vitiligo lesions from clinical images is challenging due to indistinct borders, complex backgrounds, and image artifacts. The difficulty increases when handling small and sparse lesions in global-view photographs. Fully supervised segmentation models require extensively annotated datasets, making the labelling process time-consuming and costly. To address these challenges, we propose GloW-VSNet, a scribble-guided weakly supervised segmentation method for global-view vitiligo detection. Our approach integrates differentiable feature clustering with a spatial attention mechanism based on physician-provided scribble annotations, enabling the model to focus on relevant spatial features and improve segmentation accuracy despite background noise and artifacts. Additionally, we introduce spatial continuity optimization to preserve the natural distribution of vitiligo, enhancing segmentation consistency while reducing computational demands. Extensive experiments on two public vitiligo datasets and two private datasets demonstrate that GloW-VSNet achieves state-of-the-art performance. To our knowledge, this is the first study to explore weakly supervised global-view vitiligo segmentation, addressing a critical research gap. Our method enhances the assessment of disease severity and monitoring of treatment response through an objective assessment for real-world applications. Our code is publicly available at https://github.com/YuhanZheng0327/Weakly-Supervised-Vitiligo-Lesion-Segmentation.
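Scribble supervision of this kind is commonly implemented with a partial cross-entropy loss that penalizes only the annotated pixels; the sketch below shows that generic loss, not necessarily GloW-VSNet's exact objective, with a made-up ignore index and toy scribbles.

```python
# Partial cross-entropy for scribble supervision: unlabeled pixels (ignore_index)
# contribute nothing to the loss. This is a generic sketch, not the paper's loss.
import torch
import torch.nn.functional as F

IGNORE = 255  # pixels without a scribble

def partial_cross_entropy(logits, scribbles):
    """logits: (B, C, H, W); scribbles: (B, H, W) with class ids or IGNORE."""
    return F.cross_entropy(logits, scribbles, ignore_index=IGNORE)

logits = torch.randn(2, 2, 64, 64, requires_grad=True)        # lesion vs background
scribbles = torch.full((2, 64, 64), IGNORE, dtype=torch.long)
scribbles[:, 30:34, 30:34] = 1   # a few foreground scribble pixels
scribbles[:, 0:4, 0:4] = 0       # a few background scribble pixels
loss = partial_cross_entropy(logits, scribbles)
loss.backward()
print(float(loss))
```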
Citations: 0
Diagnostic text-guided representation learning in hierarchical classification for pathological whole slide image
IF 11.8 | CAS Tier 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-01 | Epub Date: 2025-12-07 | DOI: 10.1016/j.media.2025.103894
Jiawen Li , Qiehe Sun , Renao Yan , Yizhi Wang , Yuqiu Fu , Yani Wei , Tian Guan , Huijuan Shi , Yonghong He , Anjia Han
With the development of digital imaging in medical microscopy, artificial intelligence-based analysis of pathological whole slide images (WSIs) provides a powerful tool for cancer diagnosis. Limited by the expensive cost of pixel-level annotation, current research primarily focuses on representation learning with slide-level labels, showing success in various downstream tasks. However, given the diversity of lesion types and the complex relationships among them, these techniques still deserve further exploration in addressing advanced pathology tasks. To this end, we introduce the concept of hierarchical pathological image classification and propose a representation learning method called PathTree. PathTree considers the multi-classification of diseases as a binary tree structure. Each category is represented as a professional pathological text description, which passes information through a tree-like encoder. The interactive text features are then used to guide the aggregation of multiple hierarchical representations. PathTree uses slide-text similarity to obtain probability scores and introduces two extra tree-specific losses to further constrain the association between texts and slides. Through extensive experiments on three challenging hierarchical classification datasets: in-house cryosectioned lung tissue lesion identification, public prostate cancer grade assessment, and public breast cancer subtyping, our proposed PathTree is consistently competitive with state-of-the-art methods and provides a new perspective on the deep learning-assisted solution for more complex WSI classification.
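The slide-text similarity scoring can be pictured as a cosine-similarity softmax over per-class text embeddings; in the sketch below, random tensors stand in for PathTree's learned slide and text encoders, and the temperature value is an assumption.

```python
# Sketch of slide-text similarity scores: cosine similarity between one slide
# embedding and one text embedding per class, turned into class probabilities.
# Random tensors stand in for the learned slide/text encoders.
import torch
import torch.nn.functional as F

num_classes, dim = 5, 512
slide_embedding = F.normalize(torch.randn(1, dim), dim=-1)            # aggregated WSI feature
text_embeddings = F.normalize(torch.randn(num_classes, dim), dim=-1)  # one per diagnosis text

temperature = 0.07
logits = slide_embedding @ text_embeddings.T / temperature             # (1, num_classes)
probs = logits.softmax(dim=-1)
print(probs)
```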
Citations: 0
Interpretable classification of endomicroscopic brain data via saliency consistent contrastive learning
IF 11.8 | CAS Tier 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-01 | Epub Date: 2025-12-19 | DOI: 10.1016/j.media.2025.103917
Chi Xu , Alfie Roddan , Irini Kakaletri , Patra Charalampaki , Stamatia Giannarou
In neurosurgery, accurate brain tissue characterisation via probe-based Confocal Laser Endomicroscopy (pCLE) has become popular for guiding surgical decisions and ensuring safe tumour resections. In order to enable surgeons to trust a tissue classification model, interpretability of the result is required. However, state-of-the-art (SOTA) deep learning models for pCLE data classification exhibit limited interpretability. This paper introduces a novel image classification framework for interpretable brain tissue characterisation using pCLE data. Firstly, instead of the commonly employed cross-entropy-based classification loss, we propose a Label Contrastive Learning (LCL) loss to learn intra-category similarities and inter-category contrasts. We are then able to generate highly representative data embeddings, which not only improve classification performance but also distinguish characteristics from different tissue classes. Secondly, we design a Saliency Consistency (SC) module to enable the trained model to generate clinically relevant saliency maps of the input data. To further refine the saliency maps, a novel Top-K Maximum and Minimum Pooling (TK-MMP) layer is introduced to our SC module, to increase the contrast of saliency values between non-clinically relevant and clinically relevant areas. For the first time, the Exponential Moving Average (EMA) is used to update global embeddings of the different tissue categories rather than the weights of the model. In addition, we propose a Global Embedding Inference (GEI) layer that replaces learnable classification layers and achieves more robust classification by estimating the cosine similarity between the input data embeddings and global embeddings. Performance evaluation on ex-vivo and in-vivo pCLE brain data verifies that our proposed approach outperforms SOTA classification models in terms of accuracy, robustness and interpretability. Our source code is released at: https://github.com/XC9292/LCL-SC.git.
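A hedged sketch of the EMA-updated global class embeddings and cosine-similarity (GEI-style) inference is given below; the momentum, dimensions, and per-class averaging scheme are illustrative assumptions rather than the paper's exact settings.

```python
# Sketch: keep one global embedding per tissue class, updated with an exponential
# moving average of batch embeddings, and classify by cosine similarity to them.
# Momentum, sizes, and the averaging scheme are illustrative assumptions.
import torch
import torch.nn.functional as F

num_classes, dim, momentum = 4, 128, 0.99
global_embeddings = F.normalize(torch.randn(num_classes, dim), dim=-1)

def ema_update(batch_embeddings, batch_labels):
    for c in range(num_classes):
        mask = batch_labels == c
        if mask.any():
            class_mean = batch_embeddings[mask].mean(dim=0)
            updated = momentum * global_embeddings[c] + (1 - momentum) * class_mean
            global_embeddings[c] = F.normalize(updated, dim=0)

def classify(embeddings):
    sims = F.normalize(embeddings, dim=-1) @ global_embeddings.T   # cosine similarities
    return sims.argmax(dim=-1)

feats = F.normalize(torch.randn(8, dim), dim=-1)
labels = torch.randint(0, num_classes, (8,))
ema_update(feats, labels)
print(classify(feats))
```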
Citations: 0
Perivascular space identification nnUNet for generalised usage (PINGU)
IF 11.8 | CAS Tier 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-01 | Epub Date: 2025-12-04 | DOI: 10.1016/j.media.2025.103903
Benjamin Sinclair , William Pham , Lucy Vivash , Jasmine Moses , Miranda Lynch , Karina Dorfman , Cassandra Marotta , Shaun Koh , Jacob Bunyamin , Ella Rowsthorn , Alex Jarema , Himashi Peiris , Zhaolin Chen , Sandy R Shultz , David K Wright , Dexiao Kong , Sharon L. Naismith , Terence J. O’Brien , Meng Law
Perivascular spaces (PVSs) form a central component of the brain’s waste clearance system, the glymphatic system. These structures are visible on MRIs when enlarged, and their morphology is associated with aging and neurological disease. Manual quantification of PVS is time-consuming and subjective. Numerous deep learning methods have been developed for automated PVS segmentation. However, the majority of these algorithms have been developed and evaluated on homogeneous datasets and high-resolution scans, perhaps limiting their applicability for the wide range of image qualities acquired in clinical and research settings. In this work we train an nnUNet, a top-performing task-driven biomedical image segmentation deep learning algorithm, on a heterogeneous training sample of manually segmented MRIs of a range of different qualities and resolutions from 7 different datasets acquired on 6 different scanners. These are compared to the two currently publicly available deep learning methods for 3D segmentation of PVS, evaluated on scans with a range of resolutions and qualities. The resulting model, PINGU (Perivascular space Identification Nnunet for Generalised Usage), achieved voxel- and cluster-level Dice scores of 0.50 (SD=0.15) and 0.63 (0.17) in the white matter (WM), and 0.54 (0.11) and 0.66 (0.17) in the basal ganglia (BG). Performance on unseen “external” sites’ data was substantially lower for both PINGU (0.20-0.38 [WM, voxel], 0.29-0.58 [WM, cluster], 0.22-0.36 [BG, voxel], 0.46-0.60 [BG, cluster]) and the publicly available algorithms (0.18-0.30 [WM, voxel], 0.29-0.38 [WM, cluster], 0.10-0.20 [BG, voxel], 0.15-0.37 [BG, cluster]). Nonetheless, PINGU strongly outperformed the publicly available algorithms, particularly in the BG. PINGU stands out as a broad-use PVS segmentation tool, with particular strength in the BG, an area of PVS highly related to vascular disease and pathology.
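Voxel-level Dice is standard, but cluster-level Dice has several possible definitions; one plausible reading, in which each connected component counts as detected if it overlaps the other mask, is sketched below. The exact definition used for PINGU may differ.

```python
# Sketch of voxel-level Dice and one plausible cluster-level Dice: connected
# components (clusters) count as true positives if they overlap the other mask.
# The exact cluster-level definition used for PINGU may differ.
import numpy as np
from scipy import ndimage

def voxel_dice(pred, gt):
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + 1e-8)

def cluster_dice(pred, gt):
    pred_lab, n_pred = ndimage.label(pred)
    gt_lab, n_gt = ndimage.label(gt)
    tp_pred = sum(1 for i in range(1, n_pred + 1) if gt[pred_lab == i].any())
    tp_gt = sum(1 for i in range(1, n_gt + 1) if pred[gt_lab == i].any())
    return (tp_pred + tp_gt) / (n_pred + n_gt + 1e-8)

rng = np.random.default_rng(1)
pred = rng.random((48, 48, 48)) > 0.98   # toy sparse binary masks
gt = rng.random((48, 48, 48)) > 0.98
print(voxel_dice(pred, gt), cluster_dice(pred, gt))
```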
Citations: 0
Multi-cancer framework with cancer-aware attention and adversarial mutual-information minimization for whole slide image classification
IF 11.8 | CAS Tier 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-01 | Epub Date: 2025-12-29 | DOI: 10.1016/j.media.2025.103927
Sharon Peled , Yosef E. Maruvka , Moti Freiman
Whole Slide Images (WSIs) are crucial in modern pathology, offering high-resolution data for accurate diagnosis, treatment planning, and research. Deep learning methods have recently been proposed to harness this data by extracting and interpreting complex patterns. However, these approaches often focus on specific tumor types, limiting their generalizability across diverse pathological conditions and restricting scalability. This relatively narrow focus ultimately stems from the inherent heterogeneity in histopathology and the diverse morphological and molecular characteristics of different tumors. To this end, we propose a novel approach for multi-cancer WSI analysis, designed to leverage the diversity of different tumor types. We introduce a Cancer-Aware Attention module that models both shared patterns across cancers and cancer-specific variations to address heterogeneity and enhance cross-tumor generalization. Furthermore, we construct an adversarial cancer regularization mechanism to minimize cancer-specific biases through mutual information minimization. Additionally, we develop a hierarchical sample balancing strategy to mitigate data imbalances and promote unbiased learning. Together, these form a cohesive framework for unbiased multi-cancer WSI analysis. Extensive experiments on a uniquely constructed multi-cancer dataset demonstrate significant improvements in generalization, providing a scalable solution for WSI classification across diverse cancer types.
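One common way to realize adversarial minimization of nuisance information, shown here only as a generic illustration rather than the paper's mechanism, is a gradient reversal layer in front of an auxiliary cancer-type classifier.

```python
# Generic sketch of adversarial de-biasing with a gradient reversal layer (GRL):
# the cancer-type head learns to predict the cancer, while reversed gradients push
# the shared features to discard cancer-specific information. Sizes are arbitrary.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

features = nn.Sequential(nn.Linear(256, 128), nn.ReLU())   # shared WSI features
task_head = nn.Linear(128, 2)                              # e.g., tumor grade
cancer_head = nn.Linear(128, 10)                            # auxiliary cancer-type classifier

x = torch.randn(4, 256)
h = features(x)
task_logits = task_head(h)
cancer_logits = cancer_head(GradReverse.apply(h, 1.0))      # gradients reversed here
loss = task_logits.sum() + cancer_logits.sum()              # placeholder losses
loss.backward()
print(task_logits.shape, cancer_logits.shape)
```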
Citations: 0
Non-contrast CT esophageal varices grading through clinical prior-enhanced multi-organ analysis
IF 11.8 | CAS Tier 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-01 | Epub Date: 2025-12-24 | DOI: 10.1016/j.media.2025.103924
Xiaoming Zhang , Chunli Li , Jiacheng Hao , Yuan Gao , Danyang Tu , Jianyi Qiao , Xiaoli Yin , Le Lu , Ling Zhang , Ke Yan , Yang Hou , Yu Shi
Esophageal varices (EV) represent a critical complication of portal hypertension, affecting approximately 60% of cirrhosis patients and carrying a bleeding risk of ∼30%. While traditionally diagnosed through invasive endoscopy, non-contrast computed tomography (NCCT) presents a potential non-invasive alternative that has yet to be fully utilized in clinical practice. We present Multi-Organ-COhesion Network++ (MOON++), a novel multimodal framework that enhances EV assessment through comprehensive analysis of NCCT scans. Inspired by clinical evidence correlating organ volumetric relationships with liver disease severity, MOON++ synthesizes imaging characteristics of the esophagus, liver, and spleen through multimodal learning. We evaluated our approach on 1631 patients; those with endoscopically confirmed EV were classified into four severity grades. Validation in 239 patient cases and independent testing in 289 cases demonstrate superior performance compared to conventional single-organ methods, achieving an AUC of 0.894 versus 0.803 for severe-grade EV classification (G3 versus <G3) and 0.921 versus 0.793 for the differentiation of moderate to severe grades (≥G2 versus <G2). We conducted a reader study involving experienced radiologists to further validate the performance of MOON++. To our knowledge, MOON++ represents the first comprehensive multi-organ NCCT analysis framework incorporating clinical knowledge priors for EV assessment, potentially offering a promising non-invasive diagnostic alternative. Code is available at https://github.com/StevenHaojc/MOON.
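The multi-organ analysis can be pictured, at its simplest, as late fusion of per-organ feature vectors into a four-grade head; the sketch below is an illustrative stand-in, and the organ encoders, feature sizes, and head are assumptions, not MOON++'s architecture.

```python
# Illustrative late-fusion sketch: per-organ feature vectors (esophagus, liver,
# spleen) are concatenated and fed to a small head that predicts one of four
# EV severity grades. Encoders and sizes are placeholders, not MOON++ itself.
import torch
import torch.nn as nn

organ_dims = {"esophagus": 128, "liver": 128, "spleen": 128}
head = nn.Sequential(nn.Linear(sum(organ_dims.values()), 64), nn.ReLU(), nn.Linear(64, 4))

organ_features = {name: torch.randn(2, d) for name, d in organ_dims.items()}  # batch of 2
fused = torch.cat([organ_features[n] for n in organ_dims], dim=1)
grade_logits = head(fused)
print(grade_logits.shape)  # torch.Size([2, 4])
```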
Citations: 0
CLIP-Guided Generative network for pathology nuclei image augmentation
IF 11.8 | CAS Tier 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-01 | Epub Date: 2025-12-14 | DOI: 10.1016/j.media.2025.103908
Yanan Zhang , Qingyang Liu , Qian Chen , Xiangzhi Bai
Nuclei segmentation and classification play a crucial role in the quantitative analysis of computational pathology (CPath). However, the challenge of creating a large volume of labeled pathology nuclei images due to annotation costs has significantly limited the performance of deep learning-based nuclei segmentation methods. Generative data augmentation offers a promising solution by substantially expanding the available training data without additional annotations. In medical image analysis, Generative Adversarial Networks (GANs) have proven effective for data augmentation, enhancing model performance by generating realistic synthetic data. However, these approaches lack scalability for multi-class data, as nuclei masks cannot provide sufficient information for diverse image generation. Recently, vision-language foundation models, pretrained on large-scale image-caption pairs, have demonstrated robust performance in pathological diagnostic tasks. In this study, we propose a CLIP-guided generative data augmentation method for nuclei segmentation and classification, leveraging the pretrained pathological CLIP text and image encoders in both the generator and discriminator. Specifically, we first create text descriptions by processing paired histopathology images and nuclei masks, which include information such as organ tissue type, cell count, and nuclei types. These paired text descriptions and nuclei masks are then fed into our multi-modal conditional image generator to guide the synthesis of realistic histopathology images. To ensure the quality of synthesized images, we utilize a high-resolution image discriminator and a CLIP image encoder-based discriminator, focusing on both local and global features of histopathology images. The synthetic histopathology images, paired with corresponding nuclei masks, are integrated into the real dataset to train the nuclei segmentation and classification model. Our experiments, conducted on diverse publicly available pathology nuclei datasets, including both qualitative and quantitative analysis, demonstrate the effectiveness of our proposed method. The experimental results of the nuclei segmentation and classification task underscore the advantages of our data augmentation approach. The code is available at https://github.com/zhangyn1415/CGPN-GAN.
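The mask-derived text descriptions can be pictured as templated prompts built from per-class nuclei counts; the class names, template wording, and toy mask in the sketch below are assumptions for illustration, not the paper's exact captions.

```python
# Sketch: build a caption-like text description from a nuclei class mask.
# Class names, template, and the toy mask are illustrative assumptions.
import numpy as np
from scipy import ndimage

CLASS_NAMES = {1: "epithelial", 2: "inflammatory", 3: "connective"}

def describe(class_mask, tissue="colon"):
    parts = []
    total = 0
    for cls_id, name in CLASS_NAMES.items():
        _, n = ndimage.label(class_mask == cls_id)   # count instances of this class
        total += n
        if n:
            parts.append(f"{n} {name} nuclei")
    return f"A {tissue} tissue patch containing {total} nuclei: " + ", ".join(parts) + "."

mask = np.zeros((64, 64), dtype=np.int64)
mask[5:10, 5:10] = 1
mask[20:24, 20:24] = 2
mask[40:45, 40:46] = 2
print(describe(mask))
```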
Citations: 0
Unsupervised anomaly detection in medical imaging using aggregated normative diffusion
IF 11.8 | CAS Tier 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-01 | Epub Date: 2025-12-08 | DOI: 10.1016/j.media.2025.103895
Alexander Frotscher , Jaivardhan Kapoor , Thomas Wolfers , Christian F. Baumgartner
Early detection of anomalies in medical images such as brain magnetic resonance imaging (MRI) is highly relevant for diagnosis and treatment of many medical conditions. Supervised machine learning methods are limited to a small number of pathologies where there is good availability of labeled data. In contrast, unsupervised anomaly detection (UAD) has the potential to identify a broader spectrum of anomalies by spotting deviations from normal patterns. Our research demonstrates that previous state-of-the-art UAD approaches do not generalise well to diverse types of anomalies in multi-modal MRI data. To overcome this, we introduce a new UAD method named Aggregated Normative Diffusion (ANDi). ANDi operates by aggregating differences between predicted denoising steps and ground-truth backward transitions in Denoising Diffusion Probabilistic Models (DDPMs) that have been trained on pyramidal Gaussian noise. We validate ANDi against four recent UAD baselines, and across three diverse brain MRI datasets. We show that ANDi, in some cases, substantially surpasses these baselines and shows increased robustness to varying types of anomalies. Particularly in detecting multiple sclerosis (MS) lesions, ANDi achieves improvements of up to 44% (0.302 to 0.436 on Ljubljana, +0.134) in terms of AUPRC.
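The aggregation idea can be sketched generically: noise the image at several timesteps, let the DDPM predict the denoising step, compare it with the transition implied by the true noise, and accumulate per-voxel differences. The model interface and schedules below are standard DDPM assumptions, not ANDi's released code.

```python
# Generic sketch of aggregating per-voxel differences between the model-predicted
# denoising step and the ground-truth backward transition of a DDPM.
# `model(x_t, t)` is assumed to predict the added noise; schedules are standard DDPM.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def anomaly_map(model, x0, timesteps):
    score = torch.zeros_like(x0)
    for t in timesteps:
        a_bar = alphas_cumprod[t]
        noise = torch.randn_like(x0)
        x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise   # forward diffusion
        pred_noise = model(x_t, t)                              # model's denoising prediction
        # the true backward transition is determined by the sampled noise, so the
        # per-voxel discrepancy can be measured directly in noise space
        score += (pred_noise - noise).abs()
    return score / len(timesteps)

dummy_model = lambda x_t, t: torch.zeros_like(x_t)              # placeholder network
x0 = torch.randn(1, 1, 32, 32)
print(anomaly_map(dummy_model, x0, timesteps=[100, 300, 500]).shape)
```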
Citations: 0