When multiple instance learning meets foundation models: Advancing histological whole slide image analysis

Medical Image Analysis (IF 11.8; Zone 1, Medicine; JCR Q1, Computer Science, Artificial Intelligence) | Pub Date: 2025-01-14 | DOI: 10.1016/j.media.2025.103456
Hongming Xu, Mingkang Wang, Duanbo Shi, Huamin Qin, Yunpeng Zhang, Zaiyi Liu, Anant Madabhushi, Peng Gao, Fengyu Cong, Cheng Lu
Medical Image Analysis, volume 101, Article 103456. https://www.sciencedirect.com/science/article/pii/S1361841525000040

Abstract

Deep multiple instance learning (MIL) pipelines are the mainstream weakly supervised methods for whole slide image (WSI) classification. However, it remains unclear how these widely used approaches compare with one another, given the recent proliferation of foundation models (FMs) for patch-level embedding and the diversity of slide-level aggregation methods. This paper implemented and systematically compared six FMs and six recent MIL methods by combining different feature extractors and aggregators across seven clinically relevant end-to-end prediction tasks, using WSIs from 4044 patients with four different cancer types. We tested state-of-the-art (SOTA) FMs in computational pathology, including CTransPath, PathoDuet, PLIP, CONCH, and UNI, as patch-level feature extractors. Feature aggregators such as attention-based pooling, transformers, and dynamic graphs were thoroughly tested. Our experiments on cancer grading, biomarker status prediction, and microsatellite instability (MSI) prediction suggest that (1) FMs like UNI, trained with more diverse histological images, outperform generic models with smaller training datasets in patch embeddings, significantly enhancing downstream MIL classification accuracy and model training convergence speed; (2) instance feature fine-tuning, known as online feature re-embedding, which captures both fine-grained details and spatial interactions, can often further improve WSI classification performance; and (3) FMs advance MIL models by enabling promising cancer grading, biomarker status prediction, and MSI prediction without requiring pixel- or patch-level annotations. These findings encourage the development of advanced, domain-specific FMs aimed at more universally applicable diagnostic tasks, in line with the evolving needs of clinical AI in pathology.
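To make the two-stage pipeline in the abstract concrete, the sketch below shows one common form of attention-based MIL pooling: each patch embedding gets a score, the scores are normalized with a softmax, and the slide-level embedding is the attention-weighted sum of the patch embeddings. The abstract names attention-based pooling only as a category, so the specific scoring function (a tanh projection followed by a dot product), the function name `attention_mil_pool`, the parameters `V` and `w`, and the toy dimensions are all assumptions for illustration, not the authors' implementation. The random vectors stand in for frozen FM patch features.

```python
import math
import random


def attention_mil_pool(bag, V, w):
    """Pool a bag of patch embeddings into one slide-level embedding.

    bag: list of n patch embeddings, each a list of d floats
    V:   hidden x d projection matrix (list of rows), hypothetical parameter
    w:   attention vector of length hidden, hypothetical parameter
    Returns (slide_embedding, attention_weights).
    """
    # Score each patch: w . tanh(V h)
    scores = []
    for h in bag:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))

    # Numerically stable softmax over the patches in the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Attention-weighted sum of patch embeddings
    d = len(bag[0])
    slide = [sum(a * h[j] for a, h in zip(attn, bag)) for j in range(d)]
    return slide, attn


# Toy data: 6 "patches" with 4-dimensional embeddings (real FM features
# are far larger; these sizes are chosen only to keep the example small).
random.seed(0)
bag = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]
V = [[random.gauss(0.0, 0.5) for _ in range(4)] for _ in range(3)]
w = [random.gauss(0.0, 1.0) for _ in range(3)]

slide, attn = attention_mil_pool(bag, V, w)
print("attention weights:", [round(a, 3) for a in attn])
print("slide embedding:  ", [round(x, 3) for x in slide])
```

In a trained model, `V` and `w` would be learned together with a slide-level classifier from slide labels alone, which is what lets the pipeline avoid pixel- or patch-level annotations. A tensor library would normally replace the plain-Python loops.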
Source journal: Medical Image Analysis (category: Engineering and Technology, Biomedical Engineering)
CiteScore: 22.10
Self-citation rate: 6.40%
Articles published: 309
Review time: 6.6 months
About the journal: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.