CIMIL-CRC: A clinically-informed multiple instance learning framework for patient-level colorectal cancer molecular subtypes classification from H&E stained images

Impact Factor: 4.9 | Zone 2 (Medicine) | JCR Q1 (Computer Science, Interdisciplinary Applications) | Computer Methods and Programs in Biomedicine | Pub Date: 2024-11-19 | DOI: 10.1016/j.cmpb.2024.108513
Hadar Hezi, Matan Gelber, Alexander Balabanov, Yosef E. Maruvka, Moti Freiman
Computer Methods and Programs in Biomedicine, Volume 259, Article 108513 (Journal Article). Full text: https://www.sciencedirect.com/science/article/pii/S0169260724005066
Citations: 0

Abstract

Background and objective:

Treatment for colorectal cancer (CRC) depends strongly on the molecular subtype: immunotherapy has shown efficacy in tumors with microsatellite instability (MSI) but is ineffective for the microsatellite stable (MSS) subtype. Deep neural networks (DNNs) show promise for automating the differentiation of CRC subtypes from hematoxylin and eosin (H&E) stained whole-slide images (WSIs). Because WSIs are very large, multiple instance learning (MIL) techniques are typically used. However, existing MIL methods focus on identifying the most representative image patches for classification, which may discard critical information. These methods also often overlook clinically relevant information, such as the tendency of MSI tumors to occur predominantly in the proximal (right-sided) colon.

Methods:

We introduce 'CIMIL-CRC', a DNN framework that (1) addresses the MSI/MSS MIL problem by combining a pre-trained feature extraction model with principal component analysis (PCA) to aggregate information from all patches, and (2) integrates clinical priors, in particular the tumor location within the colon, into the model to improve patient-level classification accuracy. We evaluated CIMIL-CRC on the TCGA-CRC-DX cohort using the average area under the receiver operating characteristic curve (AUROC) over a 5-fold cross-validation setup for model development, comparing it with a baseline patch-level classification, a MIL-only approach, and a clinically-informed patch-level classification approach.
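The abstract does not say how PCA is applied or how the tumor location enters the model, so the following is only an illustrative sketch of one possible reading, not the authors' implementation. It assumes patch features have already been produced by some pre-trained extractor, represents a patient by the leading principal directions of that patient's patch-feature matrix (so every patch contributes), and appends a binary proximal/distal flag as the clinical prior. The function names, the number of components, and the binary encoding of location are all hypothetical.

```python
import numpy as np


def aggregate_patches_pca(patch_features: np.ndarray, n_components: int = 4) -> np.ndarray:
    """Summarize an (n_patches, n_features) matrix by its top principal directions.

    Hypothetical aggregation: all patches shape the principal axes, so no
    patch is discarded by a top-k selection step.
    """
    centered = patch_features - patch_features.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions in feature space, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Remove the sign ambiguity of each direction: make its
    # largest-magnitude entry positive.
    idx = np.argmax(np.abs(components), axis=1)
    signs = np.sign(components[np.arange(len(components)), idx])
    return (components * signs[:, None]).reshape(-1)


def patient_vector(patch_features: np.ndarray, tumor_is_proximal: bool,
                   n_components: int = 4) -> np.ndarray:
    """Concatenate the PCA summary with a binary tumor-location prior (assumed encoding)."""
    pca_part = aggregate_patches_pca(patch_features, n_components)
    location = np.array([1.0 if tumor_is_proximal else 0.0])
    return np.concatenate([pca_part, location])


# Synthetic stand-in for extractor output: 200 patches, 16 features each.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 16))
vec = patient_vector(features, tumor_is_proximal=True, n_components=4)
```

The resulting fixed-length vector could then be passed to any patient-level classifier; which classifier the paper uses is not stated in the abstract.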

Results:

Our CIMIL-CRC outperformed all compared methods (AUROC: 0.92±0.002 (95% CI 0.91–0.92), vs. 0.79±0.02 (95% CI 0.76–0.82), 0.86±0.01 (95% CI 0.85–0.88), and 0.87±0.01 (95% CI 0.86–0.88) for the baseline patch-level, MIL-only, and clinically-informed patch-level approaches, respectively). The improvement was statistically significant. To the best of our knowledge, this is the best result reported for MSI/MSS classification on this dataset.
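To make the reported "mean ± spread (95% CI)" format concrete, here is a minimal sketch of how per-fold AUROCs could be computed and summarized. The abstract does not state how the ± term or the confidence intervals were derived, so the sample standard deviation and the t-based interval below (with the critical value 2.776 for 4 degrees of freedom, i.e. 5 folds) are assumptions, and the function names are hypothetical.

```python
import numpy as np


def auroc(labels, scores) -> float:
    """AUROC via the Mann-Whitney U statistic; tied scores get average ranks."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores), dtype=float)
    i = 0
    while i < len(scores):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0  # average 1-based rank
        i = j + 1
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))


def summarize_folds(fold_aurocs, t_crit: float = 2.776):
    """Mean, sample std, and a t-based 95% CI (t_crit assumes 5 folds)."""
    a = np.asarray(fold_aurocs, dtype=float)
    mean = a.mean()
    std = a.std(ddof=1)
    half_width = t_crit * std / np.sqrt(len(a))
    return mean, std, (mean - half_width, mean + half_width)


# Toy example: MSI = 1, MSS = 0, with model scores for four patients.
example_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In a 5-fold setup, `auroc` would be evaluated once per held-out fold and the five values passed to `summarize_folds`.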

Conclusion:

Our CIMIL-CRC method holds promise for offering insights into the key representations of histopathological images and is straightforward to implement.
Source journal
Computer Methods and Programs in Biomedicine (Engineering and Technology, Engineering: Biomedical)
CiteScore: 12.30
Self-citation rate: 6.60%
Articles published: 601
Review time: 135 days
Journal description: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.
Latest articles from this journal
Computational hemodynamic indices to identify Transcatheter Aortic Valve Implantation degeneration
One-class classification with confound control for cognitive screening in older adults using gait, fingertapping, cognitive, and dual tasks
A porohyperelastic scheme targeted at High-Performance Computing frameworks for the simulation of the intervertebral disc
CIMIL-CRC: A clinically-informed multiple instance learning framework for patient-level colorectal cancer molecular subtypes classification from H&E stained images
CTGAN-driven synthetic data generation: A multidisciplinary, expert-guided approach (TIMA)