Martim Afonso , Praphulla M.S. Bhawsar , Monjoy Saha , Jonas S. Almeida , Arlindo L. Oliveira
Title: Multiple Instance Learning for WSI: A comparative analysis of attention-based approaches
Journal of Pathology Informatics, volume 15, Article 100403. Published 2024-10-20.
DOI: 10.1016/j.jpi.2024.100403
URL: https://www.sciencedirect.com/science/article/pii/S2153353924000427
Citations: 0
Abstract
Whole slide images (WSI), obtained by high-resolution digital scanning of microscope slides at multiple scales, are the cornerstone of modern Digital Pathology. However, they pose a particular challenge to artificial intelligence (AI)-based or AI-mediated analysis because pathology labeling is typically done at the slide level rather than the tile level. Not only are medical diagnoses recorded at the specimen level; oncogene mutation status is also experimentally determined, and recorded by initiatives such as The Cancer Genome Atlas (TCGA), at the slide level. This poses a dual challenge: (a) accurately predicting the overall cancer phenotype and (b) finding out which cellular morphologies are associated with it at the tile level. To better understand and address these challenges, two existing weakly supervised Multiple Instance Learning (MIL) approaches were explored and compared: Attention MIL (AMIL) and Additive MIL (AdMIL). These architectures were analyzed on tumor detection (a task on which these models have previously obtained good results) and TP53 mutation detection (a much less explored task). For tumor detection, we built a dataset from Lung Squamous Cell Carcinoma (TCGA-LUSC) slides, with 349 positive and 349 negative slides. Patches were extracted at 5× magnification. For TP53 mutation detection, we explored a dataset built from Invasive Breast Carcinoma (TCGA-BRCA) slides, with 347 positive and 347 negative slides. In this case, we explored three different magnification levels: 5×, 10×, and 20×. Our results show that a modified additive implementation of MIL matched the performance of the reference implementation (AUC 0.96), and was only slightly outperformed by AMIL (AUC 0.97) on the tumor detection task. TP53 mutation detection was most sensitive to features at the higher magnifications, where cellular morphology is resolved.
More interestingly from the perspective of the molecular pathologist, we highlight the possible ability of these MIL architectures to identify distinct sensitivities to morphological features (through the detection of regions of interest, ROIs) at different magnification levels. This ability of the models to obtain tile-level ROIs is very appealing to pathologists, as it opens the possibility of integrating these algorithms into a digital staining application for analysis, facilitating both navigation through these high-dimensional images and the diagnostic process.
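To make the distinction between the two compared approaches concrete, the sketch below contrasts attention-based pooling with additive pooling as they are commonly described in the MIL literature. It is an illustration only and not the authors' implementation: the function names, the random weights, the feature dimensions, and the small ReLU classifier are all assumptions chosen for brevity. In the attention variant, tile features are averaged with learned weights and the slide-level classifier sees only the pooled vector. In the additive variant, the classifier is applied to each weighted tile and the per-tile logits are summed, so each tile's contribution to the slide prediction can be read off directly.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def attention_weights(H, V, w):
    """One attention weight per tile. H: (n_tiles, d), V: (d, k), w: (k,)."""
    return softmax(np.tanh(H @ V) @ w)


def classifier(x, W1, W2):
    """Hypothetical one-hidden-layer ReLU classifier; accepts (d,) or (n, d)."""
    return np.maximum(x @ W1, 0.0) @ W2


def amil_forward(H, V, w, W1, W2):
    """Attention pooling: pool the tile features first, then classify once."""
    a = attention_weights(H, V, w)
    z = a @ H  # attention-weighted mean of tile features, shape (d,)
    return classifier(z, W1, W2), a


def admil_forward(H, V, w, W1, W2):
    """Additive pooling: classify each weighted tile, then sum the logits."""
    a = attention_weights(H, V, w)
    tile_logits = classifier(a[:, None] * H, W1, W2)  # (n_tiles, n_classes)
    return tile_logits.sum(axis=0), tile_logits


# Toy bag: 6 tiles with 8-dimensional features (all values are random,
# assumed stand-ins for features from a pretrained tile encoder).
rng = np.random.default_rng(0)
H = rng.normal(size=(6, 8))
V = rng.normal(size=(8, 4))
w = rng.normal(size=4)
W1 = rng.normal(size=(8, 5))
W2 = rng.normal(size=(5, 2))

amil_logits, amil_attn = amil_forward(H, V, w, W1, W2)
admil_logits, admil_tiles = admil_forward(H, V, w, W1, W2)
```

In this sketch, `admil_tiles` holds an exact per-tile decomposition of the slide logits, which is the property that makes tile-level ROI maps straightforward to derive in the additive case; for the attention variant, only the weights in `amil_attn` are available as a proxy for tile importance.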
About the journal:
The Journal of Pathology Informatics (JPI) is an open access peer-reviewed journal dedicated to the advancement of pathology informatics. This is the official journal of the Association for Pathology Informatics (API). The journal aims to publish broadly about pathology informatics and freely disseminate all articles worldwide. This journal is of interest to pathologists, informaticians, academics, researchers, health IT specialists, information officers, IT staff, vendors, and anyone with an interest in informatics. We encourage submissions from anyone with an interest in the field of pathology informatics. We publish all types of papers related to pathology informatics including original research articles, technical notes, reviews, viewpoints, commentaries, editorials, symposia, meeting abstracts, book reviews, and correspondence to the editors. All submissions are subject to rigorous peer review by the well-regarded editorial board and by expert referees in appropriate specialties.