Advancements in digital pathology and computing resources have made a significant impact in the field of computational pathology for breast cancer diagnosis and treatment. However, access to high-quality labeled histopathological images of breast cancer is a big challenge that limits the development of accurate and robust deep learning models. In this scoping review, we identified the publicly available datasets of breast H&E-stained whole-slide images (WSIs) that can be used to develop deep learning algorithms. We systematically searched 9 scientific literature databases and 9 research data repositories and found 17 publicly available datasets containing 10 385 H&E WSIs of breast cancer. Moreover, we reported image metadata and characteristics for each dataset to assist researchers in selecting proper datasets for specific tasks in breast cancer computational pathology. In addition, we compiled 2 lists of breast H&E patches and private datasets as supplementary resources for researchers. Notably, only 28% of the included articles utilized multiple datasets, and only 14% used an external validation set, suggesting that the performance of other developed models may be susceptible to overestimation. The TCGA-BRCA was used in 52% of the selected studies. This dataset has a considerable selection bias that can impact the robustness and generalizability of the trained algorithms. There is also a lack of consistent metadata reporting of breast WSI datasets that can be an issue in developing accurate deep learning models, indicating the necessity of establishing explicit guidelines for documenting breast WSI dataset characteristics and metadata.
Title: Publicly available datasets of breast histopathology H&E whole-slide images: A scoping review
Masoud Tafavvoghi , Lars Ailo Bongo , Nikita Shvetsov , Lill-Tove Rasmussen Busund , Kajsa Møllersen
Journal of Pathology Informatics | Pub Date: 2024-02-01 | DOI: 10.1016/j.jpi.2024.100363
Pub Date: 2024-01-14 | DOI: 10.1016/j.jpi.2023.100357
Mahdi S. Hosseini , Babak Ehteshami Bejnordi , Vincent Quoc-Huy Trinh , Lyndon Chan , Danial Hasan , Xingwen Li , Stephen Yang , Taehyo Kim , Haochen Zhang , Theodore Wu , Kajanan Chinniah , Sina Maghsoudlou , Ryan Zhang , Jiadai Zhu , Samir Khaki , Andrei Buin , Fatemeh Chaji , Ala Salehi , Bich Ngoc Nguyen , Dimitris Samaras , Konstantinos N. Plataniotis
Computational Pathology (CPath) is an interdisciplinary science that augments the development of computational approaches to analyze and model medical histopathology images. The main objective of CPath is to develop infrastructure and workflows for digital diagnostics as an assistive CAD system for clinical pathology, facilitating transformational changes in the diagnosis and treatment of cancer that are mainly addressed by CPath tools. With ever-growing developments in deep learning and computer vision algorithms, and the ease of data flow from digital pathology, CPath is currently witnessing a paradigm shift. Despite the sheer volume of engineering and scientific work being introduced for cancer image analysis, there is still a considerable gap in adopting and integrating these algorithms in clinical practice. This raises a significant question regarding the direction and trends being undertaken in CPath. In this article, we provide a comprehensive review of more than 800 papers to address the challenges faced from problem design all the way to application and implementation viewpoints. We have catalogued each paper into a model card by examining the key works and challenges faced, in order to lay out the current landscape in CPath. We hope this helps the community locate relevant works and facilitates understanding of the field's future directions. In a nutshell, we view CPath developments as a cycle of stages that must be cohesively linked together to address the challenges associated with such a multidisciplinary science. We overview this cycle from the different perspectives of data-centric, model-centric, and application-centric problems. We finally sketch the remaining challenges and provide directions for future technical developments and clinical integration of CPath. For updated information on this survey and access to the original model cards repository, please refer to GitHub. An updated version of this draft can also be found on arXiv.
Title: Computational pathology: A survey review and the way forward (Journal of Pathology Informatics)
Pub Date: 2024-01-04 | DOI: 10.1016/j.jpi.2023.100358
Richard F. Xiang
Natural language processing (NLP) has been used to extract information from and summarize medical reports. Currently, the most advanced NLP models require large training datasets of accurately labeled medical text. One approach to creating these large datasets is to use classical NLP algorithms that require few computing resources. In this manuscript, we examined how well an automated classical NLP algorithm was able to classify portions of bone marrow report text into their appropriate sections. A total of 1480 bone marrow reports were extracted from the laboratory information system of a tertiary healthcare network. The free text of these bone marrow reports was preprocessed by separating the reports into text blocks and then removing the section headers. A natural language processing algorithm involving n-grams and K-means clustering was used to classify the text blocks into their appropriate bone marrow sections. We assessed the impact of token replacement (of numerical values, accession numbers, and clusters of differentiation), of varying the number of centroids (1–19) and the n-gram size (1–5), and of utilizing an ensemble algorithm. The optimal NLP model was found to employ an ensemble algorithm that incorporated token replacement, utilized 1-grams (bag of words), and used 10 centroids for K-means clustering. This optimal model was able to classify text blocks with an accuracy of 89%, suggesting that classical NLP models can accurately classify portions of marrow report text.
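The pipeline described in this abstract (token replacement, n-gram counts, K-means clustering) can be illustrated with a minimal pure-Python sketch. It is not the author's implementation: the placeholder tokens `<CD>` and `<NUM>`, the example text blocks, the farthest-point initialisation, and all function names are assumptions made for illustration, and the ensemble step and the mapping from clusters to report sections are not shown.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase, replace CD markers and numbers with placeholders, split into tokens."""
    text = text.lower()
    text = re.sub(r"\bcd\d+[a-z]?\b", " <CD> ", text)   # e.g. CD34 -> <CD> (assumed scheme)
    text = re.sub(r"\d+(?:\.\d+)?", " <NUM> ", text)    # e.g. 92 -> <NUM> (assumed scheme)
    return re.findall(r"<CD>|<NUM>|[a-z]+", text)

def ngram_counts(tokens, n):
    """Count contiguous n-grams; n=1 gives a bag of words."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def vectorize(blocks, n=1):
    """Turn each text block into a count vector over the shared n-gram vocabulary."""
    counts = [ngram_counts(tokenize(b), n) for b in blocks]
    vocab = sorted(set().union(*counts))
    return [[c[g] for g in vocab] for c in counts]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = []
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:               # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical text blocks, invented for this example.
blocks = [
    "Peripheral blood smear shows normocytic anemia with hemoglobin 92",
    "Peripheral blood smear shows mild anemia and hemoglobin 101",
    "Flow cytometry detects blasts positive for CD34 and CD117",
    "Flow cytometry detects blasts positive for CD33 and CD13",
]
labels = kmeans(vectorize(blocks, n=1), k=2)
print(labels)
```

Token replacement makes the two flow cytometry blocks identical after preprocessing, which is one plausible reason such replacement could help clustering; the abstract itself reports only that the optimal model incorporated it.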
Title: Use of n-grams and K-means clustering to classify data from free text bone marrow reports (Journal of Pathology Informatics)
Pub Date: 2023-12-30 | DOI: 10.1016/j.jpi.2023.100361
Jerome Cheng , Carl Schmidt , Allecia Wilson , Zixi Wang , Wei Hao , Joshua Pantanowitz , Catherine Morris , Randy Tashjian , Liron Pantanowitz
Certain features are helpful in the identification of gunshot entrance and exit wounds, such as the presence of muzzle imprints, peripheral tears, stippling, bone beveling, and wound border irregularity. Some cases are less straightforward and wounds can thus pose challenges to an emergency room doctor or forensic pathologist. In recent years, deep learning has shown promise in various automated medical image classification tasks.
This study explores the feasibility of using a deep learning model to classify entry and exit gunshot wounds in digital color images. A collection of 2418 images of entrance and exit gunshot wounds was procured. Of these, 2028 entrance and 1314 exit wounds were cropped, focusing on the area around each gunshot wound. A ConvNext Tiny deep learning model was trained using the Fastai deep learning library, with a train/validation split ratio of 70/30, until a maximum validation accuracy of 92.6% was achieved. An additional 415 entrance and 293 exit wound images were collected for the test (holdout) set. The model achieved an accuracy of 87.99%, a precision of 83.99%, a recall of 87.71%, and an F1-score of 85.81% on the holdout set. The model correctly classified 88.19% of entrance wounds and 87.71% of exit wounds. The results are comparable to what a forensic pathologist can achieve without other morphologic cues. This study represents one of the first applications of artificial intelligence to the field of forensic pathology. This work demonstrates that deep learning models can discern entrance and exit gunshot wounds in digital images with high accuracy.
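As a consistency check on the holdout metrics reported in this abstract, the sketch below recomputes them from a confusion matrix. The per-class counts of correct predictions (366 entrance, 257 exit) are not given in the abstract; they are inferred here from the reported per-class percentages and holdout sizes, and treating exit wounds as the positive class is likewise an assumption that happens to reproduce the reported figures.

```python
# Holdout set sizes as stated in the abstract.
entrance_total, exit_total = 415, 293

# Assumed counts of correct predictions, inferred from 88.19% and 87.71%.
entrance_correct, exit_correct = 366, 257

# Confusion matrix with "exit wound" as the positive class (assumption).
tp = exit_correct                        # exit wounds predicted as exit
fn = exit_total - exit_correct           # exit wounds predicted as entrance
tn = entrance_correct                    # entrance wounds predicted as entrance
fp = entrance_total - entrance_correct   # entrance wounds predicted as exit

accuracy = (tp + tn) / (entrance_total + exit_total)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
entrance_rate = entrance_correct / entrance_total

for name, value in [("accuracy", accuracy), ("precision", precision),
                    ("recall", recall), ("F1", f1),
                    ("entrance correctly classified", entrance_rate)]:
    print(f"{name}: {100 * value:.2f}%")
```

Under these assumed counts the recall equals the share of correctly classified exit wounds, which matches the abstract reporting 87.71% for both.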
Title: Artificial intelligence for human gunshot wound classification (Journal of Pathology Informatics)
In this study, we present a deep-learning-based multimodal classification method for lymphoma diagnosis in digital pathology, which utilizes a whole slide image (WSI) as the primary image data and flow cytometry (FCM) data as auxiliary information. In pathological diagnosis of malignant lymphoma, FCM serves as valuable auxiliary information during the diagnosis process, offering useful insights into predicting the major class (superclass) of subtypes. By incorporating both images and FCM data into the classification process, we can develop a method that mimics the diagnostic process of pathologists, enhancing the explainability. In order to incorporate the hierarchical structure between superclasses and their subclasses, the proposed method utilizes a network structure that effectively combines the mixture of experts (MoE) and multiple instance learning (MIL) techniques, where MIL is widely recognized for its effectiveness in handling WSIs in digital pathology. The MoE network in the proposed method consists of a gating network for superclass classification and multiple expert networks for (sub)class classification, specialized for each superclass. To evaluate the effectiveness of our method, we conducted experiments involving a six-class classification task using 600 lymphoma cases. The proposed method achieved a classification accuracy of 72.3%, surpassing the 69.5% obtained through the straightforward combination of FCM and images, as well as the 70.2% achieved by the method using only images. Moreover, the combination of multiple weights in the MoE and MIL allows for the visualization of specific cellular and tumor regions, resulting in a highly explanatory model that cannot be attained with conventional methods. It is anticipated that by targeting a larger number of classes and increasing the number of expert networks, the proposed method could be effectively applied to the real problem of lymphoma diagnosis.
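The hierarchical combination described in this abstract, a gating network over superclasses and one expert per superclass over its subclasses, can be sketched as follows. This is only an illustration of the general mixture-of-experts and attention-based MIL ideas, not the authors' network: the split into 2 superclasses with 3 subclasses each, the logits, and all function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(instance_features, attention_scores):
    """MIL pooling: attention-weighted mean of patch (instance) features."""
    weights = softmax(attention_scores)
    dim = len(instance_features[0])
    return [sum(w * f[d] for w, f in zip(weights, instance_features))
            for d in range(dim)]

def moe_class_probs(gate_logits, expert_logits):
    """P(class) = P(superclass) * P(class | superclass), concatenated over experts."""
    gate = softmax(gate_logits)
    probs = []
    for g, logits in zip(gate, expert_logits):
        probs.extend(g * p for p in softmax(logits))
    return probs

# Hypothetical example: 2 superclasses, each expert covering 3 subclasses.
gate_logits = [2.0, 0.0]                               # e.g. derived from FCM data
expert_logits = [[1.0, 0.0, 0.0], [0.0, 0.0, 3.0]]     # e.g. derived from pooled WSI features
probs = moe_class_probs(gate_logits, expert_logits)
pooled = attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(probs, pooled)
```

Because each expert's output is scaled by its gate probability, the six class probabilities sum to 1, and the gate and attention weights can be inspected separately, which is in the spirit of the visualization the abstract mentions.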
Title: Multimodal Gated Mixture of Experts Using Whole Slide Image and Flow Cytometry for Multiple Instance Learning Classification of Lymphoma
Noriaki Hashimoto , Hiroyuki Hanada , Hiroaki Miyoshi , Miharu Nagaishi , Kensaku Sato , Hidekata Hontani , Koichi Ohshima , Ichiro Takeuchi
Journal of Pathology Informatics | Pub Date: 2023-12-29 | DOI: 10.1016/j.jpi.2023.100359
Pub Date: 2023-12-29 | DOI: 10.1016/j.jpi.2023.100360
Benoit Schmauch , Sarah S. Elsoukkary , Amika Moro , Roma Raj , Chase J. Wehrle , Kazunari Sasaki , Julien Calderaro , Patrick Sin-Chan , Federico Aucejo , Daniel E. Roberts
Hepatocellular carcinoma (HCC) is among the most common cancers worldwide, and tumor recurrence following liver resection or transplantation is one of the highest contributors to mortality in HCC patients after surgery. Using artificial intelligence (AI), we developed an interdisciplinary model to predict HCC recurrence and patient survival following surgery. We collected whole-slide H&E images, clinical variables, and follow-up data from 300 patients with HCC who underwent transplant and 169 patients who underwent resection at the Cleveland Clinic. A deep learning model was trained to predict recurrence-free survival (RFS) and disease-specific survival (DSS) from the H&E-stained slides. Repeated cross-validation splits were used to compute robust C-index estimates, and the results were compared to those obtained by fitting a Cox proportional hazard model using only clinical variables. While the deep learning model alone was predictive of recurrence and survival among patients in both cohorts, integrating the clinical and histologic models significantly increased the C-index in each cohort. In every subgroup analyzed, we found that a combined clinical and deep learning model better predicted post-surgical outcome in HCC patients compared to either approach independently.
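For readers unfamiliar with the C-index used in this abstract, the sketch below implements the standard Harrell concordance index for right-censored data in plain Python. It is the textbook definition rather than the authors' exact estimator or cross-validation procedure, and the follow-up times, event indicators, and risk scores are invented for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i had an observed event and a
    shorter follow-up time than subject j. It is concordant when the model
    gave subject i the higher risk; ties in risk count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical data: follow-up in months, 1 = event observed, 0 = censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so an increase in C-index after adding clinical variables indicates better ordering of patients by outcome.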
Title: Combining a deep learning model with clinical data better predicts hepatocellular carcinoma behavior following surgery (Journal of Pathology Informatics)
Pub Date: 2023-12-08 | DOI: 10.1016/j.jpi.2023.100356
Leonardo Barcellona , Lorenzo Nicolè , Rocco Cappellesso , Angelo Paolo Dei Tos , Stefano Ghidoni
The introduction of deep learning caused a significant breakthrough in digital pathology, thanks to its capability of mining hidden data patterns in digitised histological slides to resolve diagnostic tasks and extract prognostic and predictive information. However, the high performance achieved in classification tasks depends on the availability of large datasets, whose collection and preprocessing are still time-consuming processes. Therefore, strategies to make these steps more efficient are worth investigating. This work introduces SlideTiler, open-source software with a user-friendly graphical interface. SlideTiler can manage several image preprocessing phases through an intuitive workflow that does not require specific coding skills. The software was designed to provide direct access to virtual slides, allowing custom tiling of specific regions of interest drawn by the user, tile labelling, quality assessment, and direct export to dataset directories. To illustrate the functions and the scalability of SlideTiler, a deep learning-based classifier was implemented to classify 4 different tumour histotypes available in the TCGA repository. The results demonstrate the effectiveness of SlideTiler in facilitating data preprocessing and promoting accessibility to digitised pathology images for research purposes. Considering the increasing interest in deep learning applications of digital pathology, SlideTiler has a positive impact on this field. Moreover, SlideTiler has been conceived as a dynamic tool in constant evolution, and more updated and efficient versions will be released in the future.
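The tiling step described in this abstract can be pictured with a small sketch that splits a region of interest into fixed-size tiles. It does not use or reproduce SlideTiler's code or interface: the function name `tile_roi`, the restriction to a rectangular region, and the pixel coordinates are assumptions for illustration, and reading the actual pixels from a slide file is left out.

```python
def tile_roi(x0, y0, width, height, tile_size, stride=None):
    """Return (x, y, w, h) boxes for tiles that fit entirely inside a rectangular ROI.

    x0, y0 is the top-left corner of the ROI in slide pixel coordinates.
    A stride smaller than tile_size produces overlapping tiles.
    """
    stride = stride or tile_size
    tiles = []
    for y in range(y0, y0 + height - tile_size + 1, stride):
        for x in range(x0, x0 + width - tile_size + 1, stride):
            tiles.append((x, y, tile_size, tile_size))
    return tiles

# Hypothetical ROI of 1024 x 768 pixels, cut into 256-pixel tiles.
tiles = tile_roi(1000, 2000, width=1024, height=768, tile_size=256)
print(len(tiles), tiles[0], tiles[-1])
```

Each returned box could then be passed to a slide reader to extract the tile image and saved under a directory named after its label, which is the kind of export step the abstract describes.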
SlideTiler: A dataset creator software for boosting deep learning on histological whole slide images. Leonardo Barcellona, Lorenzo Nicolè, Rocco Cappellesso, Angelo Paolo Dei Tos, Stefano Ghidoni. Journal of Pathology Informatics, 2023-12-08. DOI: 10.1016/j.jpi.2023.100356
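The custom tiling of a user-drawn region of interest mentioned in the SlideTiler abstract can be illustrated with a minimal sketch. This is not SlideTiler's implementation: the function name `tile_roi`, the rectangular ROI, and the tile size and stride values are all hypothetical, and a real tool would also read pixel data from the slide file.

```python
from typing import Iterator, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def tile_roi(roi: Box, tile_size: int, stride: Optional[int] = None) -> Iterator[Box]:
    """Yield square tile boxes that fit entirely inside a rectangular ROI.

    Hypothetical helper for illustration only; partial tiles at the right
    and bottom edges are dropped rather than padded.
    """
    x0, y0, width, height = roi
    step = stride if stride is not None else tile_size
    if tile_size <= 0 or step <= 0:
        raise ValueError("tile_size and stride must be positive")
    for y in range(y0, y0 + height - tile_size + 1, step):
        for x in range(x0, x0 + width - tile_size + 1, step):
            yield (x, y, tile_size, tile_size)


# Assumed example: a 1024 x 768 pixel ROI whose top-left corner is at (1000, 2000).
tiles = list(tile_roi((1000, 2000, 1024, 768), tile_size=256))
print(len(tiles), tiles[0], tiles[-1])
```

Passing a `stride` smaller than `tile_size` produces overlapping tiles, which is one common way to increase the number of training samples drawn from a small region.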
Evaluation of parameters such as the tumor microenvironment (TME) and tumor budding (TB) is one of the most important steps in colorectal cancer (CRC) diagnosis and prognosis. In recent years, artificial intelligence (AI) has been successfully used to solve such problems. In this paper, we summarize the latest data on the use of AI to predict TME and TB in histological scans of patients with CRC. We performed a systematic literature search using 2 databases (Medline and Scopus) with the following search terms: ("tumor microenvironment" OR "tumor budding") AND ("colorectal cancer" OR CRC) AND ("artificial intelligence" OR "machine learning" OR "deep learning"). During the analysis, we extracted from the articles performance scores such as sensitivity, specificity, and accuracy of identifying TME and TB using AI. The systematic review showed that machine learning and deep learning can successfully predict these parameters. The highest accuracy values in TB and TME prediction were 97.7% and 97.3%, respectively. This review led us to the conclusion that AI platforms can already be used as diagnostic aids and second-opinion services, which will greatly facilitate the work of pathologists in detecting and estimating TB and TME. Key limitations of this systematic review were the heterogeneous use of performance metrics for machine learning models by different authors and the relatively small datasets used in some studies.
Artificial intelligence (AI) for tumor microenvironment (TME) and tumor budding (TB) identification in colorectal cancer (CRC) patients: A systematic review. Olga Andreevna Lobanova, Anastasia Olegovna Kolesnikova, Valeria Aleksandrovna Ponomareva, Ksenia Andreevna Vekhova, Anaida Lusparonovna Shaginyan, Alisa Borisovna Semenova, Dmitry Petrovich Nekhoroshkov, Svetlana Evgenievna Kochetkova, Natalia Valeryevna Kretova, Alexander Sergeevich Zanozin, Maria Alekseevna Peshkova, Natalia Borisovna Serezhnikova, Nikolay Vladimirovich Zharkov, Evgeniya Altarovna Kogan, Alexander Alekseevich Biryukov, Ekaterina Evgenievna Rudenko, Tatiana Alexandrovna Demura. Journal of Pathology Informatics, 2023-11-22. DOI: 10.1016/j.jpi.2023.100353
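The performance scores named in the review (sensitivity, specificity, and accuracy) all derive from the same four confusion-matrix counts, which is one reason heterogeneous reporting makes studies hard to compare. The sketch below uses the standard definitions; the class name `Confusion` and the example counts are hypothetical and are not taken from any reviewed study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts (hypothetical container for illustration)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    @property
    def sensitivity(self) -> float:
        # Fraction of truly positive cases that the model flags as positive.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Fraction of truly negative cases that the model flags as negative.
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        # Fraction of all cases that are classified correctly.
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)


# Made-up counts, chosen only to make the arithmetic easy to follow.
cm = Confusion(tp=90, fp=5, tn=95, fn=10)
print(cm.sensitivity, cm.specificity, cm.accuracy)
```

Reporting the four raw counts alongside any derived score lets readers recompute whichever metric another study used.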
Pub Date: 2023-11-22 DOI: 10.1016/j.jpi.2023.100354
Francesco Martino, Gennaro Ilardi, Silvia Varricchio, Daniela Russo, Rosa Maria Di Crescenzo, Stefania Staibano, Francesco Merolla
Anatomical pathology is undergoing its third revolution, transitioning from analogue to digital pathology and incorporating new artificial intelligence technologies into clinical practice. Aside from classification, detection, and segmentation models, predictive models are gaining traction since they can impact diagnostic processes and laboratory activity, lowering consumable usage and turnaround time. Our research aimed to create a deep-learning model to generate synthetic Ki-67 immunohistochemistry from Haematoxylin and Eosin (H&E) stained images. We used 175 oral squamous cell carcinoma (OSCC) cases from the archives of the Pathology Unit of the University Federico II to construct 4 Tissue Micro Arrays (TMAs) for training our model. We sectioned one slide from each TMA, first stained it with H&E and then re-stained it with anti-Ki-67 immunohistochemistry (IHC). In the digitised slides, cores were de-arrayed, and the matching cores of the 2 stains were aligned to construct a dataset to train a Pix2Pix algorithm to convert H&E images to IHC. Pathologists could recognise the synthetic images in only half of the cases in a specially designed likelihood test. Hence, our model produced realistic synthetic images. We next used QuPath to quantify IHC positivity, achieving remarkable levels of agreement between genuine and synthetic IHC.
Furthermore, a categorical analysis employing 3 Ki-67 positivity cut-offs (5%, 10%, and 15%) revealed high positive predictive values. Our model is a promising tool for collecting Ki-67 positivity information directly on H&E slides, reducing laboratory demand and improving patient management. It is also a valuable option for smaller laboratories to easily and quickly screen biopsy samples and prioritise them in a digital pathology workflow.
A deep learning model to predict Ki-67 positivity in oral squamous cell carcinoma. Francesco Martino, Gennaro Ilardi, Silvia Varricchio, Daniela Russo, Rosa Maria Di Crescenzo, Stefania Staibano, Francesco Merolla. Journal of Pathology Informatics, 2023-11-22. DOI: 10.1016/j.jpi.2023.100354
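The categorical analysis at the 3 Ki-67 cut-offs can be sketched as a positive predictive value (PPV) computation, treating the genuine IHC as the reference and the synthetic IHC as the prediction. The abstract does not state whether a core exactly at the cut-off counts as positive, so the `>=` comparison is an assumption, and the function name `ppv_at_cutoff` and the per-core percentages are invented for illustration.

```python
from typing import Sequence


def ppv_at_cutoff(real: Sequence[float], synthetic: Sequence[float], cutoff: float) -> float:
    """PPV of synthetic Ki-67 positivity against real IHC at one cut-off.

    A core counts as positive when its Ki-67 percentage is >= cutoff
    (assumed convention). Returns NaN when no core is predicted positive.
    """
    tp = sum(1 for r, s in zip(real, synthetic) if s >= cutoff and r >= cutoff)
    fp = sum(1 for r, s in zip(real, synthetic) if s >= cutoff and r < cutoff)
    return tp / (tp + fp) if (tp + fp) else float("nan")


# Hypothetical Ki-67 positivity (%) per core; not data from the study.
real_ihc = [2.0, 7.5, 12.0, 20.0, 4.0, 16.0]
synthetic_ihc = [3.0, 6.0, 9.0, 22.0, 6.5, 14.0]

ppv = {cutoff: ppv_at_cutoff(real_ihc, synthetic_ihc, cutoff) for cutoff in (5.0, 10.0, 15.0)}
print(ppv)
```

A high PPV at a given cut-off means that cores called positive on the synthetic image were usually also positive on the genuine stain; it says nothing about positive cores that the synthetic image missed.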
Pub Date: 2023-11-21 DOI: 10.1016/j.jpi.2023.100352
Rahul Rajendran, Rachel C. Beck, Morteza M. Waskasi, Brian D. Kelly, Daniel R. Bauer
As our understanding of the tumor microenvironment grows, the pathology field is increasingly utilizing multianalyte diagnostic assays to understand important characteristics of tumor growth. In clinical settings, brightfield chromogenic assays represent the gold standard and have earned significant trust as the first-line diagnostic method. However, conventional brightfield tests have been limited to low-order assays that are visually interrogated. We have developed a hybrid method of brightfield chromogenic multiplexing that overcomes these limitations and enables high-order multiplex assays. However, the compatibility of high-order brightfield multiplexed images with advanced analytical algorithms has not been extensively evaluated. In the present study, we address this gap by developing a novel 6-marker prostate cancer assay that targets diverse aspects of the tumor microenvironment: prostate-specific biomarkers (PSMA and p504s), immune biomarkers (CD8 and PD-L1), a prognostic biomarker (Ki-67), and an adjunctive diagnostic biomarker (basal cell cocktail). We applied the assay to 143 differentially graded prostate adenocarcinoma tissues. The tissues were then imaged on our spectroscopic multiplexing imaging platform and mined for proteomic and spatial features that were correlated with cancer presence and disease grade. Extracted features were used to train a UMAP model that differentiated healthy from cancerous tissue with an accuracy of 89% and identified clusters of cells based on cancer grade. For spatial analysis, cell-to-cell distances were calculated for all biomarkers, and differences between healthy and adenocarcinoma tissues were studied. We report that p504s-positive cells were at least 2× closer to cells expressing PD-L1, CD8, Ki-67, and basal cell markers in adenocarcinoma tissues relative to the healthy control tissues. These findings offer powerful insight into the fingerprint of the prostate tumor microenvironment and indicate that high-order chromogenic multiplexing is compatible with digital analysis. Thus, the presented chromogenic multiplexing system combines the clinical applicability of brightfield assays with the emerging diagnostic power of high-order multiplexing in a digital pathology-friendly format that is well suited for translational studies to better understand mechanisms of tumor development and growth.
Digital analysis of the prostate tumor microenvironment with high-order chromogenic multiplexing. Rahul Rajendran, Rachel C. Beck, Morteza M. Waskasi, Brian D. Kelly, Daniel R. Bauer. Journal of Pathology Informatics, 2023-11-21. DOI: 10.1016/j.jpi.2023.100352
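The spatial analysis in this abstract compares cell-to-cell distances between healthy and adenocarcinoma tissue. The abstract does not specify how distances were aggregated, so the sketch below assumes a mean nearest-neighbour distance from each p504s-positive cell to the closest cell of another marker. The function name `mean_nearest_distance` and all coordinates are hypothetical.

```python
import math
from statistics import mean
from typing import Sequence, Tuple

Point = Tuple[float, float]  # cell centroid (x, y), e.g. in micrometres


def mean_nearest_distance(sources: Sequence[Point], targets: Sequence[Point]) -> float:
    """Mean distance from each source cell to its nearest target cell.

    Brute-force O(n * m) search; adequate for a sketch, while a spatial
    index would be preferable for whole-slide cell counts.
    """
    return mean(min(math.dist(s, t) for t in targets) for s in sources)


# Made-up centroids: p504s-positive cells and PD-L1-positive cells.
p504s_cells = [(0.0, 0.0), (10.0, 0.0)]
pdl1_healthy = [(0.0, 8.0), (10.0, 6.0)]
pdl1_tumour = [(0.0, 3.0), (10.0, 4.0)]

d_healthy = mean_nearest_distance(p504s_cells, pdl1_healthy)
d_tumour = mean_nearest_distance(p504s_cells, pdl1_tumour)
print(d_healthy, d_tumour, d_healthy / d_tumour)
```

In this toy example the ratio of the healthy to the tumour distance is 2, which is the kind of fold difference the abstract describes as "at least 2× closer".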