Pub Date: 2025-01-01; DOI: 10.1016/j.jpi.2024.100413
Ryoichi Koga, Tatsuya Yokota, Koji Arihiro, Hidekata Hontani
We propose a method of attention induction to improve the attention mechanism in a whole slide image (WSI) classifier. Generally, only some regions in a WSI are useful for lesion classification, and the classifier must find and focus on those regions. Multiple instance learning and hierarchical representation learning are widely employed for WSI processing, and both use attention mechanisms to automatically find the useful regions before predicting the class. However, collecting a large number of WSIs is impractical, and when the attention mechanism is trained on only a small number of WSIs, the resulting attention often fails to focus on the useful regions. To improve the attention mechanism without increasing the number of training WSIs, we propose a method of attention induction for a hierarchical representation of a WSI that uses a pathologist's coarse annotations to guide attention toward the regions useful for lesion classification. Our experimental results demonstrate that the proposed method improves the attention mechanism, thereby enhancing WSI classification performance.
Title: Attention induction based on pathologist annotations for improving whole slide pathology image classifier
Journal: Journal of Pathology Informatics, vol. 16, Article 100413
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11750489/pdf/
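The idea of guiding attention with coarse annotations can be illustrated with a minimal sketch. The abstract does not give the actual loss, so the `induction_loss` term below (negative log of the attention mass that falls on annotated patches) is an assumed, hypothetical form, and the toy features and scores are made up for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, scores):
    """Attention pooling: weighted average of patch feature vectors."""
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights

def induction_loss(weights, annotated, eps=1e-12):
    """Hypothetical guidance term (not taken from the paper): negative log of
    the attention mass on patches inside the pathologist's coarse annotation."""
    mass = sum(w for w, a in zip(weights, annotated) if a)
    return -math.log(mass + eps)

# Toy slide with three patches; only the second lies in the annotated region.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
annotated = [False, True, False]

_, w_uniform = attention_pool(features, [0.0, 0.0, 0.0])
_, w_focused = attention_pool(features, [0.0, 3.0, 0.0])
loss_uniform = induction_loss(w_uniform, annotated)
loss_focused = induction_loss(w_focused, annotated)
print(loss_uniform, loss_focused)
```

In a training loop, a term of this kind would be added to the classification loss with some weighting factor, so that attention is pushed toward annotated regions without requiring more slides.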
Pub Date: 2025-01-01; DOI: 10.1016/j.jpi.2024.100408
Sana Ahuja, Sufian Zaheer
Pathology, a cornerstone of medical diagnostics and research, is undergoing a revolutionary transformation fueled by digital technology, molecular biology advancements, and big data analytics. Digital pathology converts conventional glass slides into high-resolution digital images, enhancing collaboration and efficiency among pathologists worldwide. Integrating artificial intelligence (AI) and machine learning (ML) algorithms with digital pathology improves diagnostic accuracy, particularly in complex diseases like cancer. Molecular pathology, facilitated by next-generation sequencing (NGS), provides comprehensive genomic, transcriptomic, and proteomic insights into disease mechanisms, guiding personalized therapies. Immunohistochemistry (IHC) plays a pivotal role in biomarker discovery, refining disease classification and prognostication. Precision medicine integrates pathology's molecular findings with individual genetic, environmental, and lifestyle factors to customize treatment strategies, optimizing patient outcomes. Telepathology extends diagnostic services to underserved areas through remote digital pathology. Pathomics leverages big data analytics to extract meaningful insights from pathology images, advancing our understanding of disease pathology and therapeutic targets. Virtual autopsies employ non-invasive imaging technologies to revolutionize forensic pathology. These innovations promise earlier diagnoses, tailored treatments, and enhanced patient care. Collaboration across disciplines is essential to fully realize the transformative potential of these advancements in medical practice and research.
Title: Advancements in pathology: Digital transformation, precision medicine, and beyond
Journal: Journal of Pathology Informatics, vol. 16, Article 100408
Pub Date: 2025-01-01; DOI: 10.1016/j.jpi.2024.100411
Victor Garcia, Emma Gardecki, Stephanie Jou, Xiaoxian Li, Kenneth R. Shroyer, Joel Saltz, Balazs Acs, Katherine Elfer, Jochen Lennerz, Roberto Salgado, Brandon D. Gallas
Objective
With the increasing energy surrounding the development of artificial intelligence and machine learning (AI/ML) models, the use of the same external validation dataset by various developers allows for a direct comparison of model performance. Through our High Throughput Truthing project, we are creating a validation dataset for AI/ML models trained in the assessment of stromal tumor-infiltrating lymphocytes (sTILs) in triple negative breast cancer (TNBC).
Materials and methods
We obtained clinical metadata for hematoxylin and eosin-stained glass slides and corresponding scanned whole slide images (WSIs) of TNBC core biopsies from two US academic medical centers. We selected regions of interest (ROIs) from the WSIs to target regions with various tissue morphologies and sTILs densities. Given the selected ROIs, we implemented a hierarchical rank-sort method for case prioritization.
Results
We received 122 glass slides and clinical metadata on 105 unique patients with TNBC. All received cases were female, and the mean age was 63.44 years. 60% of all cases were White patients, and 38.1% were Black or African American. After case prioritization, the skewness of the sTILs density distribution improved from 0.60 to 0.46 with a corresponding increase in the entropy of the sTILs density bins from 1.20 to 1.24. We retained cases with less prevalent metadata elements.
Conclusion
This method allows us to prioritize underrepresented subgroups based on important clinical factors. In this manuscript, we discuss how we sourced the clinical metadata, selected ROIs, and developed our approach to prioritizing cases for inclusion in our pivotal study.
Title: Prioritizing cases from a multi-institutional cohort for a dataset of pathologist annotations
Journal: Journal of Pathology Informatics, vol. 16, Article 100411
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11667696/pdf/
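The two summary statistics reported in the Results above (skewness of the sTILs density distribution and entropy of the density bins) can be computed as in the following sketch. The abstract does not state the bin edges, the logarithm base, or the skewness estimator, so the population skewness, natural-log entropy, bin edges, and toy densities used here are all assumptions for illustration.

```python
import math
from collections import Counter

def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def bin_entropy(values, edges):
    """Shannon entropy (in nats) of the share of values in each bin.
    `edges` are inclusive upper bounds of all bins except the last."""
    def bin_index(v):
        for i, edge in enumerate(edges):
            if v <= edge:
                return i
        return len(edges)
    counts = Counter(bin_index(v) for v in values)
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# Hypothetical sTILs densities (%) before and after dropping two
# over-represented low-density cases; edges are assumed, not from the paper.
edges = [10, 20, 40]
before = [5, 5, 5, 5, 10, 10, 20, 30, 50, 80]
after = [5, 5, 10, 10, 20, 30, 50, 80]

skew_before, skew_after = skewness(before), skewness(after)
entropy_before, entropy_after = bin_entropy(before, edges), bin_entropy(after, edges)
print(skew_before, skew_after, entropy_before, entropy_after)
```

Lower skewness and higher bin entropy both indicate a more even spread of cases across density levels, which is the direction of change the abstract reports.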
Pub Date: 2025-01-01; DOI: 10.1016/j.jpi.2024.100416
Mahdieh Shabanian, Zachary Taylor, Christopher Woods, Anas Bernieh, Jonathan Dillman, Lili He, Sarangarajan Ranganathan, Jennifer Picarsic, Elanchezhian Somasundaram
Background
Traditional liver fibrosis staging via percutaneous biopsy suffers from sampling bias and variable inter-pathologist agreement, highlighting the need for more objective techniques. Deep learning models for disease staging from medical images have shown potential to decrease diagnostic variability, with recent weakly supervised learning strategies showing promising results even with limited manual annotation.
Purpose
To study the clustering-constrained attention multiple instance learning (CLAM) approach for staging liver fibrosis on trichrome whole slide images (WSIs) of children and young adults.
Methods
This is an ethics board-approved retrospective study utilizing 217 trichrome WSIs from pediatric liver biopsies for model development and testing. Two pediatric pathologists scored the WSIs using two liver fibrosis staging systems, METAVIR and Ishak. Cases were then secondarily categorized as either high- or low-stage liver fibrosis and used for model development. The CLAM pipeline was used to develop binary classification models for histological liver fibrosis. Model performance was evaluated using area under the curve (AUC), accuracy, sensitivity, specificity, and Cohen's Kappa.
Results
The CLAM models showed strong diagnostic performance, with sensitivities up to 0.76 and AUCs up to 0.92 for distinguishing low- and high-stage fibrosis. The agreement between model predictions and average pathologist scores was moderate to substantial (Kappa: 0.57–0.69), whereas pathologist agreement on the METAVIR and Ishak scoring systems was only fair (Kappa: 0.39–0.46).
Conclusions
The CLAM pipeline showed promise in detecting features important for differentiating low- and high-stage fibrosis on trichrome WSIs, offering an objective method for liver fibrosis detection in children and young adults.
Title: Liver fibrosis classification on trichrome histology slides using weakly supervised learning in children and young adults
Journal: Journal of Pathology Informatics, vol. 16, Article 100416
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11760786/pdf/
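Cohen's Kappa, used above to compare model predictions with pathologist scores, corrects raw agreement for the agreement expected by chance. The sketch below implements the standard two-rater formula; the label lists are made-up toy data, not values from the study.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases.
    Undefined (division by zero) if chance agreement equals 1."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in labels) / (n * n)
    return (observed - expected) / (1 - expected)

# Toy example: ten biopsies staged as high or low by two raters.
rater_a = ["high"] * 5 + ["low"] * 5
rater_b = ["high"] * 4 + ["low"] * 5 + ["high"]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

Here the raters agree on 8 of 10 cases (0.8 observed) with 0.5 agreement expected by chance, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6.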
Pub Date: 2025-01-01; DOI: 10.1016/j.jpi.2024.100406
Sanghoon Kang, Jesus D. Penaloza Aponte, Omar Elashkar, Juan Francisco Morales, Nicholas Waddington, Damon G. Lamb, Huiwen Ju, Martha Campbell-Thompson, Sarah Kim
Human islets display a high degree of heterogeneity in terms of size, number, architecture, and endocrine cell-type compositions. An ever-increasing number of immunohistochemistry-stained whole slide images (WSIs) are available through the online pathology database of the Network for Pancreatic Organ donors with Diabetes (nPOD) program at the University of Florida (UF). We aimed to develop an enhanced machine learning-assisted WSI analysis workflow to utilize the nPOD resource for analysis of endocrine cell heterogeneity in the natural history of type 1 diabetes (T1D) in comparison to donors without diabetes. To maximize usability, the user-friendly open-source software QuPath was selected for the main interface. The WSI data were analyzed with two pre-trained machine learning models (i.e., Segment Anything Model (SAM) and QuPath's pixel classifier), using the UF high-performance-computing cluster, HiPerGator. SAM was used to define precise endocrine cell and cell grouping boundaries (with an average quality score of 0.91 per slide), and the artificial neural network-based pixel classifier was applied to segment areas of insulin- or glucagon-stained cytoplasmic regions within each endocrine cell. An additional script was developed to automatically count CD3+ cells inside and within 20 μm of each islet perimeter to quantify the number of islets with inflammation (i.e., CD3+ T-cell infiltration). Proof-of-concept analysis was performed to test the developed workflow in 12 subjects using 24 slides. This open-source machine learning-assisted workflow enables rapid and high throughput determinations of endocrine cells, whether as single cells or within groups, across hundreds of slides. It is expected that the use of this workflow will accelerate our understanding of endocrine cell and islet heterogeneity in the context of T1D endotypes and pathogenesis.
Title: Leveraging pre-trained machine learning models for islet quantification in type 1 diabetes
Journal: Journal of Pathology Informatics, vol. 16, Article 100406
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11665367/pdf/
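The CD3+ counting rule described above (cells inside an islet or within 20 μm of its perimeter) reduces to a point-in-polygon test plus a point-to-boundary distance. The following is a plain-Python sketch of that geometry, not the authors' QuPath script; it assumes coordinates are already in micrometres, and all function names and the toy square islet are hypothetical.

```python
import math

def point_in_polygon(pt, polygon):
    """Ray-casting test: True if pt lies strictly inside the polygon."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def distance_to_segment(pt, a, b):
    """Shortest distance from pt to the segment a-b."""
    px, py = pt
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_perimeter(pt, polygon):
    """Shortest distance from pt to any edge of the polygon."""
    n = len(polygon)
    return min(distance_to_segment(pt, polygon[i], polygon[(i + 1) % n])
               for i in range(n))

def count_nearby_cells(cells, islet, margin_um=20.0):
    """Count cell centroids inside the islet or within margin_um of its edge."""
    return sum(1 for c in cells
               if point_in_polygon(c, islet)
               or distance_to_perimeter(c, islet) <= margin_um)

# Toy islet: a 100 x 100 um square, with five hypothetical CD3+ centroids.
islet = [(0, 0), (100, 0), (100, 100), (0, 100)]
cells = [(50, 50), (110, 50), (130, 50), (-15, -15), (50, -20)]
n_cd3 = count_nearby_cells(cells, islet)
print(n_cd3)
```

An islet could then be flagged as inflamed when this count reaches a chosen threshold; the abstract does not state the threshold used.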
Classifying breast cancer molecular subtypes is crucial for tailoring treatment strategies. While immunohistochemistry (IHC) and gene expression profiling are standard methods for molecular subtyping, IHC can be subjective, and gene profiling is costly and not widely accessible in many regions. Previous approaches have highlighted the potential application of deep learning models on hematoxylin and eosin (H&E)-stained whole-slide images (WSIs) for molecular subtyping, but these efforts vary in their methods, datasets, and reported performance. In this work, we investigated whether H&E-stained WSIs alone could be leveraged to predict breast cancer molecular subtypes (luminal A, luminal B, HER2-enriched, and basal). We used 1433 WSIs of breast cancer in a two-step pipeline: first, classifying tumor and non-tumor tiles to use only the tumor regions for molecular subtyping; and second, employing a One-vs-Rest (OvR) strategy to train four binary OvR classifiers and aggregating their results using an eXtreme Gradient Boosting model. The pipeline was tested on 221 hold-out WSIs, achieving an F1 score of 0.95 for tumor vs non-tumor classification and a macro F1 score of 0.73 for molecular subtyping. Our findings suggest that, with further validation, supervised deep learning models could serve as supportive tools for molecular subtyping in breast cancer. Our code is made available to facilitate ongoing research and development.
Title: Deep learning-based classification of breast cancer molecular subtypes from H&E whole-slide images
Authors: Masoud Tafavvoghi, Anders Sildnes, Mehrdad Rakaee, Nikita Shvetsov, Lars Ailo Bongo, Lill-Tove Rasmussen Busund, Kajsa Møllersen
Pub Date: 2025-01-01; DOI: 10.1016/j.jpi.2024.100410
Journal: Journal of Pathology Informatics, vol. 16, Article 100410
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11667687/pdf/
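The macro F1 score reported above averages per-class F1 scores with equal weight, so rare subtypes count as much as common ones. The sketch below shows only that metric on made-up labels; it does not reproduce the paper's pipeline (tile classifier, OvR models, or the gradient boosting aggregator), and the label strings are illustrative assumptions.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores, F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy slide-level labels for eight hypothetical hold-out WSIs.
labels = ["LumA", "LumB", "HER2", "Basal"]
y_true = ["LumA", "LumA", "LumB", "LumB", "HER2", "HER2", "Basal", "Basal"]
y_pred = ["LumA", "LumB", "LumB", "LumB", "HER2", "HER2", "Basal", "LumA"]
score = macro_f1(y_true, y_pred, labels)
print(round(score, 3))
```

In this toy case the per-class F1 values are 0.5, 0.8, 1.0, and 2/3, giving a macro F1 of about 0.742.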
Introduction
The incidence of breast cancer has risen in Chile, along with the complexity of diagnosis. For accurate diagnosis, it is necessary to complement the morphology assessed with hematoxylin and eosin with additional techniques to evaluate specific tumor markers. Evaluating the impact on costs, time, and productivity of automated techniques integrated with digital pathology solutions is crucial.
Objectives
To estimate the impact on costs, time, and productivity of incorporating the automation of the HER2 in situ hybridization technique combined with integrative digital pathology (IDP) in breast cancer diagnosis in a Chilean public provider versus a manual technique.
Methods
This economic evaluation adopted a health economics multi-method approach. A decision model was developed to represent the current manual fluorescence in situ hybridization (FISH) scenario versus an automated dual in situ hybridization (DISH) plus IDP in breast cancer diagnosis. Business process management (BPM) methodology was applied for capturing working time and latencies, in combination with a time-driven activity-based costing (TDABC) methodology for estimating direct, total, and average cost (2023 USD) for both scenarios for the following vectors: Human resources, supplies, and equipment, sorted by pre-analytical, analytical, and post-analytical phases. Indirect costs (2023 USD) were also retrieved. Both BPM and TDABC served to estimate labor productivity.
Results
In the baseline scenario based on manual FISH, the turnaround time (TAT) was estimated at 1259 min, at an average total cost of $265.67, considering direct and indirect costs for all phases. An average of 20.5 FISH reports were submitted per pathologist monthly during the baseline. The automated DISH plus IDP scenario consumed 203 min per biopsy, at an average total cost of $231.08, considering direct and indirect costs for all phases; it also showed an average of 22.8 submitted reports per pathologist monthly. This represents a decrease of 13.02% in average total costs, an 83.86% decrease in TAT, and an average labor productivity increase of 11.29%.
Conclusions
The incorporation of automated DISH plus IDP in the pathology department of this public provider has resulted in reductions in the time required to perform the in situ hybridization technique, a decrease in total costs, and increased productivity. Particular attention should be given to adopting new technologies to accelerate processing times and workflow.
Title: Economic evaluation: Impact on costs, time, and productivity of the incorporation of integrative digital pathology (IDP) in the anatomopathological analysis of breast cancer in a national reference public provider in Chile
Authors: Rony Lenz-Alcayaga, Daniela Paredes-Fernández, Fancy Gaete Verdejo, Luciano Páez-Pizarro, Karla Hernández-Sánchez
Pub Date: 2025-01-01; DOI: 10.1016/j.jpi.2024.100417
Journal: Journal of Pathology Informatics, vol. 16, Article 100417
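The relative changes quoted in the Results above can be rechecked from the rounded figures given in the abstract. This is a simple arithmetic sketch; the helper name `percent_change` is an illustrative choice. Recomputing from the rounded inputs gives roughly -13.02% for cost, -83.88% for turnaround time, and +11.22% for productivity, so small differences from the reported 83.86% and 11.29% would be expected if the published percentages were derived from unrounded values (an assumption, since the abstract does not say).

```python
def percent_change(baseline, new):
    """Relative change from baseline to new, in percent."""
    return (new - baseline) / baseline * 100.0

# Figures as stated in the abstract (manual FISH vs automated DISH plus IDP).
cost_change = percent_change(265.67, 231.08)       # average total cost, 2023 USD
tat_change = percent_change(1259, 203)             # turnaround time, minutes
productivity_change = percent_change(20.5, 22.8)   # reports per pathologist per month

print(f"cost: {cost_change:.2f}%")
print(f"TAT: {tat_change:.2f}%")
print(f"productivity: {productivity_change:.2f}%")
```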
Pub Date : 2025-01-01 DOI: 10.1016/j.jpi.2024.100414
Ryan Erik Landvater, Ulysses Balis
Digital pathology is a tool of rapidly evolving importance within the discipline of pathology. Whole slide imaging promises numerous advantages; however, adoption is limited by challenges in ease of use and speed of high-quality image rendering relative to the simplicity and visual quality of glass slides. Herein, we introduce Iris, a new high-performance digital pathology rendering system. Specifically, we outline and detail the performance metrics of Iris Core, the core rendering engine technology. Iris Core comprises machine code modules written from the ground up in C++ and using Vulkan, a low-level, low-overhead, cross-platform graphics processing unit (GPU) application programming interface, together with our novel rapid tile buffering algorithms. We provide a detailed explanation of Iris Core's system architecture, including the stateless isolation of core processes, interprocess communication paradigms, and explicit synchronization paradigms that provide powerful control over the GPU. Iris Core achieves slide rendering at the sustained maximum frame rate on all tested platforms (120 FPS) and buffers an entire new slide field of view, without overlapping pixels, in 10 ms, with enhanced detail in 30 ms. Further, it is able to buffer and compute high-fidelity reduction-enhancements for viewing low-power cytology with increased visual quality at a rate of 100–160 μs per slide tile, and with a cumulative median buffering rate of 1.36 GB of decompressed image data per second. This buffering rate allows for an entirely new field of view to be fully buffered and rendered in less than a single monitor refresh on a standard display, and high-detail features within 2–3 monitor refresh frames. These metrics far exceed previously published specifications, by more than an order of magnitude in some contexts. The system shows no slowing under high use loads, but rather increases performance due to GPU cache control mechanisms, and is “future-proof” due to near-unlimited parallel scalability.
Title: Iris: A Next Generation Digital Pathology Rendering Engine
Journal of Pathology Informatics, vol. 16, Article 100414 (2025). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11742306/pdf/
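The Iris timing claims can be related to display refresh intervals with a little arithmetic. The sketch below is not part of Iris. It assumes that a "standard display" refreshes at 60 Hz, which the abstract does not state, and the function names are illustrative.

```python
import math


def refresh_interval_ms(refresh_hz: float) -> float:
    """Time between two monitor refreshes, in milliseconds."""
    return 1000.0 / refresh_hz


def refreshes_spanned(duration_ms: float, refresh_hz: float) -> int:
    """Number of refresh intervals needed to cover a task of the given duration."""
    return math.ceil(duration_ms / refresh_interval_ms(refresh_hz))


def tiles_per_second(us_per_tile: float) -> float:
    """Tile throughput implied by a per-tile processing time in microseconds."""
    return 1_000_000.0 / us_per_tile


STANDARD_DISPLAY_HZ = 60.0   # assumption: the abstract does not give a refresh rate
FOV_BUFFER_MS = 10.0         # full field-of-view buffering time from the abstract
ENHANCED_DETAIL_MS = 30.0    # enhanced-detail buffering time from the abstract

print(f"Refresh interval at 60 Hz: {refresh_interval_ms(STANDARD_DISPLAY_HZ):.2f} ms")
print(f"Refreshes for field of view: {refreshes_spanned(FOV_BUFFER_MS, STANDARD_DISPLAY_HZ)}")
print(f"Refreshes for enhanced detail: {refreshes_spanned(ENHANCED_DETAIL_MS, STANDARD_DISPLAY_HZ)}")
print(f"Tiles per second: {tiles_per_second(160):.0f} to {tiles_per_second(100):.0f}")
```

Under the 60 Hz assumption, 10 ms fits inside one refresh interval of roughly 16.7 ms, and 30 ms spans two intervals. At 75 Hz the same 30 ms spans three, which is consistent with the "2–3 monitor refresh frames" wording.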
Pub Date : 2025-01-01 DOI: 10.1016/j.jpi.2024.100412
William Gordon, Maria Aguad, Layne Ainsworth, Samuel Aronson, Jane Baronas, Edward Comeau, Rory De La Paz, Justin B.L. Halls, Vincent T. Ho, Michael Oates, Adam Landman, Wen Lu, Shawn N. Murphy, Fei Wang, Indira Guleria, Sean R. Stowell, Melissa Y. Yeung, Edgar L. Milford, Richard M. Kaufman, William J. Lane
Objective
Thrombocytopenia is a common complication of hematopoietic stem-cell transplantation (HSCT) and is typically managed with platelet transfusions, though many patients become immune-refractory to these transfusions over time. We built and evaluated an electronic health record (EHR)-integrated, standards-based application that enables blood-bank clinicians to match platelet inventory with patients using data not previously available at the point of care, such as human leukocyte antigen (HLA) data for donors and recipients.
Materials and methods
The web-based application launches either embedded in the EHR or as a standalone application. It coalesces disparate data streams into a unified view, including platelet count, HLA data, demographics, and real-time inventory. We examined application usage over time and developed a multivariable logistic regression model to estimate odds ratios for a patient undergoing HSCT having a complicated thrombocytopenia course, with several model covariates including pre- versus post-deployment of the application.
Results
Usage of the application has been consistent since launch, with a slight dip during the first COVID wave. Our model, which included 376 patients in the final analysis, did not demonstrate significantly decreased odds of a complicated thrombocytopenia course after application deployment compared with before deployment.
Discussion
We built an EHR-integrated application to improve platelet transfusion processes. Although our model did not demonstrate decreased odds of a patient having a complicated thrombocytopenia course, there are other workflow and clinical benefits that warrant future evaluation.
Conclusion
A web-based application was built and integrated into our EHR system and is now part of the standard operating procedures of our blood bank.
Title: A standards-based application for improving platelet transfusion workflow
Journal of Pathology Informatics, vol. 16, Article 100412 (2025). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11721207/pdf/
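To make the logistic regression result concrete: a fitted coefficient for the pre-/post-deployment covariate converts to an odds ratio by exponentiation, and the effect is conventionally called significant only when the confidence interval excludes 1. The sketch below shows that conversion. The coefficient and standard error are hypothetical placeholders chosen for illustration, not values from the study, and the function names are assumptions.

```python
import math


def odds_ratio_with_ci(beta: float, std_err: float, z: float = 1.96):
    """Convert a logistic regression coefficient to an odds ratio with a Wald CI.

    beta is a log-odds coefficient; z = 1.96 gives an approximate 95% interval.
    """
    return (
        math.exp(beta),
        math.exp(beta - z * std_err),
        math.exp(beta + z * std_err),
    )


def excludes_one(ci_low: float, ci_high: float) -> bool:
    """True when the interval does not contain 1, i.e. the usual significance criterion."""
    return not (ci_low <= 1.0 <= ci_high)


# HYPOTHETICAL values for the post-deployment covariate (not from the paper).
example_beta = -0.10
example_std_err = 0.25

odds_ratio, ci_low, ci_high = odds_ratio_with_ci(example_beta, example_std_err)
print(f"OR = {odds_ratio:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
print("Significant:", excludes_one(ci_low, ci_high))
```

In this made-up case the odds ratio is below 1 but the interval spans 1, which is the general shape of a finding reported as "did not demonstrate significantly decreased odds".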
Pub Date : 2025-01-01 DOI: 10.1016/j.jpi.2024.100409
Abdulkadir Albayrak, Yao Xiao, Piyush Mukherjee, Sarah S. Barnett, Cherisse A. Marcou, Steven N. Hart
With the increasing utilization of exome and genome sequencing in clinical and research genetics, accurate and automated extraction of human phenotype ontology (HPO) terms from clinical texts has become imperative. Traditional methods for HPO term extraction, such as PhenoTagger, often face limitations in coverage and precision. In this study, we propose a novel approach that leverages large language models (LLMs) to generate synthetic sentences with clinical context, which are then semantically encoded into vector embeddings. These embeddings are linked to HPO terms, creating a robust knowledge base that facilitates precise information retrieval. Our method circumvents the known issue of LLM hallucinations by storing and querying these embeddings within a true database, ensuring accurate context matching without the need for a predictive model. We evaluated the performance of three different embedding models, all of which demonstrated substantial improvements over PhenoTagger. The top recall (sensitivity), precision (positive predictive value, PPV), and F1 were 0.64, 0.64, and 0.64, respectively, which were 31%, 10%, and 21% better than PhenoTagger. Furthermore, optimal performance was achieved when we combined the best-performing embedding model with PhenoTagger (the Fused model), resulting in recall (sensitivity), precision (PPV), and F1 values of 0.7, 0.7, and 0.7, respectively, each 10% better than the best embedding model. Our findings underscore the potential of this integrated approach to enhance the precision and reliability of HPO term extraction, offering a scalable and effective solution for biomedical data annotation.
Title: Enhancing human phenotype ontology term extraction through synthetic case reports and embedding-based retrieval: A novel approach for improved biomedical data annotation
Journal of Pathology Informatics, vol. 16, Article 100409 (2025). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11667693/pdf/
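The retrieval idea in this abstract, stored embeddings linked to HPO terms and queried by similarity rather than generated by a model, can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline. The three-dimensional vectors are toy values standing in for real sentence embeddings, and the `min_similarity` threshold and all function names are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy knowledge base: (HPO ID, label, embedding). The vectors are made up;
# a real system would store embeddings of synthetic clinical sentences.
KNOWLEDGE_BASE = [
    ("HP:0001250", "Seizure", [0.9, 0.1, 0.0]),
    ("HP:0000252", "Microcephaly", [0.1, 0.9, 0.1]),
    ("HP:0001263", "Global developmental delay", [0.0, 0.2, 0.9]),
]


def retrieve(query_vec, knowledge_base, min_similarity=0.8):
    """Return (score, hpo_id, label) for the closest entry, or None if nothing is close."""
    scored = [
        (cosine_similarity(query_vec, vec), hpo_id, label)
        for hpo_id, label, vec in knowledge_base
    ]
    best = max(scored)
    return best if best[0] >= min_similarity else None


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


print(retrieve([0.8, 0.2, 0.1], KNOWLEDGE_BASE))   # close to the "Seizure" vector
print(retrieve([1.0, 1.0, 1.0], KNOWLEDGE_BASE))   # ambiguous query, nothing returned
print(round(f1_score(0.64, 0.64), 2))
```

Because a lookup can only return terms that are already stored, an ambiguous query yields no match instead of an invented term. The `f1_score` helper also shows why equal precision and recall of 0.64 give an F1 of 0.64.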