Journal of Pathology Informatics: Latest Articles

ViCE: An automated and quantitative program to assess intestinal tissue morphology
Q2 Medicine Pub Date: 2024-09-13 DOI: 10.1016/j.jpi.2024.100397
Jeffrey La, Krishnan Raghunathan, Jocelyn A. Silvester, Jay R. Thiagarajah

Background and objective

The intestinal surface is architecturally complex, with finger-like projections called villi and glandular structures called crypts. The villus height-to-crypt depth ratio (Vh:Cd) is used to quantitatively assess disease severity and response to therapy in intestinal enteropathies such as celiac disease, and it is currently quantified manually. Given the time required, manual Vh:Cd measurements have largely been limited to clinical trials and are not widely used in clinical practice. We developed ViCE (Villus Crypt Evaluator), user-friendly software that automatically quantifies histological parameters in standard hematoxylin and eosin-stained intestinal biopsies.

Methods

ViCE is based on mathematical morphology operations and is scale- and staining-agnostic. It evaluates tissue orientation, identifies geometrical structures, and outputs key tissue measurements.
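
The abstract does not say which morphological operations ViCE uses, so the following is only a generic illustration of what "mathematical morphology" means on a binary tissue mask. It sketches erosion, dilation, and their composition (opening) with a 3x3 square structuring element. The function names, the structuring element, and the toy mask are assumptions made for this example and are not taken from ViCE.

```python
from typing import List

Grid = List[List[int]]  # binary mask: 1 = tissue, 0 = background

# 3x3 square structuring element, expressed as pixel offsets (an assumption).
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def erode(img: Grid) -> Grid:
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground.

    Pixels outside the image are treated as background.
    """
    h, w = len(img), len(img[0])
    return [
        [
            int(all(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy, dx in OFFSETS
            ))
            for x in range(w)
        ]
        for y in range(h)
    ]


def dilate(img: Grid) -> Grid:
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    h, w = len(img), len(img[0])
    return [
        [
            int(any(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy, dx in OFFSETS
            ))
            for x in range(w)
        ]
        for y in range(h)
    ]


def opening(img: Grid) -> Grid:
    """Erosion followed by dilation: removes specks smaller than the element."""
    return dilate(erode(img))


# Toy mask: a 3x3 block of tissue plus one isolated noise pixel (top right).
mask = [
    [0, 0, 0, 0, 0, 1],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
cleaned = opening(mask)
```

In this toy case the opening removes the isolated pixel and leaves the 3x3 block unchanged. Because such operations act on shape rather than colour, they are one plausible reason a morphology-based method can be insensitive to staining differences.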

Results

The output measurements of Vh:Cd are concordant with manual quantifications across multiple datasets.

Conclusions

The mathematical morphology approach underlying ViCE is robust, reproducible, and easily adaptable to measuring morphological features in other tissues.
Journal of Pathology Informatics, Volume 15, Article 100397
Citations: 0

Deep feature batch correction using ComBat for machine learning applications in computational pathology
Q2 Medicine Pub Date: 2024-09-12 DOI: 10.1016/j.jpi.2024.100396
Pierre Murchan, Pilib Ó Broin, Anne-Marie Baird, Orla Sheils, Stephen P Finn

Background

Developing artificial intelligence (AI) models for digital pathology requires large datasets from multiple sources. However, without careful implementation, AI models risk learning confounding site-specific features in datasets instead of clinically relevant information, leading to overestimated performance, poor generalizability to real-world data, and potential misdiagnosis.

Methods

Whole-slide images (WSIs) from The Cancer Genome Atlas (TCGA) colon adenocarcinoma (COAD) and stomach adenocarcinoma datasets were selected for inclusion in this study. Patch embeddings were obtained using three feature extraction models, followed by ComBat harmonization. Attention-based multiple instance learning models were trained to predict tissue-source site (TSS), as well as clinical and genetic attributes, using raw, Macenko-normalized, and ComBat-harmonized patch embeddings.
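
To make the idea of feature-level harmonization concrete, here is a deliberately simplified location-scale correction. It is not ComBat itself: ComBat additionally applies empirical Bayes shrinkage to the per-batch estimates and can preserve named covariates, and none of that is reproduced here. The function name `harmonize` and the toy data are assumptions made for this sketch. For every feature, each site's values are standardized with that site's mean and standard deviation, then mapped back to the grand mean and the pooled within-site standard deviation.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def harmonize(features: Sequence[Sequence[float]],
              sites: Sequence[str]) -> List[List[float]]:
    """Simplified per-feature location-scale batch correction (not full ComBat)."""
    rows_by_site: Dict[str, List[int]] = defaultdict(list)
    for i, site in enumerate(sites):
        rows_by_site[site].append(i)

    out = [list(row) for row in features]
    for j in range(len(features[0])):
        col = [row[j] for row in features]
        grand_mean = mean(col)

        # Per-site mean and (population) standard deviation for this feature.
        stats = {
            site: (mean(col[i] for i in idx), pstdev([col[i] for i in idx]))
            for site, idx in rows_by_site.items()
        }
        # Pooled within-site variance, weighted by site size.
        pooled_var = sum(
            len(rows_by_site[site]) * sd ** 2 for site, (_, sd) in stats.items()
        ) / len(col)
        pooled_sd = math.sqrt(pooled_var)

        for site, idx in rows_by_site.items():
            site_mean, site_sd = stats[site]
            for i in idx:
                z = (col[i] - site_mean) / site_sd if site_sd > 0 else 0.0
                out[i][j] = grand_mean + z * pooled_sd
    return out


# Toy example: one embedding dimension, two sites offset by +10.
embeddings = [[1.0], [2.0], [3.0], [11.0], [12.0], [13.0]]
site_labels = ["A", "A", "A", "B", "B", "B"]
corrected = harmonize(embeddings, site_labels)
```

After correction both toy sites share the same mean and spread, so the site offset can no longer be read off this feature. That is the intuition behind the study's check that tissue-source site becomes hard to predict after harmonization.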

Results

TSS prediction achieved high accuracy (AUROC > 0.95) with all three feature extraction models. ComBat harmonization significantly reduced the AUROC for TSS prediction, with mean AUROCs dropping to approximately 0.5 for most models, indicating successful mitigation of batch effects (e.g., CCL-ResNet50 in TCGA-COAD: Pre-ComBat AUROC = 0.960, Post-ComBat AUROC = 0.506, p < 0.001). Clinical attributes associated with TSS, such as race and treatment response, showed decreased predictability post-harmonization. Notably, the prediction of genetic features such as MSI status remained robust after harmonization (e.g., MSI in TCGA-COAD: Pre-ComBat AUROC = 0.667, Post-ComBat AUROC = 0.669, p = 0.952), indicating the preservation of true histological signals.
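
The reading of these numbers rests on what AUROC measures: the probability that a randomly chosen positive case is scored above a randomly chosen negative one, so 0.5 corresponds to chance. A minimal sketch of that pairwise definition follows. The function name and toy scores are assumptions for illustration, and the study's own evaluation code (including how the multi-class site labels were handled) is not described in the abstract.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Binary AUROC from its pairwise definition; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# A model whose scores carry some signal about the label.
informative = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# A model that gives every case the same score: no site information left.
uninformative = auroc([0, 0, 1, 1], [0.5, 0.5, 0.5, 0.5])
```

The quadratic pairwise loop is fine for a toy example; for real datasets a rank-based implementation from an established library would be the usual choice.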

Conclusion

ComBat harmonization of deep learning-derived histology features effectively reduces the risk of AI models learning confounding features in WSIs, ensuring more reliable performance estimates. This approach is promising for the integration of large-scale digital pathology datasets.
Journal of Pathology Informatics, Volume 15, Article 100396
Citations: 0

LVI-PathNet: Segmentation-classification pipeline for detection of lymphovascular invasion in whole slide images of lung adenocarcinoma
Q2 Medicine Pub Date: 2024-08-30 DOI: 10.1016/j.jpi.2024.100395
Anna Timakova, Vladislav Ananev, Alexey Fayzullin, Egor Zemnuhov, Egor Rumyantsev, Andrey Zharov, Nicolay Zharkov, Varvara Zotova, Elena Shchelokova, Tatiana Demura, Peter Timashev, Vladimir Makarov

Lymphovascular invasion (LVI) in lung cancer is a significant prognostic factor that influences treatment and outcomes, yet its reliable detection is challenging due to interobserver variability. This study aims to develop a deep learning model for LVI detection using whole slide images (WSIs) and evaluate its effectiveness within a pathologist's information system. Experienced pathologists annotated blood vessels and invading tumor cells in 162 WSIs of non-mucinous lung adenocarcinoma sourced from two external datasets and one internal dataset. Two models were trained to segment vessels and identify images with LVI features. The DeepLabV3+ model achieved an Intersection-over-Union of 0.8840 and an area under the receiver operating characteristic curve (AUC-ROC) of 0.9869 in vessel segmentation. For LVI classification, the ensemble model achieved an F1-score of 0.9683 and an AUC-ROC of 0.9987. The model demonstrated robustness and was unaffected by variations in staining and image quality. The pilot study showed that pathologists' evaluation time for LVI detection decreased by an average of 16.95%, and by 21.5% in “hard cases”. The model facilitated consistent diagnostic assessments, suggesting potential for broader applications in detecting pathological changes in blood vessels and other lung pathologies.
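
For readers less familiar with the two headline metrics, the sketch below shows how Intersection-over-Union is computed for a pair of binary masks and how the F1-score follows from true-positive, false-positive, and false-negative counts. The helper names, the empty-mask convention, and the toy inputs are assumptions for this example; they are not the study's evaluation code or data.

```python
from typing import Sequence


def iou(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Intersection-over-Union of two same-sized binary masks."""
    intersection = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += int(bool(p) and bool(t))
            union += int(bool(p) or bool(t))
    # Convention assumed here: two empty masks agree perfectly.
    return intersection / union if union else 1.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in terms of counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Toy vessel masks: one pixel overlaps, three pixels are covered in total.
predicted_mask = [[1, 1], [0, 0]]
annotated_mask = [[1, 0], [1, 0]]
mask_iou = iou(predicted_mask, annotated_mask)

# Toy classification counts: 8 true positives, 2 false positives, 2 misses.
toy_f1 = f1_score(tp=8, fp=2, fn=2)
```

IoU penalizes both over- and under-segmentation of the vessel region, while F1 ignores true negatives, which makes it a common choice when images with LVI are much rarer than images without it.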

Journal of Pathology Informatics, Volume 15, Article 100395
Citations: 0

Globalization of a telepathology network with artificial intelligence applications in Colombia: The GLORIA program study protocol
Q2 Medicine Pub Date: 2024-08-15 DOI: 10.1016/j.jpi.2024.100394
Andrés Mosquera-Zamudio, Marcela Gomez-Suarez, John Sprockel, Julian Camilo Riaño-Moreno, Emiel A.M. Janssen, Liron Pantanowitz, Rafael Parra-Medina

In Colombia, cancer is recognized as a high-cost pathology by the national government and the Colombian High-Cost Disease Fund. As of 2020, the situation is most critical for adult cancer patients, particularly those under public healthcare and residing in remote regions of the country. The longest lag time to diagnosis was observed for cervical cancer (79.13 days), followed by prostate cancer (77.30 days) and breast cancer (70.25 days). Timely and accurate histopathological reporting plays a vital role in the diagnosis of cancer. In recent years, digital pathology has been implemented globally as a technological tool in two main areas: telepathology (TP) and computational pathology. TP has been shown to improve rapid and timely diagnosis in anatomic pathology by facilitating interaction between general laboratories and specialized pathologists worldwide through information and telecommunication technologies. Computational pathology provides diagnostic and prognostic assistance based on histopathological patterns and on molecular and clinical information, aiding pathologists in making more accurate diagnoses. We present the study protocol of the GLORIA digital pathology network, a pioneering initiative and national grant-approved program that aims to design and pilot a Colombian digital pathology transformation focused on TP and computational pathology, in response to the general needs of pathology laboratories for diagnosing complex malignant tumors. The study protocol describes the design of a TP network to expand oncopathology services across all Colombian regions. It also describes an artificial intelligence proposal for lung cancer, one of Colombia's most prevalent cancers, and a freely accessible national histopathological image database to facilitate image analysis studies.

Journal of Pathology Informatics, Volume 15, Article 100394
Citations: 0

Non-diagnostic time in digital pathology: An empirical study over 10 years
Q2 Medicine Pub Date: 2024-08-05 DOI: 10.1016/j.jpi.2024.100393
Aleksandar Vodovnik
Journal of Pathology Informatics, Volume 15, Article 100393
Citations: 0

Engineered feature embeddings meet deep learning: A novel strategy to improve bone marrow cell classification and model transparency
Q2 Medicine Pub Date: 2024-07-03 DOI: 10.1016/j.jpi.2024.100390
Jonathan Tarquino, Jhonathan Rodríguez, David Becerra, Lucia Roa-Peña, Eduardo Romero

Cytomorphological evaluation of bone marrow cells is the initial step in diagnosing different hematological diseases. This assessment is still performed manually by trained specialists, which can create a bottleneck in the clinical process. Deep learning algorithms are a promising approach to automating bone marrow cell evaluation. These artificial intelligence models have focused on a limited number of cell subtypes, mainly associated with a particular disease, and are frequently presented as black boxes. The strategy introduced here presents an engineered feature representation, the region-attention embedding, which improves deep learning classification performance for cytomorphology with 21 bone marrow cell subtypes. This embedding is built on a specific organization of cytology features within a square matrix, distributing them according to pre-segmented cell regions, i.e., cytoplasm, nucleus, and whole cell. This novel cell image representation, which aims to preserve spatial/regional relations, is used as input to the network. Combining the region-attention embedding with deep learning networks (Xception and ResNet50) provides local relevance associated with image regions, adding interpretable information to the prediction. Additionally, this approach is evaluated on a public database with the largest number of cell subtypes (21) using a thorough evaluation scheme: three iterations of a 3-fold cross-validation performed on 80% of the images (n = 89,484), and a testing process on an unseen set composed of the remaining 20% of the images (n = 22,371). This evaluation demonstrates that the introduced strategy outperforms previously published approaches on an equivalent validation set, with an F1-score of 0.82, and presents competitive results on the unseen data partition, with an F1-score of 0.56.
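
The evaluation scheme described above (a 20% held-out test set, then three repetitions of 3-fold cross-validation on the remaining 80%) can be sketched as index bookkeeping. The function name, the seed, and the use of plain random shuffling are assumptions for this example. In particular, the abstract does not say whether the splits were stratified by cell subtype, and this sketch does not stratify.

```python
import random
from typing import List, Tuple

Fold = Tuple[List[int], List[int]]  # (training indices, validation indices)


def repeated_kfold_with_holdout(
    n_items: int,
    test_fraction: float = 0.2,
    n_folds: int = 3,
    n_repeats: int = 3,
    seed: int = 0,
) -> Tuple[List[int], List[List[Fold]]]:
    """Hold out a test set once, then build repeated k-fold splits on the rest."""
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)

    n_test = round(n_items * test_fraction)
    test, development = indices[:n_test], indices[n_test:]

    repeats: List[List[Fold]] = []
    for _ in range(n_repeats):
        shuffled = development[:]
        rng.shuffle(shuffled)  # a fresh partition for every repetition
        folds = [shuffled[k::n_folds] for k in range(n_folds)]
        repeat: List[Fold] = []
        for k in range(n_folds):
            train = [i for m, fold in enumerate(folds) if m != k for i in fold]
            repeat.append((train, folds[k]))
        repeats.append(repeat)
    return test, repeats


# 89,484 + 22,371 images, matching the counts reported in the abstract.
test_idx, cv_repeats = repeated_kfold_with_holdout(89_484 + 22_371)
```

With these counts the held-out set has 22,371 images and every validation fold has 29,828, so each of the nine cross-validation runs trains on 59,656 images. The test indices are never touched during cross-validation, which is what makes the reported gap between 0.82 and 0.56 informative.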

Journal of Pathology Informatics, Volume 15, Article 100390
Citations: 0

Validation of AI-assisted ThinPrep® Pap test screening using the Genius™ Digital Diagnostics System
Q2 Medicine Pub Date: 2024-07-02 DOI: 10.1016/j.jpi.2024.100391
Richard L. Cantley, Xin Jing, Brian Smola, Wei Hao, Sarah Harrington, Liron Pantanowitz

Advances in whole-slide imaging and artificial intelligence present opportunities for improvement in Pap test screening. To date, few studies have been published on how best to validate newer AI-based digital systems for screening Pap tests in clinical practice. In this study, we validated the Genius™ Digital Diagnostics System (Hologic) by comparing its performance to traditional manual light microscopic diagnosis of ThinPrep® Pap test slides. A total of 319 ThinPrep® Pap test cases were prospectively assessed by six cytologists and three cytopathologists by light microscopy and digital evaluation, and the results were compared to the original ground truth Pap test diagnosis. Concordance with the original diagnosis differed significantly between digital and manual light microscopy review across: (i) exact Bethesda System diagnostic categories (62.1% vs 55.8%, respectively, p = 0.014), (ii) condensed diagnostic categories (76.8% vs 71.5%, respectively, p = 0.027), and (iii) condensed diagnoses based on clinical management (71.5% vs 65.2%, respectively, p = 0.017). Time to evaluate cases was shorter for digital (M = 3.2 min, SD = 2.2) compared to manual (M = 5.9 min, SD = 3.1) review (t(352) = 19.44, p < 0.001, Cohen's d = 1.035, 95% CI [0.905, 1.164]). Our validation study demonstrated that AI-based digital Pap test evaluation had improved diagnostic accuracy and reduced screening time compared to light microscopy, and participants also reported a positive experience using this system.
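
The reported effect size can be cross-checked from the test statistic alone. The sketch below assumes the timing comparison was a paired t-test, so that the number of paired observations is the degrees of freedom plus one (353) and the paired-samples effect size is d = t / sqrt(n). The abstract does not state the test design, so this is an assumption; it is supported only by the fact that the arithmetic reproduces the reported value.

```python
import math


def paired_cohens_d(t_statistic: float, degrees_of_freedom: int) -> float:
    """Paired-samples Cohen's d recovered from a paired t statistic.

    Assumes n = df + 1 paired observations, so d = t / sqrt(n).
    """
    n_pairs = degrees_of_freedom + 1
    return t_statistic / math.sqrt(n_pairs)


# Values taken from the abstract: t(352) = 19.44.
d = paired_cohens_d(19.44, 352)
print(f"Cohen's d = {d:.3f}")
```

Under this assumption the result rounds to the reported 1.035. A pooled-standard-deviation estimate from the two reported SDs would give a slightly different number, which is why the paired reading seems the more likely one.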

{"title":"Validation of AI-assisted ThinPrep® Pap test screening using the GeniusTM Digital Diagnostics System","authors":"Richard L. Cantley ,&nbsp;Xin Jing ,&nbsp;Brian Smola ,&nbsp;Wei Hao ,&nbsp;Sarah Harrington ,&nbsp;Liron Pantanowitz","doi":"10.1016/j.jpi.2024.100391","DOIUrl":"10.1016/j.jpi.2024.100391","url":null,"abstract":"<div><p>Advances in whole-slide imaging and artificial intelligence present opportunities for improvement in Pap test screening. To date, there have been limited studies published regarding how best to validate newer AI-based digital systems for screening Pap tests in clinical practice. In this study, we validated the Genius™ Digital Diagnostics System (Hologic) by comparing the performance to traditional manual light microscopic diagnosis of ThinPrep<strong>®</strong> Pap test slides. A total of 319 ThinPrep<strong>®</strong> Pap test cases were prospectively assessed by six cytologists and three cytopathologists by light microscopy and digital evaluation and the results compared to the original ground truth Pap test diagnosis. Concordance with the original diagnosis was significantly different by digital and manual light microscopy review when comparing across: (i) exact Bethesda System diagnostic categories (62.1% vs 55.8%, respectively, <em>p</em> = 0.014), (ii) condensed diagnostic categories (76.8% vs 71.5%, respectively, <em>p</em> = 0.027), and (iii) condensed diagnoses based on clinical management (71.5% vs 65.2%, respectively, <em>p</em> = 0.017). Time to evaluate cases was shorter for digital (M = 3.2 min, SD = 2.2) compared to manual (M = 5.9 min, SD = 3.1) review (t(352) = 19.44, <em>p</em> &lt; 0.001, Cohen's d = 1.035, 95% CI [0.905, 1.164]). 
Not only did our validation study demonstrate that AI-based digital Pap test evaluation had improved diagnostic accuracy and reduced screening time compared to light microscopy, but that participants reported a positive experience using this system.</p></div>","PeriodicalId":37769,"journal":{"name":"Journal of Pathology Informatics","volume":"15 ","pages":"Article 100391"},"PeriodicalIF":0.0,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2153353924000300/pdfft?md5=f678b76ba4ddf0bb5fbfba56b65df94c&pid=1-s2.0-S2153353924000300-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141639228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An explainable AI-based blood cell classification using optimized convolutional neural network
Q2 Medicine Pub Date : 2024-07-02 DOI: 10.1016/j.jpi.2024.100389
Oahidul Islam, Md Assaduzzaman, Md Zahid Hasan

White blood cells (WBCs) are a vital component of the immune system, and their efficient and precise classification is crucial for medical professionals to diagnose diseases accurately. This study presents an enhanced convolutional neural network (CNN) for detecting blood cells. Image pre-processing techniques such as padding, thresholding, erosion, dilation, and masking are used to reduce noise and enhance features. Performance is further improved by experimenting with various architectures and hyperparameters to optimize the proposed model. The proposed model is compared with three transfer learning models: Inception V3, MobileNetV2, and DenseNet201. The results indicate that the proposed model outperforms existing models, achieving a testing accuracy of 99.12%, precision of 99%, and F1-score of 99%. In addition, we used SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) to improve the interpretability of the proposed model, providing insight into how the model makes decisions. The model is further explained using Grad-CAM and Grad-CAM++, which are class-discriminative localization approaches, to improve trust and transparency. Grad-CAM++ performed slightly better than Grad-CAM in localizing the predicted area. Finally, the most efficient model has been integrated into an end-to-end (E2E) system, accessible through both web and Android platforms, for medical professionals to classify blood cells.
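
The pre-processing steps named in this abstract (padding, thresholding, erosion, dilation, masking) can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names, the 3x3 structuring element, the fixed threshold of 128, and the assumption that foreground pixels are brighter than background are all hypothetical choices made for illustration.

```python
import numpy as np

def pad_to_square(img, fill=0):
    """Pad a 2-D grayscale image with `fill` so that height equals width."""
    h, w = img.shape
    size = max(h, w)
    top, left = (size - h) // 2, (size - w) // 2
    out = np.full((size, size), fill, dtype=img.dtype)
    out[top:top + h, left:left + w] = img
    return out

def _neighborhood_stack(binary):
    """Stack the nine shifted views of a 3x3 neighborhood (border padded with False)."""
    p = np.pad(binary, 1, mode="constant", constant_values=False)
    h, w = binary.shape
    return np.stack([p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)])

def erode(binary):
    """Binary erosion: keep a pixel only if its whole 3x3 neighborhood is set."""
    return _neighborhood_stack(binary).all(axis=0)

def dilate(binary):
    """Binary dilation: set a pixel if any pixel in its 3x3 neighborhood is set."""
    return _neighborhood_stack(binary).any(axis=0)

def preprocess(img, threshold=128):
    """Pad, threshold, open (erode then dilate), and mask a grayscale image."""
    squared = pad_to_square(img)
    binary = squared >= threshold      # thresholding (assumes bright foreground)
    opened = dilate(erode(binary))     # opening removes isolated specks
    masked = np.where(opened, squared, 0)
    return masked, opened
```

Erosion followed by dilation (a morphological opening) removes structures smaller than the structuring element while restoring the extent of larger ones, which is one common way such operations reduce noise before a CNN sees the image.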

Citations: 0
Towards interactive AI-authoring with prototypical few-shot classifiers in histopathology
Q2 Medicine Pub Date : 2024-06-06 DOI: 10.1016/j.jpi.2024.100388
Petr Kuritcyn, Rosalie Kletzander, Sophia Eisenberg, Thomas Wittenberg, Volker Bruns, Katja Evert, Felix Keil, Paul K. Ziegler, Katrin Bankov, Peter Wild, Markus Eckstein, Arndt Hartmann, Carol I. Geppert, Michaela Benz

Many tasks in histopathology could benefit from the support of artificial intelligence (AI). Many examples have been shown in the literature, and the first commercial products with FDA or CE-IVDR clearance are available. However, two key challenges remain: (1) a scarcity of thoroughly annotated images, which reflects the laboriousness of annotation, and (2) the creation of robust models that can cope with the data heterogeneity in the field (domain generalization). In this work, we investigate how the combination of prototypical few-shot classification models and data augmentation can address both challenges. Based on annotated data sets that include multiple centers, multiple scanners, and two tumor entities, we examine the robustness and adaptability of few-shot classifiers in multiple scenarios. We demonstrate that, with domain-specific data augmentation, data from one scanner and one site are sufficient to train robust few-shot classification models. The models achieved classification performance of around 90% on a multiscanner and multicenter database, which is on par with the accuracy achieved on the primary single-center single-scanner data. Various convolutional neural network (CNN) architectures can be used for feature extraction in the few-shot model. A comparison of nine state-of-the-art architectures showed that EfficientNet B0 provides the best trade-off between accuracy and inference time. A prototypical few-shot model classifies directly from class prototypes derived from example images of each class. Therefore, we investigated the influence of prototypes originating from images from different scanners and evaluated their performance on the multiscanner database as well. Again, our few-shot model showed stable performance, with an average absolute deviation in accuracy of 1.8 percentage points compared to the primary prototypes. 
Finally, we examined the adaptability to a new tumor entity: classification of tissue sections containing urothelial carcinoma into normal, tumor, and necrotic regions. Only three annotations per subclass (e.g., muscle and adipose tissue are subclasses of normal tissue) were provided to adapt the few-shot model, which obtained an overall accuracy of 93.6%. These results demonstrate that prototypical few-shot classification is an ideal technology for realizing an interactive AI authoring system: it requires only a few annotations and can be adapted to new tasks without retraining the underlying feature extraction CNN, which would in turn require selecting hyper-parameters based on data science expertise. It can likewise be regarded as a guided annotation system. To this end, we realized a workflow and user interface that target non-technical users.
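
The idea of classifying from class prototypes can be sketched in a few lines. The sketch below assumes that a prototype is the mean of a class's example embeddings and that a query is assigned to the prototype with the smallest squared Euclidean distance, as in the standard prototypical-network formulation; the abstract does not state which aggregation or distance the authors use. The function names are hypothetical, and the embeddings stand in for the output of a frozen feature extractor such as a CNN.

```python
import numpy as np

def build_prototypes(embeddings, labels):
    """Average the example embeddings of each class into one prototype per class."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    classes = sorted(set(labels.tolist()))
    prototypes = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, prototypes

def classify(queries, classes, prototypes):
    """Assign each query embedding to the class of its nearest prototype."""
    queries = np.asarray(queries, dtype=float)
    # Squared Euclidean distance between every query and every prototype.
    dists = ((queries[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return [classes[i] for i in dists.argmin(axis=1)]
```

Under these assumptions, adapting to a new task only means calling `build_prototypes` again on a handful of newly annotated examples; the feature extractor that produced the embeddings is left untouched, which is the property the abstract highlights.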

Citations: 0
Masked pre-training of transformers for histology image analysis
Q2 Medicine Pub Date : 2024-05-31 DOI: 10.1016/j.jpi.2024.100386
Shuai Jiang, Liesbeth Hondelink, Arief A. Suriawinata, Saeed Hassanpour

In digital pathology, whole-slide images (WSIs) are widely used for applications such as cancer diagnosis and prognosis prediction. Vision transformer (ViT) models have recently emerged as a promising method for encoding large regions of WSIs while preserving spatial relationships among patches. However, due to the large number of model parameters and limited labeled data, applying transformer models to WSIs remains challenging. In this study, we propose a pretext task to train the transformer model in a self-supervised manner. Our model, MaskHIT, uses the transformer output to reconstruct masked patches, with reconstruction quality measured by a contrastive loss. We pre-trained the MaskHIT model using over 7000 WSIs from TCGA and extensively evaluated its performance in multiple experiments covering survival prediction, cancer subtype classification, and grade prediction tasks. Our experiments demonstrate that the pre-training procedure enables context-aware understanding of WSIs, facilitates the learning of representative histological features based on patch positions and visual patterns, and is essential for the ViT model to achieve optimal results on WSI-level tasks. The pre-trained MaskHIT surpasses various multiple instance learning approaches by 3% and 2% on survival prediction and cancer subtype classification tasks, respectively, and also outperforms recent state-of-the-art transformer-based methods. Finally, a comparison between the attention maps generated by the MaskHIT model and pathologists' annotations indicates that the model can accurately identify clinically relevant histological structures on the whole slide for each task.
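
To make the pretext task concrete, here is a minimal NumPy sketch of masking patch features and scoring reconstructions with a contrastive loss. It is not the MaskHIT implementation: the abstract does not specify the loss form, so an InfoNCE-style loss with cosine similarity and a temperature of 0.1 is assumed, the transformer itself is omitted, and the names `mask_patches` and `info_nce` are hypothetical.

```python
import numpy as np

def mask_patches(features, mask_ratio, rng):
    """Zero out a random subset of patch features; return the corrupted copy and masked indices."""
    n = features.shape[0]
    n_masked = max(1, int(round(mask_ratio * n)))
    idx = rng.choice(n, size=n_masked, replace=False)
    corrupted = features.copy()
    corrupted[idx] = 0.0
    return corrupted, idx

def info_nce(pred, target, temperature=0.1):
    """Contrastive loss: row i of `pred` should be most similar to row i of `target`."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    target = target / np.linalg.norm(target, axis=1, keepdims=True)
    logits = pred @ target.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))
```

In a full training loop, a transformer would take the corrupted sequence and its output at the masked positions would be passed as `pred`, with the original features at those positions as `target`. The other masked patches in the batch act as negatives, so the model is rewarded for reconstructions that are distinguishable from neighbouring patches rather than merely close on average.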

Citations: 0