Beatrice S. Knudsen, Alok Jadhav, Lindsey J. Perry, Jeppe Thagaard, Georgios Deftereos, Jian Ying, Ben J. Brintz, Wei Zhang
Laboratory Investigation, Volume 104, Issue 6, Article 102070. Published 2024-04-26. DOI: 10.1016/j.labinv.2024.102070
https://www.sciencedirect.com/science/article/pii/S0023683724017483
A Pipeline for Evaluation of Machine Learning/Artificial Intelligence Models to Quantify Programmed Death Ligand 1 Immunohistochemistry
Immunohistochemistry (IHC) is used to guide treatment decisions in multiple cancer types. For treatment with checkpoint inhibitors, programmed death ligand 1 (PD-L1) IHC is used as a companion diagnostic. However, the scoring of PD-L1 is complicated by its expression in both cancer and immune cells. Separation of cancer and noncancer regions is needed to calculate the tumor proportion score (TPS) of PD-L1, which is based on the percentage of PD-L1-positive cancer cells. Evaluation of PD-L1 expression requires highly experienced pathologists and is often challenging and time-consuming. Here, we used a multi-institutional cohort of 77 lung cancer cases stained centrally with the PD-L1 22C3 clone. We developed a 4-step pipeline for measuring TPS that includes the coregistration of hematoxylin and eosin, PD-L1, and negative control (NC) digital slides for exclusion of necrosis, segmentation of cancer regions, and quantification of PD-L1+ cells. As cancer segmentation is a challenging step for TPS generation, we trained DeepLab V3 in the Visiopharm software package to outline cancer regions in PD-L1 and NC images and evaluated the model performance by mean intersection over union (mIoU) against manual outlines. Only 14 cases were required to reach an mIoU of 0.82 for cancer segmentation in hematoxylin-stained NC cases. For PD-L1-stained slides, a model trained on PD-L1 tiles augmented by registered NC tiles achieved an mIoU of 0.79. In segmented cancer regions from whole slide images, the digital TPS achieved an accuracy of 75% against the manual TPS scores from the pathology report. Major reasons for algorithmic inaccuracies include the inclusion of immune cells in cancer outlines and poor nuclear segmentation of cancer cells. Our transparent and stepwise approach and performance metrics can be applied to any IHC assay to provide pathologists with important insights on when to apply and how to evaluate commercial automated IHC scoring systems.
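The two quantities the abstract relies on, mIoU and TPS, can be illustrated with a minimal Python sketch. It is not the authors' Visiopharm implementation. The function names, the toy masks, and the choice to average IoU over a background class (0) and a cancer class (1) are assumptions made here for illustration; the abstract does not state how the classes were averaged.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # 2D label mask: one integer class label per pixel


def mean_iou(pred: Mask, truth: Mask, classes: Sequence[int] = (0, 1)) -> float:
    """Mean intersection over union of two equally shaped label masks.

    Assumption: IoU is computed per class and averaged over the classes
    that occur in at least one of the two masks.
    """
    ious = []
    for c in classes:
        intersection = 0
        union = 0
        for pred_row, truth_row in zip(pred, truth):
            for p, t in zip(pred_row, truth_row):
                in_pred, in_truth = (p == c), (t == c)
                if in_pred and in_truth:
                    intersection += 1
                if in_pred or in_truth:
                    union += 1
        if union > 0:  # skip classes absent from both masks
            ious.append(intersection / union)
    if not ious:
        raise ValueError("none of the requested classes occur in the masks")
    return sum(ious) / len(ious)


def tumor_proportion_score(positive_cancer_cells: int, total_cancer_cells: int) -> float:
    """TPS as the percentage of PD-L1-positive cells among all cancer cells."""
    if total_cancer_cells <= 0:
        raise ValueError("total_cancer_cells must be positive")
    if not 0 <= positive_cancer_cells <= total_cancer_cells:
        raise ValueError("positive_cancer_cells must lie in [0, total_cancer_cells]")
    return 100.0 * positive_cancer_cells / total_cancer_cells


# Hypothetical 2x3 masks: 1 = cancer, 0 = background.
model_outline = [[1, 1, 0],
                 [0, 0, 0]]
manual_outline = [[1, 0, 0],
                  [0, 0, 0]]

miou = mean_iou(model_outline, manual_outline)   # cancer IoU 1/2, background IoU 4/5
tps = tumor_proportion_score(30, 120)            # 30 of 120 cancer cells positive
print(f"mIoU = {miou:.2f}, TPS = {tps:.1f}%")
```

Because TPS counts only cancer cells, any immune cells that fall inside the cancer outline enter the denominator (and, when PD-L1-positive, the numerator), which is consistent with the source of error the abstract reports.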
About the journal:
Laboratory Investigation is an international journal owned by the United States and Canadian Academy of Pathology. Laboratory Investigation offers prompt publication of high-quality original research in all biomedical disciplines relating to the understanding of human disease and the application of new methods to the diagnosis of disease. Both human and experimental studies are welcome.