Development of a Deep Learning model Tailored for HER2 Detection in Breast Cancer to aid pathologists in interpreting HER2-Low cases
Pierre-Antoine Bannier, Glenn Broeckx, Loic Herpin, Remy Dubois, Lydwine Van Praet, Charles Maussion, Frederik Deman, Ellen Amonoo, Anca Mera, Jasmine Timbres, Cheryl Gillett, Elinor Sawyer, Patrycja Gazinska, Piotr Ziolkowski, Magali Lacroix-Triki, Roberto Salgado, Sheeba Irshad
bioRxiv - Pathology, 2024-07-03. DOI: 10.1101/2024.07.01.601397 (https://doi.org/10.1101/2024.07.01.601397)
Abstract
Introduction. Over 50% of breast cancer (BC) cases are human epidermal growth factor receptor 2 (HER2)-low, characterized by HER2 immunohistochemistry (IHC) scores of 1+ or 2+ with no amplification on fluorescence in situ hybridization (FISH) testing. The development of new anti-HER2 antibody-drug conjugates (ADCs) for treating HER2-low breast cancers underlines the importance of accurately assessing HER2 status, particularly in HER2-low breast cancer. In this study, we evaluated the performance of a deep learning (DL) model for the assessment of HER2, including an assessment of the causes of HER2-null discordances between a pathologist and the DL model. We specifically focussed on aligning the DL model rules with the ASCO/CAP guidelines, including the staining intensity of stained cells and the completeness of membrane staining.
Methods. We trained a DL model on a multi-centric cohort of breast cancer cases with HER2 IHC scores (n=299). The model was validated on 2 independent multi-centric cohorts (n=369 and n=92). All cases were reviewed by 3 senior breast pathologists, and the ground truth was determined by majority consensus among them on the final HER2 score. In total, 760 breast cancer cases were used across the training and validation phases of the study.
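The consensus rule and the HER2-low definition given above can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the string encoding of IHC scores, the handling of cases without a majority, and the "FISH required" label are assumptions made for illustration. The mapping of IHC 0 to HER2-null and of IHC 3+ (or 2+ with FISH amplification) to HER2-positive follows conventional usage and is not spelled out in the abstract.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_score(scores: Sequence[str]) -> Optional[str]:
    """Return the IHC score chosen by a strict majority of readers, or None."""
    if not scores:
        return None
    label, votes = Counter(scores).most_common(1)[0]
    # Assumption: with no strict majority the case is left unresolved (None).
    return label if votes > len(scores) / 2 else None


def her2_category(ihc: str, fish_amplified: Optional[bool] = None) -> str:
    """Map an IHC score (plus the FISH result for 2+) to a HER2 category."""
    if ihc == "0":
        return "HER2-null"
    if ihc == "1+":
        return "HER2-low"
    if ihc == "2+":
        if fish_amplified is None:
            return "FISH required"  # hypothetical placeholder label
        return "HER2-positive" if fish_amplified else "HER2-low"
    if ihc == "3+":
        return "HER2-positive"
    raise ValueError(f"unknown IHC score: {ihc!r}")
```

For three hypothetical reads of "1+", "2+" and "1+", consensus_score returns "1+", which her2_category maps to "HER2-low".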
Results. The model's concordance with the ground truth (ICC = 0.77 [0.68 - 0.83]; Fisher P = 1.32e-10) was higher than the average agreement among the 3 senior pathologists (ICC = 0.45 [0.17 - 0.65]; Fisher P = 2e-3). In the two validation cohorts, the DL model identified 95% [93% - 98%] and 97% [91% - 100%] of HER2-low and HER2-positive tumors, respectively. Discordant results were characterized by morphological features such as extended fibrosis, a high number of tumor-infiltrating lymphocytes, and necrosis, whilst some artifacts, such as non-specific background staining in the cytoplasm of tumor cells, also caused discrepancies.
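For readers unfamiliar with the intraclass correlation coefficient (ICC) reported above, the following sketch shows one common way to compute it. The abstract does not state which ICC form was used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single rater) is an assumption, as are the function name and the numeric encoding of HER2 scores (0, 1, 2, 3).

```python
from typing import Sequence


def icc2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) for a table with one row per case and one column per rater."""
    n = len(ratings)      # number of cases
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: cases, raters, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)             # between-case mean square
    msc = ss_cols / (k - 1)             # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Passing a table whose two columns hold the model's score and the consensus score for each case would give a model-versus-ground-truth agreement of the kind reported here, under the stated assumptions.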
Conclusion. Deep learning can support pathologists' interpretation of difficult HER2-low cases. Morphological variables and some specific artifacts can cause discrepant HER2 scores between the pathologist and the DL model.