Title: Data-Efficient Computational Pathology Platform for Faster and Cheaper Breast Cancer Subtype Identifications: Development of a Deep Learning Model
Authors: Kideog Bae, Young Seok Jeon, Yul Hwangbo, Chong Woo Yoo, Nayoung Han, Mengling Feng
Journal: JMIR Cancer, volume 9, e45547 (impact factor 3.3; JCR Q2, Oncology)
Published: 2023-09-05
DOI: 10.2196/45547 (https://doi.org/10.2196/45547)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10509735/pdf/
Abstract
Background: Breast cancer subtyping is a crucial step in determining therapeutic options, but the molecular examination based on immunohistochemical staining is expensive and time-consuming. Deep learning opens up the possibility of predicting the subtypes from the morphological information in hematoxylin and eosin staining, a much cheaper and faster alternative. However, training the predictive model conventionally requires a large number of histology images, which are challenging for a single institute to collect.
Objective: We aimed to develop a data-efficient computational pathology platform, 3DHistoNet, which is capable of learning from z-stacked histology images to accurately predict breast cancer subtypes with a small sample size.
Methods: We retrospectively examined 401 cases of patients with primary breast carcinoma diagnosed between 2018 and 2020 at the Department of Pathology, National Cancer Center, South Korea. Pathology slides of the patients with breast carcinoma were prepared according to the standard protocols. Age, gender, histologic grade, hormone receptor (estrogen receptor [ER], progesterone receptor [PR], and androgen receptor [AR]) status, erb-B2 receptor tyrosine kinase 2 (HER2) status, and Ki-67 index were evaluated by reviewing medical charts and pathological records.
Results: The area under the receiver operating characteristic curve and the decision curve were analyzed to evaluate the performance of our 3DHistoNet platform in predicting the ER, PR, AR, HER2, and Ki-67 subtype biomarkers with 5-fold cross-validation. We demonstrated that 3DHistoNet can predict all clinically important biomarkers (ER, PR, AR, HER2, and Ki-67) with performance exceeding that of conventional multiple instance learning models by a considerable margin (area under the receiver operating characteristic curve: 0.75-0.91 vs 0.67-0.80). We further showed that our z-stack histology scanning method can make up for insufficient training data sets at no additional cost. Finally, 3DHistoNet offered an additional capability to generate attention maps that reveal correlations between Ki-67 and histomorphological features, which renders the hematoxylin and eosin image in higher fidelity to the pathologist.
Conclusions: Our stand-alone, data-efficient pathology platform, which can both generate z-stacked images and predict key biomarkers, is an appealing tool for breast cancer diagnosis. Its development would encourage morphology-based diagnosis, which is faster, cheaper, and less error-prone than the protein quantification method based on immunohistochemical staining.
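The Results rely on two evaluation quantities: the area under the receiver operating characteristic curve (AUROC) under 5-fold cross-validation, and decision curve analysis. The following is a minimal sketch of how such quantities are commonly computed, not the authors' evaluation code. The function names (`auroc`, `stratified_folds`, `fold_aurocs`, `net_benefit`) are hypothetical, the stratified split is an assumption (the abstract does not say how the folds were formed), and the net-benefit formula is the standard one from decision curve analysis.

```python
import random


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) identity; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative case")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def stratified_folds(labels, k=5, seed=0):
    """Split case indices into k folds, spreading each class evenly (assumed scheme)."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def fold_aurocs(labels, scores, folds):
    """AUROC on each held-out fold, given out-of-fold prediction scores per case."""
    return [auroc([labels[i] for i in f], [scores[i] for i in f]) for f in folds]


def net_benefit(labels, scores, threshold):
    """Decision-curve net benefit at one threshold probability: TP/n - FP/n * pt/(1-pt)."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)
```

In use, a model would be trained on four folds and scored on the fifth, and `fold_aurocs` would then summarize the held-out scores per fold. Plotting `net_benefit` over a range of thresholds gives the decision curve for one biomarker.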