Predicting pathologic complete response to neoadjuvant chemotherapy in breast cancer using a machine learning approach

Breast Cancer Research, 26(1):148 · IF 7.4 · Zone 1 (Medicine) · JCR Q1 (Medicine) · Pub Date: 2024-10-29 · DOI: 10.1186/s13058-024-01905-7

Fangyuan Zhao, Eric Polley, Julian McClellan, Frederick Howard, Olufunmilayo I Olopade, Dezheng Huo



Predicting pathologic complete response to neoadjuvant chemotherapy in breast cancer using a machine learning approach.

Background: For patients with breast cancer undergoing neoadjuvant chemotherapy (NACT), most of the existing prediction models of pathologic complete response (pCR) using clinicopathological features were based on standard statistical models like logistic regression, while models based on machine learning mostly utilized imaging data and/or gene expression data. This study aims to develop a robust and accessible machine learning model to predict pCR using clinicopathological features alone, which can be used to facilitate clinical decision-making in diverse settings.

Methods: The model was developed and validated within the National Cancer Data Base (NCDB, 2018-2020) and an external cohort at the University of Chicago (2010-2020). We compared logistic regression and machine learning models, and examined whether incorporating quantitative clinicopathological features improved model performance. Decision curve analysis was conducted to assess the model's clinical utility.
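Decision curve analysis, as mentioned in the Methods, is usually built on the net benefit statistic. The following is a minimal sketch of that standard calculation, not the authors' code. It assumes the event is pCR (coded 1) and that the "intervention" is chemotherapy for patients whose predicted probability is at or above the threshold. The function names and the toy data are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients with predicted probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # harm-to-benefit weight implied by the threshold
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy: intervene on every patient."""
    n = len(y_true)
    prevalence = sum(y_true) / n
    odds = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * odds


def net_reduction_per_100(y_true, y_prob, threshold):
    """Net reduction in interventions per 100 patients, relative to treating everyone."""
    odds = threshold / (1.0 - threshold)
    gain = net_benefit(y_true, y_prob, threshold) - net_benefit_treat_all(y_true, threshold)
    return 100.0 * gain / odds


# Toy data, invented for illustration only (not from the study).
toy_outcomes = [1, 0, 0, 0]
toy_probabilities = [0.9, 0.6, 0.1, 0.05]
toy_reduction = net_reduction_per_100(toy_outcomes, toy_probabilities, 0.5)
```

In this toy case the rule spares two of four patients without missing the one responder, so the net reduction comes out at 50 per 100 patients. A figure such as the reported net reduction in chemotherapy use would be read off the same kind of quantity at the chosen threshold, although the authors' exact formulation may differ.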

Results: We identified 56,209 NCDB patients receiving NACT (pCR rate: 34.0%). The machine learning model incorporating quantitative clinicopathological features showed the best discrimination performance among all the fitted models [area under the receiver operating characteristic curve (AUC): 0.785, 95% confidence interval (CI): 0.778-0.792], along with outstanding calibration performance. The model performed best among patients with hormone receptor positive/human epidermal growth factor receptor 2 negative (HR+/HER2-) breast cancer (AUC: 0.817, 95% CI: 0.802-0.832); and by adopting a 7% prediction threshold, the model achieved 90.5% sensitivity and 48.8% specificity, with decision curve analysis finding a 23.1% net reduction in chemotherapy use. In the external testing set of 584 patients (pCR rate: 33.4%), the model maintained robust performance both overall (AUC: 0.711, 95% CI: 0.668-0.753) and in the HR+/HER2- subgroup (AUC: 0.810, 95% CI: 0.742-0.878).
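The discrimination metrics quoted in the Results (AUC, and sensitivity and specificity at a fixed prediction threshold) can be computed from predicted probabilities as sketched below. This is an illustrative pure-Python version under simple assumptions: outcomes are coded 1 for pCR and 0 otherwise, and a patient is called positive when the predicted probability is at or above the threshold. The function names and toy data are hypothetical, and the pairwise AUC is only practical for small samples.

```python
def sensitivity_specificity(y_true, y_prob, threshold):
    """Sensitivity and specificity when 'positive' means predicted probability >= threshold."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, y_prob):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for pp in positives:
        for pn in negatives:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, invented for illustration only (not from the study).
toy_outcomes = [1, 1, 0, 0, 0]
toy_probabilities = [0.80, 0.05, 0.30, 0.06, 0.02]
toy_sens, toy_spec = sensitivity_specificity(toy_outcomes, toy_probabilities, 0.07)
toy_auc = auc(toy_outcomes, toy_probabilities)
```

A low threshold such as 7% trades specificity for sensitivity: few patients who would achieve pCR fall below it, at the cost of many non-responders remaining above it.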

Conclusions: The study developed a machine learning model ( https://huolab.cri.uchicago.edu/sample-apps/pcrmodel ) to predict pCR in breast cancer patients undergoing NACT that demonstrated robust discrimination and calibration performance. The model performed particularly well among patients with HR+/HER2- breast cancer, having the potential to identify patients who are less likely to achieve pCR and can consider alternative treatment strategies over chemotherapy. The model can also serve as a robust baseline model that can be integrated with smaller datasets containing additional granular features in future research.
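Calibration, which the abstract reports alongside discrimination, is commonly checked by grouping patients by predicted probability and comparing the mean prediction with the observed event rate in each group. The sketch below shows one simple way to do this with equal-size bins. It is an assumption about a typical workflow rather than the authors' procedure, and the function name and toy data are hypothetical.

```python
def calibration_table(y_true, y_prob, n_bins=10):
    """Return (mean predicted probability, observed event rate, group size) per equal-size bin."""
    pairs = sorted(zip(y_prob, y_true))  # order patients by predicted probability
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        start = b * n // n_bins
        end = (b + 1) * n // n_bins
        chunk = pairs[start:end]
        if not chunk:
            continue  # skip empty bins when there are fewer patients than bins
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_predicted, observed_rate, len(chunk)))
    return rows


# Toy data, invented for illustration only (not from the study).
toy_rows = calibration_table([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9], n_bins=2)
```

For a well-calibrated model the two numbers in each row are close, so the points lie near the diagonal when plotted.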


Source journal

Breast Cancer Research (Oncology)

CiteScore: 12.00

Self-citation rate: 0.00%

Review time: 12 weeks

Journal description: Breast Cancer Research, an international, peer-reviewed online journal, publishes original research, reviews, editorials, and reports. It features open-access research articles of exceptional interest across all areas of biology and medicine relevant to breast cancer. This includes normal mammary gland biology, with a special emphasis on the genetic, biochemical, and cellular basis of breast cancer. In addition to basic research, the journal covers preclinical, translational, and clinical studies with a biological basis, including Phase I and Phase II trials.