Predicting pathologic complete response to neoadjuvant chemotherapy in breast cancer using a machine learning approach

Breast Cancer Research | IF 7.4 | Zone 1 (Medicine) | JCR Q1 (Medicine) | Pub Date: 2024-10-29 | DOI: 10.1186/s13058-024-01905-7
Fangyuan Zhao, Eric Polley, Julian McClellan, Frederick Howard, Olufunmilayo I Olopade, Dezheng Huo
Citations: 0

Abstract

Predicting pathologic complete response to neoadjuvant chemotherapy in breast cancer using a machine learning approach.

Background: For patients with breast cancer undergoing neoadjuvant chemotherapy (NACT), most of the existing prediction models of pathologic complete response (pCR) using clinicopathological features were based on standard statistical models like logistic regression, while models based on machine learning mostly utilized imaging data and/or gene expression data. This study aims to develop a robust and accessible machine learning model to predict pCR using clinicopathological features alone, which can be used to facilitate clinical decision-making in diverse settings.

Methods: The model was developed and validated within the National Cancer Data Base (NCDB, 2018-2020) and an external cohort at the University of Chicago (2010-2020). We compared logistic regression and machine learning models, and examined whether incorporating quantitative clinicopathological features improved model performance. Decision curve analysis was conducted to assess the model's clinical utility.
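The abstract does not say how the decision curve analysis was computed. As a rough illustration only, the sketch below uses the standard net-benefit definition (true positives minus false positives weighted by the threshold odds, per patient) and the usual "net reduction in interventions" derived from it. The function names and the toy data are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating only patients with predicted probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # weight placed on a false positive
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy in which every patient is treated."""
    prevalence = sum(y_true) / len(y_true)
    odds = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * odds


def net_reduction_per_100(y_true, y_prob, threshold):
    """Interventions avoided per 100 patients without missing any additional events."""
    odds = threshold / (1.0 - threshold)
    gain = net_benefit(y_true, y_prob, threshold) - net_benefit_treat_all(y_true, threshold)
    return 100.0 * gain / odds


# Hypothetical toy cohort: 1 = pCR achieved, 0 = no pCR (not study data).
toy_outcome = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
toy_prob = [0.90, 0.05, 0.20, 0.60, 0.03, 0.40, 0.02, 0.08, 0.50, 0.01]

print(net_benefit(toy_outcome, toy_prob, 0.07))
print(net_reduction_per_100(toy_outcome, toy_prob, 0.07))
```

In this framing the "intervention" would be chemotherapy and the "event" would be pCR, so a positive net reduction means fewer patients receive chemotherapy than under a treat-everyone policy at the same threshold.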

Results: We identified 56,209 NCDB patients receiving NACT (pCR rate: 34.0%). The machine learning model incorporating quantitative clinicopathological features showed the best discrimination performance among all the fitted models [area under the receiver operating characteristic curve (AUC): 0.785, 95% confidence interval (CI): 0.778-0.792], along with outstanding calibration performance. The model performed best among patients with hormone receptor positive/human epidermal growth factor receptor 2 negative (HR+/HER2-) breast cancer (AUC: 0.817, 95% CI: 0.802-0.832); and by adopting a 7% prediction threshold, the model achieved 90.5% sensitivity and 48.8% specificity, with decision curve analysis finding a 23.1% net reduction in chemotherapy use. In the external testing set of 584 patients (pCR rate: 33.4%), the model maintained robust performance both overall (AUC: 0.711, 95% CI: 0.668-0.753) and in the HR+/HER2- subgroup (AUC: 0.810, 95% CI: 0.742-0.878).
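To make the threshold-based figures above concrete, here is a minimal sketch of how sensitivity and specificity follow from a chosen prediction threshold. It assumes a patient is classified as a likely responder when the predicted pCR probability is at or above the threshold. The helper name and the toy data are hypothetical and do not reproduce the study's results.

```python
def sensitivity_specificity(y_true, y_prob, threshold):
    """Return (sensitivity, specificity) when predicting positive for y_prob >= threshold."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both outcome classes must be present")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy cohort: 1 = pCR achieved, 0 = no pCR (not study data).
toy_outcome = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
toy_prob = [0.90, 0.05, 0.20, 0.60, 0.03, 0.40, 0.02, 0.05, 0.50, 0.01]

sens, spec = sensitivity_specificity(toy_outcome, toy_prob, 0.07)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

A low threshold such as 7% trades specificity for sensitivity: few patients who would achieve pCR are missed, at the cost of flagging many non-responders as possible responders.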

Conclusions: The study developed a machine learning model ( https://huolab.cri.uchicago.edu/sample-apps/pcrmodel ) to predict pCR in breast cancer patients undergoing NACT that demonstrated robust discrimination and calibration performance. The model performed particularly well among patients with HR+/HER2- breast cancer, having the potential to identify patients who are less likely to achieve pCR and can consider alternative treatment strategies over chemotherapy. The model can also serve as a robust baseline model that can be integrated with smaller datasets containing additional granular features in future research.

Source journal: Breast Cancer Research
CiteScore: 12.00
Self-citation rate: 0.00%
Articles published: 76
Review time: 12 weeks
About the journal: Breast Cancer Research, an international, peer-reviewed online journal, publishes original research, reviews, editorials, and reports. It features open-access research articles of exceptional interest across all areas of biology and medicine relevant to breast cancer. This includes normal mammary gland biology, with a special emphasis on the genetic, biochemical, and cellular basis of breast cancer. In addition to basic research, the journal covers preclinical, translational, and clinical studies with a biological basis, including Phase I and Phase II trials.