A Machine Learning Model for Predicting the HER2 Positive Expression of Breast Cancer Based on Clinicopathological and Imaging Features.

Academic Radiology (IF 3.8; CAS Zone 2, Medicine; JCR Q1, Radiology, Nuclear Medicine & Medical Imaging). Pub Date: 2025-01-20. DOI: 10.1016/j.acra.2025.01.001
Xiaojuan Qin, Wei Yang, Xiaoping Zhou, Yan Yang, Ningmei Zhang

Abstract

Rationale and objectives: To develop a machine learning (ML) model based on clinicopathological and imaging features for predicting human epidermal growth factor receptor 2 (HER2)-positive expression (HER2-p) in breast cancer (BC), and to compare its performance with that of a logistic regression (LR) model.

Materials and methods: A total of 2541 consecutive female patients with pathologically confirmed primary breast lesions were enrolled. In chronological order, the 2034 patients treated between January 2018 and December 2022 formed the retrospective development cohort, and the 507 patients treated between January 2023 and May 2024 formed the prospective validation cohort. Within the development cohort, patients were randomly divided into a training cohort (n=1628) and a test cohort (n=406) in an 8:2 ratio. Pretreatment mammography (MG) and breast MRI data, along with clinicopathological features, were recorded. Extreme Gradient Boosting (XGBoost) combined with an artificial neural network (ANN), and multivariate LR analysis, were used to identify features associated with HER2 positivity in BC and to develop an ANN model (built on the XGBoost-selected features) and an LR model, respectively. Predictive performance was assessed using receiver operating characteristic (ROC) curves.
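The cohort arithmetic above can be checked with a minimal sketch. The patient IDs, the seed, and the function name `split_development_cohort` are hypothetical placeholders, and rounding the test size down is an assumption chosen because it reproduces the reported 1628/406 split; the abstract does not say how the authors implemented the randomization.

```python
import random


def split_development_cohort(patient_ids, test_fraction=0.2, seed=0):
    """Randomly split IDs into (train, test) lists.

    The test size is rounded down (an assumption), so 2034 patients
    give 406 test and 1628 training cases.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed, hypothetical, for repeatability
    rng.shuffle(ids)
    n_test = int(len(ids) * test_fraction)  # 2034 * 0.2 = 406.8 -> 406
    return ids[n_test:], ids[:n_test]


# Placeholder IDs ordered by treatment date: the first 2034 patients form
# the development cohort, the remaining 507 the later validation cohort.
development_ids = list(range(2034))
validation_ids = list(range(2034, 2541))

train_ids, test_ids = split_development_cohort(development_ids)
print(len(train_ids), len(test_ids), len(validation_ids))  # 1628 406 507
```

Keeping the validation cohort separated by date rather than by random draw means it is never seen during feature selection or model fitting.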

Results: After Recursive Feature Elimination with Cross-Validation (RFE-CV) was applied for feature dimensionality reduction, the XGBoost algorithm identified tumor size, suspicious calcifications, Ki-67 index, spiculation, and minimum apparent diffusion coefficient (minimum ADC) as the key features indicative of HER2-p in BC. The ANN model consistently outperformed the LR model, achieving an area under the curve (AUC) of 0.853 (95% CI: 0.837-0.872) in the training cohort, 0.821 (95% CI: 0.798-0.853) in the test cohort, and 0.809 (95% CI: 0.776-0.841) in the validation cohort.
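For readers unfamiliar with how an AUC and its 95% CI are obtained, the following is a minimal sketch on toy data. The abstract does not state how the intervals were computed; the percentile bootstrap used here is an assumption, and the labels, scores, and function names are illustrative only, not the study's data or code.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5). Quadratic in the
    number of cases, which is fine for a small illustration."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method)."""
    n = len(labels)
    if sum(labels) in (0, n):
        raise ValueError("both classes are required")
    rng = random.Random(seed)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples containing a single class
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[max(0, int(alpha / 2 * n_boot))]
    upper = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)]
    return lower, upper


# Toy data: 1 = HER2-positive, 0 = HER2-negative; scores are model outputs.
labels = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.9, 0.5]

auc = roc_auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(auc)  # 0.875
```

With only eight cases the interval is very wide; the narrow intervals reported above reflect cohorts of several hundred to over a thousand patients.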

Conclusion: The ANN model, built on the significant feature subset identified by the XGBoost algorithm with RFE-CV, shows potential for predicting HER2-p in BC.

Source journal: Academic Radiology (Medicine, Nuclear Medicine)
CiteScore: 7.60
Self-citation rate: 10.40%
Publication volume: 432
Review time: 18 days
Journal description: Academic Radiology publishes original reports of clinical and laboratory investigations in diagnostic imaging, the diagnostic use of radioactive isotopes, computed tomography, positron emission tomography, magnetic resonance imaging, ultrasound, digital subtraction angiography, image-guided interventions and related techniques. It also includes brief technical reports describing original observations, techniques, and instrumental developments; state-of-the-art reports on clinical issues, new technology and other topics of current medical importance; meta-analyses; scientific studies and opinions on radiologic education; and letters to the Editor.