The impact of updated imaging software on the performance of machine learning models for breast cancer diagnosis: a multi-center, retrospective study

Archives of Gynecology and Obstetrics · IF 2.5 · Zone 3 (Medicine) · JCR Q2 (Obstetrics & Gynecology) · Pub Date: 2025-01-30 · DOI: 10.1007/s00404-024-07901-8
Lie Cai, Michael Golatta, Chris Sidey-Gibbons, Richard G. Barr, André Pfob
Archives of Gynecology and Obstetrics, volume 312, issue 1, pages 139-147.
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12176987/pdf/
Publisher page: https://link.springer.com/article/10.1007/s00404-024-07901-8

Abstract

Purpose

Artificial intelligence (AI) models based on medical (imaging) data are increasingly being developed. However, the imaging software that generates the original data is frequently updated, and the impact of such updates on the performance of AI models is unclear. We aimed to develop machine learning models using shear wave elastography (SWE) data to identify malignant breast lesions and to test the models' generalizability by validating them on external data generated by both the original and the updated software versions.

Methods

We developed and validated different machine learning models (GLM, MARS, XGBoost, SVM) on multicenter, international SWE data (NCT02638935) with tenfold cross-validation. Model predictions were compared to the histopathologic evaluation of the biopsy specimen or to 2-year follow-up. The outcome measure was the area under the receiver operating characteristic curve (AUROC).
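
As an illustration of the two ingredients named above, the following is a minimal sketch, not the study's code. It computes AUROC as the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign one, and deals case indices into ten folds. The function names `auroc` and `tenfold_indices`, the toy labels and scores, and the unstratified shuffling are all assumptions made for illustration.

```python
import random


def auroc(labels, scores):
    """AUROC via pairwise comparison: P(score of a positive > score of a negative).

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def tenfold_indices(n_cases, k=10, seed=0):
    """Shuffle case indices and deal them into k folds of near-equal size."""
    idx = list(range(n_cases))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


# Toy labels (1 = malignant, 0 = benign) and model scores; hypothetical values.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)

# Ten folds over a development set of 1288 cases (size taken from the abstract).
folds = tenfold_indices(1288)
```

In each cross-validation round, one fold would serve as the held-out set and the remaining nine as training data; a stratified split that preserves the malignant/benign ratio per fold is a common refinement.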

Results

We included 1288 cases in the development set, acquired with the original imaging software, and 385 cases in the validation set, acquired with both the original and the updated software. In the external validation set, the GLM and XGBoost models performed better on the updated software data than on the original software data (AUROC 0.941 vs. 0.902, p < 0.001, and 0.934 vs. 0.872, p < 0.001, respectively). The MARS model performed worse on the updated software data (AUROC 0.847 vs. 0.894, p = 0.045). SVM was not calibrated.
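
The abstract does not state which statistical test produced these p-values. As one possible way to compare a model's AUROC on two software versions for the same cases, the sketch below uses a paired percentile bootstrap. This is a substitute technique chosen for illustration, not the authors' method. The function `paired_bootstrap_auroc_diff` and all data are hypothetical and synthetic.

```python
import random


def auroc(labels, scores):
    """AUROC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_bootstrap_auroc_diff(labels, scores_a, scores_b, n_boot=500, seed=0):
    """Return the observed AUROC(b) - AUROC(a) and a 95% percentile interval.

    Cases are resampled with replacement, and the same resampled cases are
    used for both score sets, which keeps the comparison paired.
    """
    rng = random.Random(seed)
    n = len(labels)
    observed = auroc(labels, scores_b) - auroc(labels, scores_a)
    diffs = []
    while len(diffs) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:
            continue  # skip resamples that lack one of the two classes
        a = [scores_a[i] for i in sample]
        b = [scores_b[i] for i in sample]
        diffs.append(auroc(ys, b) - auroc(ys, a))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return observed, (lower, upper)


# Synthetic stand-in data: 60 cases, alternating benign (0) and malignant (1).
data_rng = random.Random(1)
labels = [i % 2 for i in range(60)]
scores_original = [y + data_rng.gauss(0, 1.0) for y in labels]  # noisier scores
scores_updated = [y + data_rng.gauss(0, 0.7) for y in labels]   # less noisy scores

diff, (ci_low, ci_high) = paired_bootstrap_auroc_diff(
    labels, scores_original, scores_updated
)
```

An interval that excludes zero would suggest a performance difference between the two software versions on these cases. For correlated ROC curves, DeLong's test is another widely used option.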

Conclusion

In this multicenter study using SWE data, some machine learning models demonstrated great potential to bridge the gap between original software and updated software, whereas others exhibited weak generalizability.

Source journal: Archives of Gynecology and Obstetrics
CiteScore: 4.70
Self-citation rate: 15.40%
Articles published: 493
Review time: 1 month
About the journal: Founded in 1870 as "Archiv für Gynaekologie", Archives of Gynecology and Obstetrics has a long and outstanding tradition. Since 1922 the journal has been the organ of the Deutsche Gesellschaft für Gynäkologie und Geburtshilfe. It is circulated in over 40 countries worldwide and is indexed in PubMed/Medline and Science Citation Index Expanded/Journal Citation Reports. The journal publishes invited and submitted reviews, peer-reviewed original articles on clinical topics and basic research, news and views, and guidelines and position statements from all sub-specialties in gynecology and obstetrics.