The impact of updated imaging software on the performance of machine learning models for breast cancer diagnosis: a multi-center, retrospective study

Archives of Gynecology and Obstetrics · IF 2.5 · Zone 3 (Medicine) · JCR Q2 (Obstetrics & Gynecology) · Pub Date: 2025-01-30 · DOI: 10.1007/s00404-024-07901-8

Lie Cai, Michael Golatta, Chris Sidey-Gibbons, Richard G. Barr, André Pfob

Archives of Gynecology and Obstetrics 312 (1), 139-147. Publisher page: https://link.springer.com/article/10.1007/s00404-024-07901-8 · Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12176987/pdf/


Abstract

Purpose

Artificial intelligence (AI) models based on medical imaging data are increasingly being developed. However, the imaging software that generates the original data is frequently updated, and the impact of such updates on the performance of AI models is unclear. We aimed to develop machine learning models using shear wave elastography (SWE) data to identify malignant breast lesions and to test the models' generalizability by validating them on external data generated by both the original and the updated software versions.

Methods

We developed and validated four machine learning models (generalized linear model [GLM], multivariate adaptive regression splines [MARS], XGBoost, and support vector machine [SVM]) on multicenter, international SWE data (NCT02638935) with tenfold cross-validation. Findings were compared with the histopathologic evaluation of the biopsy specimen or 2-year follow-up. The outcome measure was the area under the receiver operating characteristic curve (AUROC).
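The tenfold cross-validation mentioned above can be illustrated with a minimal sketch. The abstract does not say how folds were formed, so the stratified round-robin split below is an assumption, `stratified_kfold` is a hypothetical helper name, and the labels are synthetic (only the case count of 1288 comes from the abstract).

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for stratified k-fold cross-validation.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    folds = [[] for _ in range(k)]
    # Deal each class's shuffled indices round-robin so that every fold
    # keeps roughly the same benign/malignant mix as the full set.
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)

    for f in range(k):
        val = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, val


# Synthetic labels only: 1288 matches the development-set size in the abstract,
# but the 30% malignancy rate is an arbitrary placeholder, not a study figure.
labels = [1 if i % 10 < 3 else 0 for i in range(1288)]
splits = list(stratified_kfold(labels, k=10))
for train, val in splits:
    print(len(train), len(val), sum(labels[i] for i in val))
```

Each case serves as validation data in exactly one fold and as training data in the other nine, which is what lets a single development set yield an internal performance estimate before external validation.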

Results

We included 1288 cases in the development set, acquired with the original imaging software, and 385 cases in the validation set, acquired with both the original and the updated software. In the external validation set, the GLM and XGBoost models performed better with the updated software data than with the original software data (AUROC 0.941 vs. 0.902, p < 0.001, and 0.934 vs. 0.872, p < 0.001, respectively). The MARS model performed worse with the updated software data (AUROC 0.847 vs. 0.894, p = 0.045). The SVM model was not calibrated.
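To make the paired AUROC comparison concrete, here is a minimal sketch of how two sets of scores for the same cases might be compared. The abstract does not name the statistical test behind the reported p-values, so the percentile bootstrap below is a swapped-in illustration rather than the study's method. The function names and all scores are hypothetical and synthetic.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of (malignant, benign) pairs in which the
    malignant case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_bootstrap_auroc_diff(labels, scores_orig, scores_upd, n_boot=200, seed=0):
    """Observed AUROC difference (updated - original) with a 95% percentile
    bootstrap interval. Cases are resampled jointly so the pairing is kept."""
    rng = random.Random(seed)
    n = len(labels)
    observed = auroc(labels, scores_upd) - auroc(labels, scores_orig)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUROC is undefined when a resample has only one class
        a = [scores_orig[i] for i in idx]
        b = [scores_upd[i] for i in idx]
        diffs.append(auroc(y, b) - auroc(y, a))
    diffs.sort()
    lo = diffs[n_boot * 25 // 1000]
    hi = diffs[n_boot * 975 // 1000 - 1]
    return observed, (lo, hi)


# Synthetic, made-up scores for 200 cases; these are not study data.
rng = random.Random(1)
labels = [1 if i % 10 < 3 else 0 for i in range(200)]
scores_orig = [y + rng.gauss(0, 1.0) for y in labels]
scores_upd = [1.5 * y + rng.gauss(0, 1.0) for y in labels]

observed, (lo, hi) = paired_bootstrap_auroc_diff(labels, scores_orig, scores_upd)
print(f"AUROC difference (updated - original): {observed:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

Resampling whole cases, rather than each score list separately, preserves the fact that both software versions were applied to the same lesions, which is the feature of the validation set that makes a paired comparison appropriate.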

Conclusion

In this multicenter study using SWE data, some machine learning models demonstrated great potential to bridge the gap between original software and updated software, whereas others exhibited weak generalizability.



Source journal: Archives of Gynecology and Obstetrics (Medicine: Obstetrics and Gynecology)

CiteScore: 4.70
Self-citation rate: 15.40%
Articles published: 493
Review time: 1 month

About the journal: Founded in 1870 as "Archiv für Gynaekologie", Archives of Gynecology and Obstetrics has a long and outstanding tradition. Since 1922 the journal has been the Organ of the Deutsche Gesellschaft für Gynäkologie und Geburtshilfe. "The Archives of Gynecology and Obstetrics" is circulated in over 40 countries worldwide and is indexed in "PubMed/Medline" and "Science Citation Index Expanded/Journal Citation Report". The journal publishes invited and submitted reviews; peer-reviewed original articles about clinical topics and basic research; and news and views, guidelines, and position statements from all sub-specialties in gynecology and obstetrics.