Impact of Transfer Learning Using Local Data on Performance of a Deep Learning Model for Screening Mammography.

Radiology-Artificial Intelligence (IF 8.1; JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) Pub Date: 2024-07-01 DOI: 10.1148/ryai.230383
James J J Condon, Vincent Trinh, Kelly A Hall, Michelle Reintals, Andrew S Holmes, Lauren Oakden-Rayner, Lyle J Palmer
Citations: 0

Abstract

Purpose To investigate the issues of generalizability and replication of deep learning models by assessing performance of a screening mammography deep learning system developed at New York University (NYU) on a local Australian dataset.

Materials and Methods In this retrospective study, all individuals with biopsy or surgical pathology-proven lesions and age-matched controls were identified from a South Australian public mammography screening program (January 2010 to December 2016). The primary outcome was deep learning system performance, measured with area under the receiver operating characteristic curve (AUC), in classifying invasive breast cancer or ductal carcinoma in situ (n = 425) versus no malignancy (n = 490) or benign lesions (n = 44). The NYU system, including models without (NYU1) and with (NYU2) heatmaps, was tested in its original form, after training from scratch (without transfer learning), and after retraining with transfer learning.

Results The local test set comprised 959 individuals (mean age, 62.5 years ± 8.5 [SD]; all female). The original AUCs for the NYU1 and NYU2 models were 0.83 (95% CI: 0.82, 0.84) and 0.89 (95% CI: 0.88, 0.89), respectively. When NYU1 and NYU2 were applied in their original form to the local test set, the AUCs were 0.76 (95% CI: 0.73, 0.79) and 0.84 (95% CI: 0.82, 0.87), respectively. After local training without transfer learning, the AUCs were 0.66 (95% CI: 0.62, 0.69) and 0.86 (95% CI: 0.84, 0.88), respectively. After retraining with transfer learning, the AUCs were 0.82 (95% CI: 0.80, 0.85) and 0.86 (95% CI: 0.84, 0.88), respectively.

Conclusion A deep learning system developed using a U.S. dataset showed reduced performance when applied "out of the box" to an Australian dataset. Local retraining with transfer learning using available model weights improved model performance.

Keywords: Screening Mammography, Convolutional Neural Network (CNN), Deep Learning Algorithms, Breast Cancer

Supplemental material is available for this article.
© RSNA, 2024 See also commentary by Cadrin-Chênevert in this issue.
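
The abstract reports each result as an AUC with a 95% CI but does not say how the intervals were obtained. As a rough illustration of what such a figure means, the sketch below computes AUC as the probability that a randomly chosen malignant case scores higher than a randomly chosen non-malignant case, and attaches a percentile bootstrap interval. The bootstrap is an assumption made for illustration, not necessarily the authors' method; the function names `auc` and `bootstrap_ci` are hypothetical, and the scores are synthetic, not study data.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC (an assumed method, for illustration)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement, keeping label/score pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic example: 40 "malignant" and 60 "non-malignant" cases (not study data).
rng = random.Random(42)
labels = [1] * 40 + [0] * 60
scores = [rng.gauss(1.0, 1.0) for _ in range(40)] + [rng.gauss(0.0, 1.0) for _ in range(60)]

point = auc(labels, scores)
low, high = bootstrap_ci(labels, scores)
print(f"AUC {point:.2f} (95% CI: {low:.2f}, {high:.2f})")
```

The pairwise count is quadratic in the number of cases, which is fine for a demonstration but slow on a test set the size of the one described here (959 individuals) once it is repeated across many resamples; a rank-based implementation would be the practical choice there.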

Source journal
CiteScore: 16.20
Self-citation rate: 1.00%
Article count: 0
Journal description: Radiology: Artificial Intelligence is a bimonthly publication that focuses on the emerging applications of machine learning and artificial intelligence in the field of imaging across various disciplines. This journal is available online and accepts multiple manuscript types, including Original Research, Technical Developments, Data Resources, Review articles, Editorials, Letters to the Editor and Replies, Special Reports, and AI in Brief.
Latest articles from this journal
AI-integrated Screening to Replace Double Reading of Mammograms: A Population-wide Accuracy and Feasibility Study.
Deep Learning Segmentation of Ascites on Abdominal CT Scans for Automatic Volume Quantification.
Presurgical Upgrade Prediction of DCIS to Invasive Ductal Carcinoma Using Time-dependent Deep Learning Models with DCE MRI.
Artificial Intelligence Outcome Prediction in Neonates with Encephalopathy (AI-OPiNE).
Deep Learning to Detect Intracranial Hemorrhage in a National Teleradiology Program and the Impact on Interpretation Time.