Deep learning-based breast cancer diagnosis in breast MRI: systematic review and meta-analysis.

European Radiology (IF 4.7; Zone 2, Medicine; JCR Q1: Radiology, Nuclear Medicine & Medical Imaging). Pub Date: 2025-02-05. DOI: 10.1007/s00330-025-11406-6

Kamarul Amin Abdullah, Sara Marziali, Muzna Nanaa, Lorena Escudero Sánchez, Nicholas R Payne, Fiona J Gilbert



Abstract

Objectives: The aim of this work is to evaluate the performance of deep learning (DL) models for breast cancer diagnosis with MRI.

Materials and methods: A literature search was conducted on Web of Science, PubMed, and IEEE Xplore for relevant studies published from January 2015 to February 2024. The study was registered with the PROSPERO International Prospective Register of Systematic Reviews (protocol no. CRD42024485371). The Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool and the Must AI Criteria-10 (MAIC-10) checklist were used to assess quality and risk of bias. The meta-analysis included studies that reported the performance of DL models for breast cancer diagnosis, from which pooled summary estimates for the area under the curve (AUC), sensitivity, and specificity were calculated.
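The abstract does not state which statistical model produced the pooled estimates. As a minimal sketch only, assuming a univariate DerSimonian-Laird random-effects model on logit-transformed proportions (an assumption, not the authors' stated method; diagnostic accuracy reviews often use a bivariate model instead), pooling per-study sensitivities could look like the following. The function name `pool_logit_random_effects` and the study counts are hypothetical and are not taken from the review.

```python
import math


def pool_logit_random_effects(events, totals):
    """DerSimonian-Laird random-effects pooling of proportions on the logit scale.

    events[i] / totals[i] is one study's proportion, e.g. TP / (TP + FN)
    for sensitivity. Returns (pooled, ci_low, ci_high, tau2).
    """
    y, v = [], []
    for e, n in zip(events, totals):
        # A 0.5 continuity correction is applied to every study (a design
        # choice) so the logit stays finite when e == 0 or e == n.
        e_c, f_c = e + 0.5, (n - e) + 0.5
        y.append(math.log(e_c / f_c))      # logit of the corrected proportion
        v.append(1.0 / e_c + 1.0 / f_c)    # approximate variance of the logit

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # Method-of-moments estimate of the between-study variance tau^2.
    k = len(y)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    sw_re = sum(w_re)
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sw_re
    se = math.sqrt(1.0 / sw_re)

    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se), tau2


# Hypothetical true-positive counts and numbers of cancers for three studies.
hypothetical_tp = [45, 88, 30]
hypothetical_cancers = [50, 100, 36]
pooled, low, high, tau2 = pool_logit_random_effects(hypothetical_tp, hypothetical_cancers)
print(f"pooled sensitivity {pooled:.3f} (95% CI {low:.3f}, {high:.3f}), tau^2 {tau2:.3f}")
```

The same function applies to specificity by passing true-negative counts and the number of non-cancer cases. Working on the logit scale keeps the back-transformed confidence interval inside the range 0 to 1.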

Results: A total of 40 studies were included, of which only 21 were eligible for quantitative analysis. Convolutional neural networks (CNNs) were used in 62.5% (25/40) of the implemented models; the remaining 37.5% (15/40) were hybrid composite models (HCMs). The pooled estimates of AUC, sensitivity, and specificity were 0.90 (95% CI: 0.87, 0.93), 88% (95% CI: 86%, 91%), and 90% (95% CI: 87%, 93%), respectively.

Conclusions: DL models used for breast cancer diagnosis on MRI achieve high performance. However, there is considerable inherent variability in this analysis. Therefore, continuous evaluation and refinement of DL models are essential to ensure their practicality in the clinical setting.

Key points:

Question: Can DL models improve diagnostic accuracy in breast MRI, addressing challenges like overfitting and heterogeneity in study designs and imaging sequences?

Findings: DL achieved high diagnostic accuracy (AUC 0.90, sensitivity 88%, specificity 90%) in breast MRI, with training size significantly impacting performance metrics (p < 0.001).

Clinical relevance: DL models demonstrate high accuracy in breast cancer diagnosis using MRI, showing the potential to enhance diagnostic confidence and reduce radiologist workload, especially with larger datasets minimizing overfitting and improving clinical reliability.




Source journal: European Radiology (Medicine: Nuclear Medicine)

CiteScore: 11.60

Self-citation rate: 8.50%

Articles published: 874

Review time: 2-4 weeks

About the journal: European Radiology (ER) continuously updates scientific knowledge in radiology by publication of strong original articles and state-of-the-art reviews written by leading radiologists. A well-balanced combination of review articles, original papers, short communications from European radiological congresses and information on society matters makes ER an indispensable source for current information in this field. This is the Journal of the European Society of Radiology, and the official journal of a number of societies. From 2004 to 2008, supplements to European Radiology were published under its companion, European Radiology Supplements, ISSN 1613-3749.