Strategies for integrating artificial intelligence into mammography screening programmes: a retrospective simulation analysis

Lancet Digital Health (IF 23.8; Zone 1, Medicine; JCR Q1, Medical Informatics) Pub Date: 2024-10-23 DOI: 10.1016/S2589-7500(24)00173-0

Zacharias V Fisches MSc, Michael Ball ScB, Trasias Mukama PhD, Vilim Štih PhD, Nicholas R Payne PhD, Sarah E Hickman PhD, Prof Fiona J Gilbert PhD, Stefan Bunk MSc, Christian Leibig PhD

Lancet Digital Health, volume 6, issue 11, pages e803-e814. Journal Article. https://www.sciencedirect.com/science/article/pii/S2589750024001730

Abstract

Background

Integrating artificial intelligence (AI) into mammography screening can support radiologists and improve programme metrics, yet the potential of different strategies for integrating the technology remains understudied. We compared programme-level performance metrics of seven AI integration strategies.

Methods

We performed a retrospective comparative evaluation of seven strategies for integrating AI into mammography screening using datasets generated from screening programmes in Germany (n=1 657 068), the UK (n=223 603) and Sweden (n=22 779). The commercially available AI model used was Vara version 2.10, trained from scratch on German data. We simulated the performance of each strategy in terms of cancer detection rate (CDR), recall rate, and workload reduction, and compared the metrics with those of the screening programmes. We also assessed the distribution of the stages and grades of the cancers detected by each strategy and the AI model's ability to correctly localise those cancers.
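The three programme-level metrics named above can be illustrated with a minimal Python sketch. It assumes CDR is expressed per 1000 examinations and recall rate per 100 examinations, as in the Findings. The definition of workload reduction as the share of radiologist reads avoided relative to the original programme is an assumption for illustration, since the abstract does not define it. The class name, field names, and counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Hypothetical aggregate counts for one simulated strategy."""
    examinations: int          # screening examinations in the dataset
    cancers_detected: int      # cancers detected under the strategy
    recalls: int               # examinations recalled for further assessment
    human_reads: int           # radiologist reads required under the strategy
    baseline_human_reads: int  # radiologist reads in the original programme


def programme_metrics(c: ScreeningCounts) -> dict[str, float]:
    """Return CDR per 1000, recall rate per 100, and workload reduction in %."""
    return {
        "cdr_per_1000": 1000 * c.cancers_detected / c.examinations,
        "recall_per_100": 100 * c.recalls / c.examinations,
        # Assumed definition: share of baseline radiologist reads avoided.
        "workload_reduction_pct": 100 * (1 - c.human_reads / c.baseline_human_reads),
    }


# Illustrative numbers only: double reading at baseline (2 reads per
# examination) with one reader replaced by AI (1 read per examination).
example = ScreeningCounts(
    examinations=100_000,
    cancers_detected=650,
    recalls=4_000,
    human_reads=100_000,
    baseline_human_reads=200_000,
)
print(programme_metrics(example))
```

Comparing such a dictionary for each strategy against the same metrics for the original programme mirrors the comparison described in the Methods, although the study's actual simulation details are not given in the abstract.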

Findings

Compared with the German screening programme (CDR 6·32 per 1000 examinations, recall rate 4·11 per 100 examinations), replacement of both readers (standalone AI strategy) achieved a non-inferior CDR of 6·37 (95% CI 6·10–6·64) at a recall rate of 3·80 (95% CI 3·67–3·93), whereas single reader replacement achieved a CDR of 6·49 (6·31–6·67), a recall rate of 4·01 (3·92–4·10), and a 49% workload reduction. Programme-level decision referral achieved a CDR of 6·85 (6·61–7·11), a recall rate of 3·55 (3·43–3·68), and an 84% workload reduction. Compared with the UK programme CDR of 8·19, the reader-level, programme-level, and deferral to single reader strategies achieved CDRs of 8·24 (7·82–8·71), 8·59 (8·12–9·06), and 8·28 (7·86–8·71), without increasing recall and while reducing workload by 37%, 81%, and 95%, respectively. On the Swedish dataset, programme-level decision referral increased the CDR by 17·7% without increasing recall and while reducing reading workload by 92%.
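To put the German results on a common scale, the relative change of each strategy against the programme baseline can be computed from the point estimates quoted above. The sketch below is illustrative arithmetic only: the helper name `relative_change_pct` is hypothetical, the percentages are derived here rather than reported in the abstract, and the calculation ignores the confidence intervals.

```python
def relative_change_pct(new: float, baseline: float) -> float:
    """Relative change of `new` against `baseline`, in percent."""
    return 100 * (new - baseline) / baseline


# Point estimates from the Findings (German dataset).
GERMAN_BASELINE = {"cdr": 6.32, "recall": 4.11}
GERMAN_STRATEGIES = {
    "standalone AI": {"cdr": 6.37, "recall": 3.80},
    "single reader replacement": {"cdr": 6.49, "recall": 4.01},
    "programme-level decision referral": {"cdr": 6.85, "recall": 3.55},
}

for name, metrics in GERMAN_STRATEGIES.items():
    cdr_change = relative_change_pct(metrics["cdr"], GERMAN_BASELINE["cdr"])
    recall_change = relative_change_pct(metrics["recall"], GERMAN_BASELINE["recall"])
    print(f"{name}: CDR {cdr_change:+.1f}%, recall {recall_change:+.1f}%")
```

For programme-level decision referral this gives a CDR about 8.4% higher and a recall rate about 13.6% lower than the German programme, on the same relative scale as the 17·7% CDR increase reported for the Swedish dataset.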

Interpretation

The decision referral strategies offered the largest improvements in cancer detection rates and reduction in recall rates, and all strategies except normal triaging showed potential to improve screening metrics.

Funding

Vara.


Source journal: Lancet Digital Health Multiple-

CiteScore: 41.20

Self-citation rate: 1.60%

Publications: 232

Review time: 13 weeks

About the journal: The Lancet Digital Health publishes important, innovative, and practice-changing research on any topic connected with digital technology in clinical medicine, public health, and global health. The journal's open access content crosses subject boundaries, building bridges between health professionals and researchers. By bringing together the most important advances in this multidisciplinary field, The Lancet Digital Health is the most prominent publishing venue in digital health. We publish a range of content types including Articles, Review, Comment, and Correspondence, contributing to promoting digital technologies in health practice worldwide.