Strategies for integrating artificial intelligence into mammography screening programmes: a retrospective simulation analysis
Zacharias V Fisches MSc, Michael Ball ScB, Trasias Mukama PhD, Vilim Štih PhD, Nicholas R Payne PhD, Sarah E Hickman PhD, Prof Fiona J Gilbert PhD, Stefan Bunk MSc, Christian Leibig PhD
Lancet Digital Health, volume 6, issue 11, pages e803-e814. Published 2024-10-23.
DOI: 10.1016/S2589-7500(24)00173-0
https://www.sciencedirect.com/science/article/pii/S2589750024001730
Funding: Vara.
Abstract
Background
Integrating artificial intelligence (AI) into mammography screening can support radiologists and improve programme metrics, yet the potential of different strategies for integrating the technology remains understudied. We compared programme-level performance metrics of seven AI integration strategies.
Methods
We performed a retrospective comparative evaluation of seven strategies for integrating AI into mammography screening using datasets generated from screening programmes in Germany (n=1 657 068), the UK (n=223 603) and Sweden (n=22 779). The commercially available AI model used was Vara version 2.10, trained from scratch on German data. We simulated the performance of each strategy in terms of cancer detection rate (CDR), recall rate, and workload reduction, and compared the metrics with those of the screening programmes. We also assessed the distribution of the stages and grades of the cancers detected by each strategy and the AI model's ability to correctly localise those cancers.
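The three programme metrics named above can be written out as simple ratios. The following is a minimal sketch, not the authors' simulation code: the function and field names are hypothetical, the counts are illustrative and not taken from the study, and workload reduction is assumed here to mean the share of baseline human reads that are no longer needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningMetrics:
    cdr_per_1000: float            # cancers detected per 1000 examinations
    recall_per_100: float          # recalls per 100 examinations
    workload_reduction_pct: float  # % of baseline human reads avoided (assumed definition)


def screening_metrics(n_exams: int, n_cancers_detected: int, n_recalled: int,
                      human_reads: int, baseline_reads: int) -> ScreeningMetrics:
    """Compute programme-level metrics from raw counts (hypothetical helper)."""
    if n_exams <= 0 or baseline_reads <= 0:
        raise ValueError("n_exams and baseline_reads must be positive")
    return ScreeningMetrics(
        cdr_per_1000=1000 * n_cancers_detected / n_exams,
        recall_per_100=100 * n_recalled / n_exams,
        workload_reduction_pct=100 * (1 - human_reads / baseline_reads),
    )


# Illustrative counts only (NOT from the study): 10 000 examinations,
# double reading at baseline (20 000 reads), one reader replaced.
example = screening_metrics(
    n_exams=10_000,
    n_cancers_detected=65,
    n_recalled=400,
    human_reads=10_000,
    baseline_reads=20_000,
)
print(example)
```

Expressing CDR per 1000 and recall per 100 examinations matches the units used in the Findings, so a simulated strategy and the screening programme can be compared on the same scale.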
Findings
Compared with the German screening programme (CDR 6·32 per 1000 examinations, recall rate 4·11 per 100 examinations), replacement of both readers (standalone AI strategy) achieved a non-inferior CDR of 6·37 (95% CI 6·10–6·64) at a recall rate of 3·80 (95% CI 3·67–3·93), whereas single reader replacement achieved a CDR of 6·49 (6·31–6·67), a recall rate of 4·01 (3·92–4·10), and a 49% workload reduction. Programme-level decision referral achieved a CDR of 6·85 (6·61–7·11), a recall rate of 3·55 (3·43–3·68), and an 84% workload reduction. Compared with the UK programme CDR of 8·19, the reader-level, programme-level, and deferral to single reader strategies achieved CDRs of 8·24 (7·82–8·71), 8·59 (8·12–9·06), and 8·28 (7·86–8·71), without increasing recall and while reducing workload by 37%, 81%, and 95%, respectively. On the Swedish dataset, programme-level decision referral increased the CDR by 17·7% without increasing recall and while reducing reading workload by 92%.
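To put the reported point estimates on a common footing, the relative change of a strategy against the programme baseline can be computed directly. This is a rough sketch using only the German figures quoted above; the helper name is hypothetical, the calculation ignores the confidence intervals, and it assumes the 17·7% Swedish figure is a relative change of the same kind.

```python
def relative_change_pct(strategy_value: float, baseline_value: float) -> float:
    """Relative change of a strategy metric versus the programme baseline, in %."""
    return 100 * (strategy_value - baseline_value) / baseline_value


# Point estimates quoted in the Findings for the German dataset.
german_programme = {"cdr": 6.32, "recall": 4.11}
programme_level_referral = {"cdr": 6.85, "recall": 3.55}

cdr_change = relative_change_pct(programme_level_referral["cdr"], german_programme["cdr"])
recall_change = relative_change_pct(programme_level_referral["recall"], german_programme["recall"])

print(f"CDR change: {cdr_change:+.1f}%")
print(f"Recall rate change: {recall_change:+.1f}%")
```

A positive CDR change together with a negative recall change corresponds to the pattern the Findings describe for programme-level decision referral: more cancers detected with fewer recalls.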
Interpretation
The decision referral strategies offered the largest improvements in cancer detection rates and reduction in recall rates, and all strategies except normal triaging showed potential to improve screening metrics.
About the journal:
The Lancet Digital Health publishes important, innovative, and practice-changing research on any topic connected with digital technology in clinical medicine, public health, and global health.
The journal’s open access content crosses subject boundaries, building bridges between health professionals and researchers. By bringing together the most important advances in this multidisciplinary field, The Lancet Digital Health is the most prominent publishing venue in digital health.
We publish a range of content types, including Articles, Review, Comment, and Correspondence, contributing to the promotion of digital technologies in health practice worldwide.