Software using artificial intelligence for nodule and cancer detection in CT lung cancer screening: systematic review of test accuracy studies

Thorax (IF 9.0; Zone 1, Medicine; JCR Q1, Respiratory System) Pub Date: 2024-09-25 DOI: 10.1136/thorax-2024-221662

Julia Geppert, Asra Asgharzadeh, Anna Brown, Chris Stinton, Emma J Helm, Surangi Jayakody, Daniel Todkill, Daniel Gallacher, Hesam Ghiasvand, Mubarak Patel, Peter Auguste, Alexander Tsertsvadze, Yen-Fu Chen, Amy Grove, Bethany Shinkins, Aileen Clarke, Sian Taylor-Phillips



Abstract

Objectives To examine the accuracy and impact of artificial intelligence (AI) software assistance in lung cancer screening using CT.

Methods A systematic review of CE-marked, AI-based software for automated detection and analysis of nodules in CT lung cancer screening was conducted. Multiple databases including Medline, Embase and Cochrane CENTRAL were searched from 2012 to March 2023. Primary research reporting test accuracy or impact on reading time or clinical management was included. QUADAS-2 and QUADAS-C were used to assess risk of bias. We undertook narrative synthesis.

Results Eleven studies evaluating six different AI-based software and reporting on 19 770 patients were eligible. All were at high risk of bias with multiple applicability concerns. Compared with unaided reading, AI-assisted reading was faster and generally improved sensitivity (+5% to +20% for detecting/categorising actionable nodules; +3% to +15% for detecting/categorising malignant nodules), with lower specificity (−7% to −3% for correctly detecting/categorising people without actionable nodules; −8% to −6% for correctly detecting/categorising people without malignant nodules). AI assistance tended to increase the proportion of nodules allocated to higher risk categories. Assuming 0.5% cancer prevalence, these results would translate into additional 150–750 cancers detected per million people attending screening but lead to an additional 59 700 to 79 600 people attending screening without cancer receiving unnecessary CT surveillance.

Conclusions AI assistance in lung cancer screening may improve sensitivity but increases the number of false-positive results and unnecessary surveillance. Future research needs to increase the specificity of AI-assisted reading and minimise risk of bias and applicability concerns through improved study design.

PROSPERO registration number CRD42021298449. All data relevant to the study are included in the article or uploaded as supplementary information.
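The per-million figures in the Results can be reproduced with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes the reported changes for malignant nodules (+3% to +15% sensitivity, −8% to −6% specificity) are absolute percentage-point differences applied to a cohort of one million screened people at 0.5% prevalence. The function name `screening_impact` and the pairing of the lower and upper bounds are hypothetical choices made for this example.

```python
from fractions import Fraction


def screening_impact(population, prevalence, sens_gain, spec_loss):
    """Return (extra cancers detected, extra false positives).

    Assumes sens_gain and spec_loss are absolute changes in sensitivity
    and specificity, expressed as fractions of 1 (an assumption of this
    sketch, not a method stated in the abstract).
    """
    with_cancer = population * prevalence
    without_cancer = population - with_cancer
    extra_cancers = with_cancer * sens_gain
    extra_false_positives = without_cancer * spec_loss
    return int(extra_cancers), int(extra_false_positives)


POPULATION = 1_000_000            # people attending screening
PREVALENCE = Fraction(5, 1000)    # 0.5% cancer prevalence

# Lower bounds: +3% sensitivity, 6-point drop in specificity.
low = screening_impact(POPULATION, PREVALENCE, Fraction(3, 100), Fraction(6, 100))
# Upper bounds: +15% sensitivity, 8-point drop in specificity.
high = screening_impact(POPULATION, PREVALENCE, Fraction(15, 100), Fraction(8, 100))

print(low)   # (150, 59700)
print(high)  # (750, 79600)
```

Exact fractions are used instead of floats so the results match the abstract's 150–750 additional cancers and 59 700 to 79 600 additional people under surveillance without rounding noise.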




Source journal

Thorax (Medicine, Respiratory System)

CiteScore: 16.10

Self-citation rate: 2.00%

Articles published: 197

Review time: 1 month

About the journal: Thorax stands as one of the premier respiratory medicine journals globally, featuring clinical and experimental research articles spanning respiratory medicine, pediatrics, immunology, pharmacology, pathology, and surgery. The journal's mission is to publish noteworthy advancements in scientific understanding that are poised to influence clinical practice significantly. This encompasses articles delving into basic and translational mechanisms applicable to clinical material, covering areas such as cell and molecular biology, genetics, epidemiology, and immunology.