Prospective Multi-Site Validation of AI to Detect Tuberculosis and Chest X-Ray Abnormalities.

NEJM AI. Pub Date: 2024-10-01. Epub Date: 2024-09-26. DOI: 10.1056/aioa2400018
Sahar Kazemzadeh, Atilla P Kiraly, Zaid Nabulsi, Nsala Sanjase, Minyoi Maimbolwa, Brian Shuma, Shahar Jamshy, Christina Chen, Arnav Agharwal, Charles T Lau, Andrew Sellergren, Daniel Golden, Jin Yu, Eric Wu, Yossi Matias, Katherine Chou, Greg S Corrado, Shravya Shetty, Daniel Tse, Krish Eswaran, Yun Liu, Rory Pilgrim, Monde Muyoyeta, Shruthi Prabhakara
NEJM AI, volume 1, issue 10. Journal article. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11737584/pdf/ (PMC); https://doi.org/10.1056/aioa2400018 (publisher)

Abstract

Background: Using artificial intelligence (AI) to interpret chest X-rays (CXRs) could support accessible triage tests for active pulmonary tuberculosis (TB) in resource-constrained settings.

Methods: We evaluated the performance of two cloud-based CXR AI systems, one to detect TB and the other to detect CXR abnormalities, in a population with a high TB and human immunodeficiency virus (HIV) burden. We recruited 1978 adults at three clinical sites who had TB symptoms, were close contacts of known TB patients, or were newly diagnosed with HIV. The TB-detecting AI (TB AI) scores were converted to binary results using two thresholds: a high-sensitivity threshold and an exploratory "balanced" threshold designed to resemble radiologist performance. Ten radiologists reviewed the images for signs of TB, blinded to the reference standard. The primary analysis tested the TB AI for noninferiority to radiologist performance. The secondary analysis compared the TB AI with the World Health Organization (WHO) targets (90% sensitivity, 70% specificity). Both analyses used an absolute margin of 5%. The abnormality-detecting AI (abnormality AI) was evaluated for noninferiority to a high-sensitivity target suitable for triaging (90% sensitivity, 50% specificity).
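
The thresholding and noninferiority steps described in the Methods can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the scores, labels, threshold values (0.3 and 0.5), and counts are hypothetical, and the one-sided exact binomial test against a fixed target stands in for whatever statistical procedure the authors actually used, which the abstract does not specify. It also does not cover the multi-reader comparison with radiologists.

```python
from math import comb


def binarize(scores, threshold):
    """Call a case positive when its score is at or above the threshold."""
    return [score >= threshold for score in scores]


def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) for paired boolean calls and labels."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    fn = sum(1 for p, r in pairs if not p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    return tp / (tp + fn), tn / (tn + fp)


def noninferiority_p_value(successes, total, target, margin=0.05):
    """One-sided exact binomial p-value for H0: true rate <= target - margin.

    A small p-value supports the claim that the rate exceeds target - margin.
    """
    null_rate = target - margin
    return sum(
        comb(total, k) * null_rate**k * (1 - null_rate) ** (total - k)
        for k in range(successes, total + 1)
    )


# Hypothetical scores and reference labels, not study data.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1, 0.4, 0.05]
has_tb = [True, True, True, False, False, False, False, False]

# Hypothetical threshold values for the two operating points.
for name, threshold in [("high-sensitivity", 0.3), ("balanced", 0.5)]:
    sens, spec = sensitivity_specificity(binarize(scores, threshold), has_tb)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")

# Hypothetical count: 95 of 100 positives detected, tested against a 90% target.
print(noninferiority_p_value(95, 100, target=0.90))
```

Lowering the threshold trades specificity for sensitivity, which is why a single AI score can be reported at both a high-sensitivity and a balanced operating point.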

Results: Of the 1910 patients analyzed, 1827 (96%) had conclusive TB status; of these, 649 (36%) were HIV positive and 192 (11%) were TB positive. The TB AI's sensitivity and specificity were 87% and 70%, respectively, at the high-sensitivity threshold and 78% and 82%, respectively, at the balanced threshold. Radiologists' mean sensitivity was 76% and mean specificity was 82%. At the high-sensitivity threshold, the TB AI was noninferior to average radiologist sensitivity (P<0.001) but not to average radiologist specificity (P=0.99), and it met the WHO target for specificity but not for sensitivity. At the balanced threshold, the TB AI was comparable to radiologists. The abnormality AI's sensitivity and specificity were 97% and 79%, respectively, with both meeting the prespecified targets.
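
The cohort percentages in the Results can be checked with a short Python sketch. The counts are taken from the abstract; the helper name `percent` is hypothetical. The check assumes that the HIV and TB percentages use the 1827 patients with conclusive TB status as the denominator, which is the reading consistent with the reported 36% and 11% (dividing by 1910 would round to 34% and 10%).

```python
def percent(part, whole):
    """Percentage rounded to the nearest whole number, as in the abstract."""
    return round(100 * part / whole)


# Counts reported in the abstract.
recruited = 1978
analyzed = 1910
conclusive = 1827
hiv_positive = 649
tb_positive = 192

print(percent(conclusive, analyzed))      # share of analyzed patients with conclusive TB status
print(percent(hiv_positive, conclusive))  # HIV-positive share of the conclusive group
print(percent(tb_positive, conclusive))   # TB-positive share of the conclusive group

# Derived counts implied by the reported numbers (simple subtraction).
not_analyzed = recruited - analyzed
inconclusive = analyzed - conclusive
print(not_analyzed, inconclusive)
```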

Conclusions: The CXR TB AI was noninferior to radiologists for active pulmonary TB triaging in a population with a high TB and HIV burden. Neither the TB AI nor the radiologists met WHO recommendations for sensitivity in the study population. AI can also be used to detect other CXR abnormalities in the same population.
