Machine Learning to Detect Cervical Spine Fractures Missed by Radiologists on CT: Analysis Using Seven Award-Winning Models From the 2022 RSNA Cervical Spine Fracture AI Challenge.

American Journal of Roentgenology | Pub Date: 2025-03-19 | DOI: 10.2214/AJR.24.32076 | IF 4.7 | JCR Q1 (Radiology, Nuclear Medicine & Medical Imaging) | Region 2 (Medicine)

Yingming Amy Chen, Zixuan Hu, Kevin D Shek, Jefferson Wilson, Fahad Saud S Alotaibi, Christopher D Witiw, Hui Ming Lin, Robyn L Ball, Markand Patel, Shobhit Mathur, Ervin Sejdić, Errol Colak



Abstract

BACKGROUND. Available data on radiologists' missed cervical spine fractures are based primarily on studies using human reviewers to identify errors on reevaluation; such studies do not capture the full extent of missed fractures.

OBJECTIVE. The purpose of this study was to use machine learning (ML) models to identify cervical spine fractures on CT missed by interpreting radiologists, characterize the nature of these fractures, and assess their clinical significance.

METHODS. This retrospective study included all cervical spine CT examinations performed in adult patients in the emergency department between January 1, 2018, and December 31, 2022. Examinations reported as negative for cervical spine fracture were processed by seven award-winning ML models from the 2022 Radiological Society of North America Cervical Spine Fracture AI Challenge; examinations classified as positive by at least four of the seven models were considered to have ML-detected fractures. Two neuroradiologists independently reviewed examinations with ML-detected fractures using ML-derived heat maps to identify those representing true missed fractures. The neuroradiologists further assessed the fractures' extent. Two spine surgeons independently assessed whether missed fractures were clinically significant (i.e., warranting at least one of surgical consultation, MRI, CTA, or collar immobilization).

RESULTS. The study included 6671 patients (2414 women, 4257 men; mean age, 54.6 ± 22.1 [SD] years) who underwent a total of 6979 cervical spine CT examinations. Interpreting radiologists reported 6378 examinations as negative for fracture. Of these, 356 had ML-detected fractures (i.e., positive by at least four of seven models). The neuroradiologists classified 40 of these examinations, in 39 unique patients, as having true fractures. ML-detected missed true fractures involved 51 unique sites, most commonly the C7 transverse process (n = 12), C5 spinous process (n = 12), and C6 spinous process (n = 8). The surgeons considered missed fractures clinically significant in 15 of 40 examinations (MRI and collar immobilization [n = 7], MRI and surgical evaluation [n = 1], CTA [n = 9]). Interobserver agreement, expressed as kappa, was 0.88 between neuroradiologists for true fracture classification and 0.94 between surgeons for clinical significance classification.

CONCLUSION. ML models identified cervical spine fractures missed by radiologists. These fractures were further characterized to systematically highlight radiologists' common misses.

CLINICAL IMPACT. This ML-based framework can be applied in quality improvement efforts, to help refine radiologists' search patterns based on prone-to-miss findings.
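The consensus rule described in METHODS, and the proportions implied by the counts in RESULTS, can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: it assumes each model's output has already been reduced to a binary examination-level call, and the names `ml_detected` and `percent` are hypothetical. All counts are taken from the abstract.

```python
from typing import Sequence

N_MODELS = 7        # seven challenge models, per the abstract
MIN_POSITIVE = 4    # "positive by at least four of the seven models"


def ml_detected(model_calls: Sequence[bool], min_positive: int = MIN_POSITIVE) -> bool:
    """Return True when enough models call the examination positive.

    Assumption: each entry is one model's binary examination-level call.
    How each model's raw output is thresholded is not stated in the abstract.
    """
    if len(model_calls) != N_MODELS:
        raise ValueError(f"expected {N_MODELS} model calls, got {len(model_calls)}")
    return sum(bool(call) for call in model_calls) >= min_positive


def percent(part: int, whole: int) -> float:
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


# Counts reported in the abstract.
NEGATIVE_REPORTS = 6378   # examinations reported as negative by radiologists
ML_FLAGGED = 356          # of those, flagged by at least 4 of 7 models
TRUE_MISSED = 40          # confirmed as true fractures by neuroradiologists
SIGNIFICANT = 15          # judged clinically significant by spine surgeons

if __name__ == "__main__":
    print(f"Flagged among negative reports: {percent(ML_FLAGGED, NEGATIVE_REPORTS):.1f}%")
    print(f"Confirmed among flagged:        {percent(TRUE_MISSED, ML_FLAGGED):.1f}%")
    print(f"Confirmed among negative:       {percent(TRUE_MISSED, NEGATIVE_REPORTS):.2f}%")
    print(f"Significant among confirmed:    {percent(SIGNIFICANT, TRUE_MISSED):.1f}%")
```

Because only ML-flagged examinations were re-reviewed, the proportion of confirmed fractures among negative reports counts only the misses found by this process; it does not estimate the overall miss rate.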




Source journal: American Journal of Roentgenology (Medicine: Nuclear Medicine)
CiteScore: 12.80
Self-citation rate: 4.00%
Articles published: 920
Review time: 3 months

Journal description: Founded in 1907, the monthly American Journal of Roentgenology (AJR) is the world’s longest continuously published general radiology journal. AJR is recognized as among the specialty’s leading peer-reviewed journals and has a worldwide circulation of close to 25,000. The journal publishes clinically oriented articles across all radiology subspecialties, seeking relevance to radiologists’ daily practice. The journal publishes hundreds of articles annually in a diverse range of formats, including original research, reviews, clinical perspectives, editorials, and other short reports. The journal engages its audience through a spectrum of social media and digital communication activities.