Assessing the Performance of Models from the 2022 RSNA Cervical Spine Fracture Detection Competition at a Level I Trauma Center.

Radiology: Artificial Intelligence (IF 8.1; Q1, Computer Science, Artificial Intelligence). Pub Date: 2024-11-01. DOI: 10.1148/ryai.230550
Zixuan Hu, Markand Patel, Robyn L Ball, Hui Ming Lin, Luciano M Prevedello, Mitra Naseri, Shobhit Mathur, Robert Moreland, Jefferson Wilson, Christopher Witiw, Kristen W Yeom, Qishen Ha, Darragh Hanley, Selim Seferbekov, Hao Chen, Philipp Singer, Christof Henkel, Pascal Pfeiffer, Ian Pan, Harshit Sheoran, Wuqi Li, Adam E Flanders, Felipe C Kitamura, Tyler Richards, Jason Talbott, Ervin Sejdić, Errol Colak
Pages: e230550. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11605142/pdf/

Abstract

Purpose To evaluate the performance of the top models from the RSNA 2022 Cervical Spine Fracture Detection challenge on a clinical test dataset of both noncontrast and contrast-enhanced CT scans acquired at a level I trauma center.

Materials and Methods Seven top-performing models in the RSNA 2022 Cervical Spine Fracture Detection challenge were retrospectively evaluated on a clinical test set of 1828 CT scans (from 1829 series: 130 positive for fracture, 1699 negative for fracture; 1308 noncontrast, 521 contrast enhanced) from 1779 patients (mean age, 55.8 years ± 22.1 [SD]; 1154 [64.9%] male patients). Scans were acquired without exclusion criteria over 1 year (January-December 2022) from the emergency department of a neurosurgical and level I trauma center. Model performance was assessed using area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. False-positive and false-negative cases were further analyzed by a neuroradiologist.

Results Although all seven models showed decreased performance on the clinical test set compared with the challenge dataset, the models maintained high performance. On noncontrast CT scans, the models achieved a mean AUC of 0.89 (range: 0.79-0.92), sensitivity of 67.0% (range: 30.9%-80.0%), and specificity of 92.9% (range: 82.1%-99.0%). On contrast-enhanced CT scans, the models had a mean AUC of 0.88 (range: 0.76-0.94), sensitivity of 81.9% (range: 42.7%-100.0%), and specificity of 72.1% (range: 16.4%-92.8%). The models identified 10 fractures missed by radiologists. False-positive cases were more common in contrast-enhanced scans and, on noncontrast scans, were observed in patients with degenerative changes, while false-negative cases were often associated with degenerative changes and osteopenia.

Conclusion The winning models from the 2022 RSNA AI Challenge demonstrated high performance for cervical spine fracture detection on a clinical test dataset, warranting further evaluation for their use as clinical support tools.

Keywords: Feature Detection, Supervised Learning, Convolutional Neural Network (CNN), Genetic Algorithms, CT, Spine, Technology Assessment, Head/Neck

Supplemental material is available for this article. © RSNA, 2024. See also commentary by Levi and Politi in this issue.
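
The three metrics named in Materials and Methods can be illustrated with a minimal Python sketch. This is not the study's evaluation code: the labels, scores, function names, and the 0.5 decision threshold below are hypothetical toy values chosen only to show how AUC, sensitivity, and specificity are computed from per-scan model outputs. The abstract does not state which operating thresholds the authors used.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when score >= threshold means 'fracture'."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive scan outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = fracture present, 0 = no fracture.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUC={auc(labels, scores):.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

AUC is threshold-independent, whereas sensitivity and specificity depend on the chosen cutoff. That is one reason a model can keep a similar AUC across noncontrast and contrast-enhanced scans while its sensitivity and specificity shift substantially, as in the ranges reported above.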

Source journal
CiteScore: 16.20. Self-citation rate: 1.00%.
About the journal: Radiology: Artificial Intelligence is a bimonthly publication that focuses on the emerging applications of machine learning and artificial intelligence in the field of imaging across various disciplines. The journal is available online and accepts multiple manuscript types, including Original Research, Technical Developments, Data Resources, Review articles, Editorials, Letters to the Editor and Replies, Special Reports, and AI in Brief.