Machine Learning to Detect Cervical Spine Fractures Missed by Radiologists on CT: Analysis Using Seven Award-Winning Models From the 2022 RSNA Cervical Spine Fracture AI Challenge.
Yingming Amy Chen, Zixuan Hu, Kevin D Shek, Jefferson Wilson, Fahad Saud S Alotaibi, Christopher D Witiw, Hui Ming Lin, Robyn L Ball, Markand Patel, Shobhit Mathur, Ervin Sejdić, Errol Colak
American Journal of Roentgenology, pp. 1-9. Published 2025-03-19. DOI: 10.2214/AJR.24.32076 (https://doi.org/10.2214/AJR.24.32076)
Citations: 0
Abstract
BACKGROUND. Available data on radiologists' missed cervical spine fractures are based primarily on studies using human reviewers to identify errors on reevaluation; such studies do not capture the full extent of missed fractures.

OBJECTIVE. The purpose of this study was to use machine learning (ML) models to identify cervical spine fractures on CT missed by interpreting radiologists, characterize the nature of these fractures, and assess their clinical significance.

METHODS. This retrospective study included all cervical spine CT examinations performed in adult patients in the emergency department between January 1, 2018, and December 31, 2022. Examinations reported as negative for cervical spine fracture were processed by seven award-winning ML models from the 2022 Radiological Society of North America Cervical Spine Fracture AI Challenge; examinations classified as positive by at least four of the seven models were considered to have ML-detected fractures. Two neuroradiologists independently reviewed examinations with ML-detected fractures using ML-derived heat maps to identify those representing true missed fractures. The neuroradiologists further assessed the fractures' extent. Two spine surgeons independently assessed whether missed fractures were clinically significant (i.e., warranting at least one of surgical consultation, MRI, CTA, or collar immobilization).

RESULTS. The study included 6671 patients (2414 women, 4257 men; mean age, 54.6 ± 22.1 [SD] years) who underwent a total of 6979 cervical spine CT examinations. Interpreting radiologists reported 6378 examinations as negative for fracture. Of these, 356 had ML-detected fractures (i.e., positive by at least four of seven models). The neuroradiologists classified 40 of these examinations, in 39 unique patients, as having true fractures. ML-detected missed true fractures involved 51 unique sites, most commonly the C7 transverse process (n = 12), C5 spinous process (n = 12), and C6 spinous process (n = 8). The surgeons considered missed fractures clinically significant in 15 of 40 examinations (MRI and collar immobilization [n = 7], MRI and surgical evaluation [n = 1], CTA [n = 9]). Interobserver agreement, expressed as kappa, was 0.88 between neuroradiologists for true fracture classification and 0.94 between surgeons for clinical significance classification.

CONCLUSION. ML models identified cervical spine fractures missed by radiologists. These fractures were further characterized to systematically highlight radiologists' common misses.

CLINICAL IMPACT. This ML-based framework can be applied in quality improvement efforts, to help refine radiologists' search patterns based on prone-to-miss findings.
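The two quantitative rules in the abstract, the "at least four of seven models" voting threshold and the kappa agreement statistic, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names `ml_detected` and `cohen_kappa` are hypothetical, each model's output is assumed to be already reduced to a yes/no call per examination, and the two-rater Cohen formulation of kappa is an assumption, since the abstract does not name the variant used.

```python
from typing import Sequence


def ml_detected(model_votes: Sequence[bool], min_positive: int = 4) -> bool:
    """Flag an examination when enough models call it positive.

    `model_votes` holds one boolean per model (hypothetical input format).
    The default threshold of 4 mirrors the abstract's "at least four of
    the seven models" rule.
    """
    return sum(bool(v) for v in model_votes) >= min_positive


def cohen_kappa(rater_a: Sequence[bool], rater_b: Sequence[bool]) -> float:
    """Cohen's kappa for two raters making binary calls on the same cases.

    Assumption: the abstract reports "kappa" without naming the variant;
    Cohen's kappa is used here only as the common two-rater form.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal positive rate.
    p_a = sum(bool(a) for a in rater_a) / n
    p_b = sum(bool(b) for b in rater_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both raters gave the same constant label; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Toy votes for one examination (illustrative, not study data).
    votes = [True, True, False, True, True, False, False]
    print("ML-detected:", ml_detected(votes))

    # Toy reviewer calls on four examinations (illustrative, not study data).
    reviewer_1 = [True, True, False, False]
    reviewer_2 = [True, False, False, False]
    print("kappa:", cohen_kappa(reviewer_1, reviewer_2))
```

Applied to the counts reported above, the voting rule flagged 356 of 6378 negative-reported examinations (roughly 5.6%), and 40 of those 356 (roughly 11.2%) were confirmed as true missed fractures on neuroradiologist review.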
About the journal:
Founded in 1907, the monthly American Journal of Roentgenology (AJR) is the world’s longest continuously published general radiology journal. AJR is recognized as among the specialty’s leading peer-reviewed journals and has a worldwide circulation of close to 25,000. The journal publishes clinically-oriented articles across all radiology subspecialties, seeking relevance to radiologists’ daily practice. The journal publishes hundreds of articles annually with a diverse range of formats, including original research, reviews, clinical perspectives, editorials, and other short reports. The journal engages its audience through a spectrum of social media and digital communication activities.