Using deep learning to enhance reporting efficiency and accuracy in degenerative cervical spine MRI
Aric Lee MBBS, MMed, FRCR; Junran Wu BSc; Changshuo Liu BSc; Andrew Makmur MBBS, BmedSc, MMed, FRCR; Yong Han Ting MBBS, FRCR; Shannon Lee MBBS; Matthew Ding Zhou Chan MBBS; Desmond Shi Wei Lim MBBS, MMed, FRCR; Vanessa Mei Hui Khoo MBBS, MMed, FRCR; Jonathan Sng MBBS, MMed, FRCR; Han Yang Ong MBBS, MMed, FRCR; Amos Tan MBBS, MMed, FRCR; Shuliang Ge MBBS, MMed, FRCR; Faimee Erwan Muhamat Nor MB BCh BAO (Hons), FRCR, MMed; Yi Ting Lim MBBS, MMed, FRCR; Joey Chan Yiing Beh MBBS, FRCR; Qai Ven Yap BSc; Jiong Hao Tan MBBS, MRCS, MMed, FRCS; Naresh Kumar MBBS, MS Orth, DNB Orth, FRCS Ed, FRCS, DM; Beng Chin Ooi PhD; James Thomas Patrick Decourcy Hallinan MBChB, BSc, FRCR
Spine Journal, Volume 25, Issue 9, Pages 1942-1950. Published 2025-03-26. DOI: 10.1016/j.spinee.2025.03.009
https://www.sciencedirect.com/science/article/pii/S1529943025001573
Abstract
BACKGROUND CONTEXT
Cervical spine MRI is essential for evaluating degenerative cervical spondylosis (DCS) but is time-consuming to report and subject to interobserver variability. The integration of artificial intelligence in medical imaging offers potential solutions to enhance productivity and diagnostic consistency.
PURPOSE
To assess whether a transformer-based deep learning model (DLM) can improve the efficiency and accuracy of radiologists in reporting DCS MRIs.
STUDY DESIGN/SETTING
Retrospective study using external DCS MRIs from December 2015 to August 2018.
PATIENT SAMPLE
The test dataset comprised 50 preoperative DCS MRIs (2,555 images) from 50 patients (mean age 60 years ± 14 [SD]; 13 women [26%]), excluding cases with instrumentation.
OUTCOME MEASURES
Primary outcomes were interpretation time and interobserver agreement (Gwet's kappa) among radiologists grading spinal canal and neural foramina stenosis with and without DLM-assistance.
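As an illustration of the agreement statistic named above, the following is a minimal sketch of Gwet's agreement coefficient for two raters. It assumes the unweighted AC1 variant; the abstract does not state which variant or which software the authors used, and the function name `gwet_ac1` and the sample grades are hypothetical.

```python
from collections import Counter


def gwet_ac1(ratings_a, ratings_b, categories):
    """Unweighted Gwet's AC1 for two raters (illustrative sketch only).

    ratings_a, ratings_b: equal-length sequences of category labels.
    categories: all possible labels, e.g. [0, 1, 2, 3] for canal grades.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    if len(categories) < 2:
        raise ValueError("at least two categories are required")

    n = len(ratings_a)
    q = len(categories)

    # Observed agreement: share of items given the same grade by both raters.
    p_a = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Mean marginal proportion of each category across the two raters.
    counts = Counter(ratings_a) + Counter(ratings_b)
    pi = {k: counts[k] / (2 * n) for k in categories}

    # Chance agreement under Gwet's model.
    p_e = sum(pi[k] * (1 - pi[k]) for k in categories) / (q - 1)

    return (p_a - p_e) / (1 - p_e)


# Hypothetical dichotomous grades for four disc levels from two readers.
reader_1 = [0, 0, 1, 1]
reader_2 = [0, 1, 1, 1]
print(round(gwet_ac1(reader_1, reader_2, [0, 1]), 3))  # prints 0.529
```

Unlike Cohen's kappa, this coefficient estimates chance agreement from the average category prevalence, which makes it less sensitive to skewed grade distributions.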
METHODS
A transformer-based DLM was used to classify spinal canal (grades 0/1/2/3) and neural foramina (grades 0/1/2) stenosis at each disc level. Two experienced musculoskeletal radiologists (both with 12 years of experience) provided reference standard labels in consensus. Ten radiologists (0–7 years of experience) graded DCS MRIs with and without DLM-assistance, with a 1-month washout period between sessions to minimize recall bias. Interpretation time and interobserver agreement were assessed.
RESULTS
DLM-assistance significantly reduced interpretation time by 69 to 308 s (p<.001), lowering the mean time from 159–490 s (SD 27–649) to 90–182 s (SD 42–218). Radiology residents experienced the largest time savings. DLM-assistance improved interobserver agreement across all stenosis gradings compared to baseline. For dichotomous spinal canal grading, residents had the largest improvement in agreement (κ = 0.63 to 0.77, p<.001). Conversely, for dichotomous neural foramina grading, musculoskeletal radiologists had the largest improvement (κ = 0.60 to 0.72, p<.001). Notably, the performance of the DLM alone was equivalent or superior to that of all readers.
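To make the idea of dichotomous grading concrete, the sketch below collapses ordinal stenosis grades into two classes and compares a reader against a reference. The cut-off of grade 2 and the sample grades are assumptions for illustration only; the abstract does not state where the authors placed the threshold, and raw percent agreement stands in here for the Gwet's kappa used in the study.

```python
def dichotomize(grades, cutoff):
    """Map ordinal grades to 0/1; `cutoff` is a hypothetical threshold."""
    return [int(g >= cutoff) for g in grades]


def percent_agreement(a, b):
    """Share of items on which two gradings match exactly."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("gradings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical spinal canal grades (0-3) for six disc levels.
reader = [0, 1, 2, 3, 1, 2]
reference = [0, 2, 2, 3, 0, 3]

# Agreement on the full ordinal scale.
print(percent_agreement(reader, reference))

# Agreement after collapsing to two classes with an assumed cut-off of 2.
print(percent_agreement(dichotomize(reader, 2), dichotomize(reference, 2)))
```

Collapsing grades usually raises agreement because disagreements between adjacent grades on the same side of the cut-off no longer count.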
CONCLUSIONS
The integration of a deep learning model into the radiological assessment of DCS MRI improved radiologists’ interpretation time and interobserver agreement, regardless of experience level.
About the journal:
The Spine Journal, the official journal of the North American Spine Society, is an international and multidisciplinary journal that publishes original, peer-reviewed articles on research and treatment related to the spine and spine care, including basic science and clinical investigations. It is a condition of publication that manuscripts submitted to The Spine Journal have not been published, and will not be simultaneously submitted or published elsewhere. The Spine Journal also publishes major reviews of specific topics by acknowledged authorities, technical notes, teaching editorials, and other special features. Letters to the Editor-in-Chief are encouraged.