Wongthawat Liawrungrueang, Watcharaporn Cholamjiak, Peem Sarasombath, Khanathip Jitpakdee, Vit Kotheeranurak
Artificial Intelligence Classification for Detecting and Grading Lumbar Intervertebral Disc Degeneration
Spine Surgery and Related Research, 8(6): 552-559. Publication date: 2024-08-06; e-publication date: 2024-11-27 (eCollection).
DOI: https://doi.org/10.22603/ssrr.2024-0154
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11625717/pdf/
Abstract
Introduction: Intervertebral disc degeneration (IDD) is a primary cause of chronic back pain and disability, so effective treatment depends on precise detection and grading. This study develops and validates a convolutional neural network (CNN) based on the You Only Look Once (YOLO) architecture that detects lumbar IDD on magnetic resonance imaging (MRI) scans and grades it according to the Pfirrmann grading system.
Methods: We developed a deep learning model trained on a dataset of anonymized MRI studies of patients with symptomatic back pain. Radiologists segmented and annotated the MRI images according to the Pfirrmann grading system. The segmented MRI disc-image dataset was divided into three groups with no overlapping images: a training set (1,000), a testing set (500), and an external validation set (500) used to assess model generalizability. Model performance was evaluated using accuracy, sensitivity, specificity, F1 score, prediction error, and area under the receiver operating characteristic curve (ROC-AUC).
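To make the per-grade evaluation metrics concrete, the following is a minimal Python sketch, not the authors' code. It assumes each Pfirrmann grade is scored one-vs-rest (the grade of interest is the positive class, all other grades are negative) and that "prediction error" means 1 - accuracy; the abstract states neither convention. The function name `one_vs_rest_metrics` and the toy labels are hypothetical.

```python
from collections import Counter

GRADES = ["I", "II", "III", "IV", "V"]  # Pfirrmann grades


def one_vs_rest_metrics(y_true, y_pred, grade):
    """Binary metrics for one grade treated as the positive class.

    Assumption: one-vs-rest scoring; the abstract does not say how
    the per-grade figures were derived.
    """
    counts = Counter((t == grade, p == grade) for t, p in zip(y_true, y_pred))
    tp = counts[(True, True)]    # grade present, grade predicted
    fn = counts[(True, False)]   # grade present, other grade predicted
    fp = counts[(False, True)]   # other grade present, grade predicted
    tn = counts[(False, False)]  # other grade present, other grade predicted

    def ratio(num, den):
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        # Assumption: prediction error is taken to be 1 - accuracy.
        "prediction_error": 1.0 - accuracy,
    }


# Hypothetical toy labels, not data from the study.
y_true = ["I", "II", "II", "III", "IV", "V", "V", "III", "II", "I"]
y_pred = ["I", "II", "III", "III", "IV", "V", "IV", "III", "II", "I"]

for g in GRADES:
    metrics = one_vs_rest_metrics(y_true, y_pred, g)
    print(g, {name: round(value, 3) for name, value in metrics.items()})
```

Under this convention accuracy and specificity can look high for a rare grade even when sensitivity is poor, which is one reason to report several metrics per grade.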
Results: The AI model showed high performance across all metrics. For Grade I IDD, the model achieved an accuracy of 97%, 95%, and 92% in the training, testing, and external validation sets, respectively. For Grade II, the sensitivity was 100% in both training and testing sets and 98% in the validation set. For Grade III, the specificity was 95.4% in the training set and 94% in both testing and validation sets. For Grade IV, the F1 score was 97.77% in the training set and 95% in both testing and validation sets. For Grade V, the prediction error was 2.3%, 2%, and 2.5% in the training, testing, and validation sets, respectively. The overall ROC-AUC was 97%, 92%, and 95% in the training, testing, and validation sets, respectively.
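The abstract does not say how the "overall" ROC-AUC was aggregated over the five grades. One common convention, assumed here purely for illustration, is the macro-average of one-vs-rest AUCs. The sketch below computes each binary AUC from its rank interpretation (the probability that a positive case receives a higher score than a negative case, with ties counted as one half). The names `binary_auc` and `macro_ovr_auc` and the example scores are hypothetical.

```python
def binary_auc(labels, scores):
    """ROC-AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for is_pos, s in zip(labels, scores) if is_pos]
    neg = [s for is_pos, s in zip(labels, scores) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, prob_rows, classes):
    """Unweighted mean of one-vs-rest AUCs.

    Assumption: macro averaging; the study may have aggregated differently.
    prob_rows[i][j] is the model's score for classes[j] on example i.
    """
    aucs = []
    for j, cls in enumerate(classes):
        labels = [t == cls for t in y_true]
        scores = [row[j] for row in prob_rows]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Hypothetical scores for a three-class toy case, not data from the study.
toy_classes = ["I", "II", "III"]
toy_true = ["I", "II", "III", "II"]
toy_probs = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
]
print(round(macro_ovr_auc(toy_true, toy_probs, toy_classes), 3))
```

The pairwise comparison is quadratic in the number of examples, which is acceptable for sets of a few hundred images; a sort-based rank computation would be the usual choice for larger sets.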
Conclusions: The AI-based classification model exhibits high accuracy, sensitivity, and specificity in detecting and grading lumbar IDD using the Pfirrmann grading system. AI has significantly enhanced diagnostic precision and reliability, providing a powerful tool for clinicians in managing IDD. The potential impact is substantial, although further clinical validation is necessary before integrating this model into routine practice.