Wongthawat Liawrungrueang, Watcharaporn Cholamjiak, Peem Sarasombath, Khanathip Jitpakdee, Vit Kotheeranurak
Spine Surgery and Related Research, vol. 8, no. 6, pp. 552-559. Publication date: 2024-08-06; EPub date: 2024-11-27 (eCollection). Publication type: Journal Article.
DOI: 10.22603/ssrr.2024-0154 (https://doi.org/10.22603/ssrr.2024-0154)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11625717/pdf/
Artificial Intelligence Classification for Detecting and Grading Lumbar Intervertebral Disc Degeneration.
Introduction: Intervertebral disc degeneration (IDD) is a primary cause of chronic back pain and disability, so effective treatment depends on precise detection and grading. This study develops and validates a convolutional neural network (CNN) model based on the You Only Look Once (YOLO) architecture that uses the Pfirrmann grading system to classify and grade lumbar IDD on magnetic resonance imaging (MRI) scans.
Methods: We developed a deep learning model trained on a dataset of anonymized MRI studies of patients with symptomatic back pain. Radiologists segmented the MRI images and annotated them according to the Pfirrmann grading system. The segmented MRI disc-image dataset was divided into three non-overlapping groups: a training set (1,000), a testing set (500), and an external validation set (500) used to assess model generalizability. The model's performance was evaluated using accuracy, sensitivity, specificity, F1 score, prediction error, and ROC-AUC.
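As a minimal sketch of how the per-grade metrics named in the Methods could be computed, the Python below treats each Pfirrmann grade one-vs-rest and counts true/false positives and negatives. The helper name `per_grade_metrics`, the toy labels, and the definition of prediction error as 1 minus accuracy are assumptions made for illustration; the abstract does not state how the authors computed these values.

```python
from collections import Counter

GRADES = [1, 2, 3, 4, 5]  # Pfirrmann grades I-V, encoded as integers (assumption)

def per_grade_metrics(y_true, y_pred, grades=GRADES):
    """One-vs-rest metrics for each grade (hypothetical helper, not the authors' code)."""
    pairs = Counter(zip(y_true, y_pred))  # (true grade, predicted grade) -> count
    n = len(y_true)
    results = {}
    for g in grades:
        tp = pairs[(g, g)]
        fn = sum(c for (t, p), c in pairs.items() if t == g and p != g)
        fp = sum(c for (t, p), c in pairs.items() if t != g and p == g)
        tn = n - tp - fn - fp
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = (2 * precision * sensitivity / (precision + sensitivity)
              if precision + sensitivity else 0.0)
        accuracy = (tp + tn) / n
        results[g] = {
            "accuracy": accuracy,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f1": f1,
            # Assumed definition: share of discs misclassified for this grade.
            "prediction_error": 1.0 - accuracy,
        }
    return results

# Toy labels for ten discs (illustrative only, not study data).
y_true = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
y_pred = [1, 2, 2, 2, 3, 3, 4, 5, 5, 5]
for grade, m in per_grade_metrics(y_true, y_pred).items():
    print(grade, {k: round(v, 3) for k, v in m.items()})
```

Computing the metrics one-vs-rest gives a separate figure for each grade, which matches the way the Results report a different metric per grade.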
Results: The AI model showed high performance across all metrics. For Grade I IDD, the model achieved an accuracy of 97%, 95%, and 92% in the training, testing, and external validation sets, respectively. For Grade II, the sensitivity was 100% in both training and testing sets and 98% in the validation set. For Grade III, the specificity was 95.4% in the training set and 94% in both testing and validation sets. For Grade IV, the F1 score was 97.77% in the training set and 95% in both testing and validation sets. For Grade V, the prediction error was 2.3%, 2%, and 2.5% in the training, testing, and validation sets, respectively. The overall ROC-AUC was 97%, 92%, and 95% in the training, testing, and validation sets, respectively.
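The abstract reports an "overall" ROC-AUC without saying how the five grades were aggregated. One common choice, assumed here purely for illustration, is to compute a one-vs-rest AUC per grade and take the unweighted (macro) mean. The sketch below uses the rank interpretation of AUC (the probability that a positive case outscores a negative one); the function names and toy scores are hypothetical.

```python
def binary_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(y_true, prob_rows, grades):
    """Unweighted mean of one-vs-rest AUCs (assumed aggregation, not from the paper)."""
    aucs = []
    for k, g in enumerate(grades):
        labels = [1 if t == g else 0 for t in y_true]
        scores = [row[k] for row in prob_rows]  # predicted probability of grade g
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)

# Toy example with three grades and per-grade probabilities (illustrative only).
y_true = [1, 2, 3]
prob_rows = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]
print(macro_ovr_auc(y_true, prob_rows, grades=[1, 2, 3]))
```

The pairwise comparison is quadratic in the number of cases, which is adequate for sets of a few hundred images; a sort-based rank computation would be preferable for much larger sets.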
Conclusions: The AI-based classification model exhibits high accuracy, sensitivity, and specificity in detecting and grading lumbar IDD using the Pfirrmann grading. AI has significantly enhanced diagnostic precision and reliability, providing a powerful tool for clinicians in managing IDD. The potential impact is substantial, although further clinical validation is necessary before integrating this model into routine practice.