Artificial intelligence for segmentation and classification in lumbar spinal stenosis: an overview of current methods
E J A Verheijen, T Kapogiannis, D Munteh, J Chabros, M Staring, T R Smith, C L A Vleggeert-Lankamp
European Spine Journal, pp. 1146-1155. Published 2025-03-01 (Epub 2025-01-30). DOI: 10.1007/s00586-025-08672-9 (https://doi.org/10.1007/s00586-025-08672-9)
Abstract
Purpose: Lumbar spinal stenosis (LSS) is a frequently occurring condition defined by narrowing of the spinal or nerve root canal due to degenerative changes. Physicians use MRI scans to determine the severity of stenosis, occasionally complementing them with X-ray or CT scans during the diagnostic work-up. However, manual grading of stenosis is time-consuming and introduces inter-reader variability because a standardized grading system is lacking. Machine Learning (ML) has the potential to aid physicians in this process by automating segmentation and classification of LSS. However, it is unclear what models currently exist to perform these tasks.
Methods: A systematic review of literature was performed by searching the Cochrane Library, Embase, Emcare, PubMed, and Web of Science databases for studies describing an ML-based algorithm to perform segmentation or classification of the lumbar spine for LSS. Risk of bias was assessed through an adjusted version of the Newcastle-Ottawa Quality Assessment Scale that was more applicable to ML studies. Qualitative analyses were performed based on type of algorithm (conventional ML or Deep Learning (DL)) and task (segmentation or classification).
Results: A total of 27 articles were included, of which nine addressed segmentation, 16 addressed classification, and two addressed both tasks. The majority of studies focused on algorithms for MRI analysis. The outcome measures used to express model performance varied widely. Overall, ML algorithms performed segmentation and classification tasks excellently. DL methods tended to demonstrate better performance than conventional ML models. For segmentation, the best-performing DL models were U-Net based. For classification, U-Net and unspecified CNNs powered the models that performed best on the majority of outcome metrics. The number of models with external validation was limited.
Conclusion: DL models achieve excellent performance for segmentation and classification tasks for LSS, outperforming conventional ML algorithms. However, comparisons between studies are challenging due to the variety in outcome measures and test datasets. Future studies should focus on the segmentation task using DL models and should use a standardized set of outcome measures and a publicly available test dataset to express model performance. In addition, these models need to be externally validated to assess generalizability.
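To make the call for standardized outcome measures concrete, the sketch below computes a few measures that are widely used for binary segmentation and classification. The abstract does not name specific metrics, so the choice of Dice, IoU, sensitivity, and specificity is an assumption made for illustration, as are the function names and the toy masks.

```python
from typing import Sequence, Tuple


def overlap_counts(pred: Sequence[int], truth: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for two equal-length sequences of 0/1 labels."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn


def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice coefficient: 2*TP / (2*TP + FP + FN); 1.0 if both masks are empty."""
    tp, fp, fn, _ = overlap_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def iou(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Intersection over union: TP / (TP + FP + FN); 1.0 if both masks are empty."""
    tp, fp, fn, _ = overlap_counts(pred, truth)
    denom = tp + fp + fn
    return 1.0 if denom == 0 else tp / denom


def sensitivity_specificity(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity TP/(TP+FN) and specificity TN/(TN+FP); NaN when undefined."""
    tp, fp, fn, tn = overlap_counts(pred, truth)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical flattened masks (1 = pixel labelled as spinal canal), not real data.
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 0, 1, 1]

print("Dice:", round(dice(pred_mask, true_mask), 3))
print("IoU:", iou(pred_mask, true_mask))
print("Sens/Spec:", sensitivity_specificity(pred_mask, true_mask))
```

Reporting the same small set of measures on a shared test set would make results from different studies directly comparable, which is the point the conclusion raises. Which measures belong in such a set is left open here.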
About the journal:
"European Spine Journal" is a publication founded in response to the increasing trend toward specialization in spinal surgery and spinal pathology in general. The Journal is devoted to all spine-related disciplines, including functional and surgical anatomy of the spine, biomechanics and pathophysiology, diagnostic procedures, and neurology, surgery and outcomes. The aim of "European Spine Journal" is to support the further development of highly innovative spine treatments, including but not restricted to surgery, and to provide an integrated and balanced view of diagnostic, research and treatment procedures as well as outcomes that will enhance effective collaboration among specialists worldwide. The "European Spine Journal" also participates in education by means of videos, interactive meetings and the endorsement of educative efforts.
Official publication of EUROSPINE, The Spine Society of Europe