Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study.

Orthopaedic Surgery (IF 2.1; Zone 2, Medicine; JCR Q2, Orthopedics). Pub Date: 2025-01-01. Epub Date: 2024-12-05. DOI: 10.1111/os.14280
Li-Peng Xing, Gang Liu, Hao-Chen Zhang, Lei Wang, Shan Zhu, Man Du La Hua Bao, Yan-Ni Wang, Chao Chen, Zhi Wang, Xin-Yu Liu, Shuai Zhang, Qiang Yang
Orthopaedic Surgery, pp. 233-243. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735353/pdf/

Abstract

Objective: The Modic changes (MCs) classification system is the most widely used method in magnetic resonance imaging (MRI) for characterizing subchondral vertebral marrow changes. However, because it is semiquantitative, it is highly sensitive to variations in MRI. In 2021, the authors of this classification system proposed a quantitative and reliable MC grading method, but automated tools for grading MCs are still lacking. This study developed convolutional neural networks (CNNs) and investigated their performance in detecting and grading MCs based on their maximum vertical extent. To verify performance, we tested the CNNs' generalization, compared the CNN's performance with that of a junior doctor, and assessed the junior doctor's consistency after AI assistance.

Methods: MRIs from 139 patients with MCs were retrospectively analyzed and annotated by a spine surgeon. MRIs from 109 of these patients were acquired on Philips scanners from June 2020 to June 2021 and constituted Dataset 1. MRIs from the remaining 30 patients were acquired on Philips and United Imaging scanners from June 2022 to March 2023 and formed Dataset 2. YOLOv8 and YOLOv5 models were developed in Python (PyCharm) on the PyTorch deep learning framework; data augmentation and transfer learning were applied to improve model generalization. Model performance was evaluated using precision, recall, F1 score, and mAP50. Generalizability was tested on Dataset 2, where the model was also compared with a junior doctor. Post hoc, the junior doctor graded Dataset 2 with CNN assistance. In addition, regions of interest were visualized with class activation mapping heat maps.
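
The detection metrics named in the Methods can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as (x1, y1, x2, y2), a single class, and greedy matching at an IoU threshold of 0.5 (the threshold implied by mAP50). The function names `iou` and `detection_scores` and the example boxes are hypothetical and are not taken from the study's code. The sketch computes precision, recall, and F1 at one operating point only; mAP50 itself additionally averages precision over the recall curve and over classes, which is not shown here.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels, with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_scores(
    preds: List[Box], truths: List[Box], iou_thr: float = 0.5
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for one class at a fixed IoU threshold.

    Each ground-truth box can be matched by at most one prediction.
    Predictions are processed in the order given (assumed sorted by
    confidence, highest first).
    """
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truths):
            if j in matched:
                continue
            v = iou(p, t)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(truths) - tp  # ground-truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up boxes for illustration only.
    preds = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
    truths = [(0, 0, 10, 10), (20, 20, 30, 32)]
    print(detection_scores(preds, truths))
```

In this toy case two of the three predictions overlap a ground-truth box with IoU of at least 0.5, so one prediction counts as a false positive and no ground-truth box is missed.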

Results: On the unseen test set, the YOLOv8 and YOLOv5 models achieved precision of 81.60% and 61.59%, recall of 80.90% and 67.16%, mAP50 of 84.40% and 68.88%, and F1 scores of 0.81 and 0.60, respectively. On Dataset 2, YOLOv8 and the junior doctor achieved precision of 95.1% and 72.5% and recall of 68.3% and 60.6%, respectively. In the AI-assisted experiment, agreement between the junior doctor and the senior spine surgeon improved significantly, with Cohen's kappa rising from 0.368 to 0.681.
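
Cohen's kappa, the agreement statistic reported above (0.368 and 0.681), corrects the observed agreement between two raters for the agreement expected by chance. The sketch below shows the unweighted form, kappa = (p_o - p_e) / (1 - p_e). Whether the study used unweighted or weighted kappa is not stated in the abstract, so the unweighted choice is an assumption, and the function name `cohens_kappa` and the grade labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = count_a.keys() | count_b.keys()
    p_e = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical label for every item.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Made-up grades for illustration only; not data from the study.
    junior = ["A", "A", "B", "B"]
    senior = ["A", "B", "B", "B"]
    print(cohens_kappa(junior, senior))
```

With these made-up labels the raters agree on three of four items (p_o = 0.75) and chance agreement is 0.5, so only half of the agreement beyond chance is achieved.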

Conclusions: The performance of YOLOv8 in detecting and grading MCs was significantly superior to that of YOLOv5. YOLOv8 also outperformed the junior doctor, and it can enhance the capabilities of junior doctors and improve the reliability of diagnoses.

Source journal: Orthopaedic Surgery (Orthopedics)
CiteScore: 3.40. Self-citation rate: 14.30%. Articles published: 374. Review time: 20 weeks.
Journal introduction: Orthopaedic Surgery (OS) is the official journal of the Chinese Orthopaedic Association, focusing on all aspects of orthopaedic technique and surgery. The journal publishes peer-reviewed articles in the following categories: Original Articles, Clinical Articles, Review Articles, Guidelines, Editorials, Commentaries, Surgical Techniques, Case Reports and Meeting Reports.