Interobserver Agreement and Performance of Concurrent AI Assistance for Radiographic Evaluation of Knee Osteoarthritis.

IF 12.1 · Zone 1 (Medicine) · Q1 Radiology, Nuclear Medicine & Medical Imaging · Radiology · Pub Date: 2024-07-01 · DOI: 10.1148/radiol.233341
Mathias W Brejnebøl, Anders Lenskjold, Katharina Ziegeler, Huib Ruitenbeek, Felix C Müller, Janus U Nybing, Jacob J Visser, Loes M Schiphouwer, Jorrit Jasper, Behschad Bashian, Haoyin Cao, Maximilian Muellner, Sebastian A Dahlmann, Dimitar I Radev, Ann Ganestam, Camilla T Nielsen, Carsten U Stroemmen, Edwin H G Oei, Kay-Geert A Hermann, Mikael Boesen

Abstract

Background: Due to conflicting findings in the literature, there are concerns about a lack of objectivity in grading knee osteoarthritis (KOA) on radiographs.

Purpose: To examine how artificial intelligence (AI) assistance affects the performance and interobserver agreement of radiologists and orthopedists of various experience levels when evaluating KOA on radiographs according to the established Kellgren-Lawrence (KL) grading system.

Materials and Methods: In this retrospective observer performance study, consecutive standing knee radiographs from patients with suspected KOA were collected from three participating European centers between April 2019 and May 2022. Each center recruited four readers across radiology and orthopedic surgery at in-training and board-certified experience levels. Readers assessed the KL grade (KL-0 = no KOA, KL-4 = severe KOA) on the frontal view with and without assistance from a commercial AI tool. The majority vote of three musculoskeletal radiology consultants established the reference standard. The ordinal receiver operating characteristic method was used to estimate grading performance. Light kappa was used to estimate interrater agreement, and bootstrapped t statistics were used to compare groups.

Results: Seventy-five studies were included from each center, totaling 225 studies (mean patient age, 55 years ± 15 [SD]; 113 female patients). The KL grades were KL-0, 24.0% (n = 54); KL-1, 28.0% (n = 63); KL-2, 21.8% (n = 49); KL-3, 18.7% (n = 42); and KL-4, 7.6% (n = 17). Eleven readers completed their readings. Three of the six junior readers showed higher KL grading performance with AI assistance than without it (area under the receiver operating characteristic curve without vs with AI, 0.81 ± 0.017 [SEM] vs 0.88 ± 0.011 [P < .001]; 0.76 ± 0.018 vs 0.86 ± 0.013 [P < .001]; and 0.89 ± 0.011 vs 0.91 ± 0.009 [P = .008]). Interobserver agreement for KL grading among all readers was higher with AI assistance than without it (κ without vs with AI, 0.77 ± 0.018 [SEM] vs 0.85 ± 0.013; P < .001). Board-certified radiologists achieved almost perfect agreement for KL grading when assisted by AI (κ = 0.90 ± 0.01), which was higher than that achieved by the reference readers independently (κ = 0.84 ± 0.017; P = .01).

Conclusion: AI assistance increased junior readers' radiographic KOA grading performance and increased interobserver agreement for osteoarthritis grading across all readers and experience levels.

Published under a CC BY 4.0 license. Supplemental material is available for this article.
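The abstract names two procedures that are easy to illustrate: a majority vote to set the reference standard, and Light kappa for interrater agreement. The sketch below is a minimal illustration, not the study's actual analysis. It assumes unweighted Cohen kappa averaged over all reader pairs (a common definition of Light kappa); the abstract does not say whether a weighted variant was used. The function names and the reader grades are hypothetical.

```python
from collections import Counter
from itertools import combinations


def majority_vote(grades):
    """Return the grade chosen by a strict majority, or None if there is none."""
    grade, count = Counter(grades).most_common(1)[0]
    return grade if count > len(grades) / 2 else None


def cohen_kappa(a, b):
    """Unweighted Cohen kappa for two raters grading the same cases."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: product of the two raters' marginal frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical grade throughout
    return (observed - expected) / (1 - expected)


def light_kappa(ratings):
    """Light kappa (assumed form): mean Cohen kappa over all rater pairs."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical KL grades (0 to 4) from three readers on six knees.
reader_1 = [0, 1, 2, 3, 4, 2]
reader_2 = [0, 1, 2, 3, 4, 1]
reader_3 = [0, 2, 2, 3, 4, 2]

reference = [majority_vote(case) for case in zip(reader_1, reader_2, reader_3)]
agreement = light_kappa([reader_1, reader_2, reader_3])
print(reference, round(agreement, 3))
```

Because KL grades are ordinal, a weighted kappa that penalizes a two-grade disagreement more than a one-grade disagreement may be more appropriate in practice; the unweighted form is used here only to keep the sketch short.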

Source journal: Radiology (category: Medicine, Nuclear Medicine)
CiteScore: 35.20
Self-citation rate: 3.00%
Articles published: 596
Review time: 3.6 months
About the journal: Published regularly since 1923 by the Radiological Society of North America (RSNA), Radiology has long been recognized as the authoritative reference for the most current, clinically relevant and highest quality research in the field of radiology. Each month the journal publishes approximately 240 pages of peer-reviewed original research, authoritative reviews, well-balanced commentary on significant articles, and expert opinion on new techniques and technologies. Radiology publishes cutting-edge and impactful imaging research articles in radiology and medical imaging in order to help improve human health.