Daan V. Loeffen , Frank M. Zijta , Tim A. Boymans , Joachim E. Wildberger , Estelle C. Nijssen
AI for fracture diagnosis in clinical practice: Four approaches to systematic AI-implementation and their impact on AI-effectiveness
European Journal of Radiology, volume 187, Article 112113. Journal Article, published 2025-04-14. DOI: 10.1016/j.ejrad.2025.112113
Impact factor 3.3; JCR Q1 (Radiology, Nuclear Medicine & Medical Imaging)
https://www.sciencedirect.com/science/article/pii/S0720048X25001998
Citations: 0
Abstract
Purpose
Artificial Intelligence (AI) has been shown to enhance fracture-detection-accuracy, but the most effective AI-implementation in clinical practice is less well understood. In the current study, four approaches to AI-implementation are evaluated for their impact on AI-effectiveness.
Materials and Methods
This retrospective single-center study was based on all consecutive, around-the-clock radiographic examinations for suspected fractures, and the accompanying clinical-practice radiologist diagnoses, between January and March 2023. These image-sets were independently analysed by a dedicated bone-fracture-detection AI. Findings were combined with the radiologists' clinical-practice diagnoses to simulate the four AI-implementation methods deemed most relevant to clinical workflows: AI-standalone (radiologist findings not consulted); AI-problem-solving (AI findings consulted when the radiologist is in doubt); AI-triage (radiologist findings consulted when the AI is in doubt); and AI-safety net (AI findings consulted when the radiologist's diagnosis is negative). Reference-standard diagnoses were established by two senior musculoskeletal radiologists (by consensus in cases of disagreement). Radiologist and radiologist + AI diagnoses were compared for false negatives (FN), false positives (FP) and their clinical consequences. The experience-level subgroups (radiologists-in-training, non-musculoskeletal radiologists, and dedicated musculoskeletal radiologists) were analysed separately.
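The four implementation methods can be read as simple rules for combining one radiologist finding with one AI finding. The following is a minimal sketch of that reading, not the authors' simulation code. The `Finding` enum, the `combine` function, the method keys, and the three-valued encoding (positive, negative, doubt) are all assumptions made for illustration; the abstract does not say how doubt was recorded or what happens when the consulted party is itself in doubt.

```python
from enum import Enum


class Finding(Enum):
    """Hypothetical three-valued encoding of a single reading."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    DOUBT = "doubt"


def combine(method: str, radiologist: Finding, ai: Finding) -> Finding:
    """Return the combined finding for one image-set under one method.

    Assumption: whenever a second reader is consulted, that reader's
    finding replaces the first reader's finding unchanged.
    """
    if method == "standalone":
        # Radiologist findings are not consulted at all.
        return ai
    if method == "problem_solving":
        # AI is consulted only when the radiologist is in doubt.
        return ai if radiologist is Finding.DOUBT else radiologist
    if method == "triage":
        # Radiologist is consulted only when the AI is in doubt.
        return radiologist if ai is Finding.DOUBT else ai
    if method == "safety_net":
        # AI is consulted only when the radiologist's diagnosis is negative.
        return ai if radiologist is Finding.NEGATIVE else radiologist
    raise ValueError(f"unknown method: {method!r}")


if __name__ == "__main__":
    # A fracture the radiologist missed but the AI flagged.
    for method in ("standalone", "problem_solving", "triage", "safety_net"):
        result = combine(method, Finding.NEGATIVE, Finding.POSITIVE)
        print(f"{method}: {result.value}")
```

Under these assumptions, a confident negative from the radiologist is overturned by a positive AI finding in every method except problem-solving, where the AI is never consulted because the radiologist expressed no doubt.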
Results
1508 image-sets were included (1227 unique patients; 40 radiologist-readers). Radiologist results were: 2.7 % FN (40/1508), 28 with clinical consequences; 1.2 % FP (18/1508), of which 2 (11.1 %) received full-fracture treatments. All AI-implementation methods changed overall FN and FP with statistical significance (p < 0.001): AI-standalone 1.5 % FN (23/1508; 11 consequences), 6.8 % FP (103/1508); AI-problem-solving 3.2 % FN (48/1508; 31 consequences), 0.6 % FP (9/1508); AI-triage 2.1 % FN (32/1508; 18 consequences), 1.7 % FP (26/1508); AI-safety net 0.07 % FN (1/1508; 1 consequence), 7.6 % FP (115/1508). Subgroups showed similar trends, with one exception: AI-triage increased FN in every subgroup other than radiologists-in-training.
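The reported percentages follow directly from the counts over the 1508 image-sets. The short sketch below recomputes them and the change in FN and FP counts relative to the radiologist-only baseline. The counts are taken from the Results text; the names `COUNTS`, `rate_percent`, and `change_vs_radiologist` are hypothetical helpers introduced for illustration.

```python
TOTAL_IMAGE_SETS = 1508

# (false negatives, false positives) per approach, as reported in the Results.
COUNTS = {
    "radiologist": (40, 18),
    "AI-standalone": (23, 103),
    "AI-problem-solving": (48, 9),
    "AI-triage": (32, 26),
    "AI-safety net": (1, 115),
}


def rate_percent(count: int, total: int = TOTAL_IMAGE_SETS) -> float:
    """Express a count as a percentage of all included image-sets."""
    return 100.0 * count / total


def change_vs_radiologist(method: str) -> tuple[int, int]:
    """Return (FN change, FP change) in counts relative to radiologists alone."""
    base_fn, base_fp = COUNTS["radiologist"]
    fn, fp = COUNTS[method]
    return fn - base_fn, fp - base_fp


if __name__ == "__main__":
    for method, (fn, fp) in COUNTS.items():
        d_fn, d_fp = change_vs_radiologist(method)
        print(
            f"{method}: FN {rate_percent(fn):.2f} % ({d_fn:+d}), "
            f"FP {rate_percent(fp):.2f} % ({d_fp:+d})"
        )
```

Read this way, the safety-net approach removes 39 of the 40 radiologist false negatives at the cost of 97 additional false positives, which is the trade-off the Conclusion weighs.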
Conclusion
Implementation methods have a large impact on AI-effectiveness. These results suggest AI should not be considered for problem-solving or triage at this time; AI-standalone performs better than either and may be a source of assistance where radiologists are unavailable. Best results were obtained by implementing AI as a safety net, which eliminates missed fractures with serious clinical consequences; even though false positives are increased, unnecessary treatments are limited.
About the journal:
European Journal of Radiology is an international journal that aims to communicate state-of-the-art information on imaging developments to its readers, in the form of high-quality original research articles and timely reviews on current developments in the field.
Its audience includes clinicians at all levels of training, including radiology trainees, newly qualified imaging specialists and experienced radiologists. Its aim is to inform efficient, appropriate and evidence-based imaging practice to the benefit of patients worldwide.