Artificial intelligence and pelvic fracture diagnosis on X-rays: a preliminary study on performance, workflow integration and radiologists' feedback assessment in a spoke emergency hospital
Francesca Rosa, Duccio Buccicardi, Adolfo Romano, Fabio Borda, Maria Chiara D’Auria, Alessandro Gastaldo
{"title":"Artificial intelligence and pelvic fracture diagnosis on X-rays: a preliminary study on performance, workflow integration and radiologists' feedback assessment in a spoke emergency hospital","authors":"Francesca Rosa , Duccio Buccicardi , Adolfo Romano , Fabio Borda , Maria Chiara D’Auria , Alessandro Gastaldo","doi":"10.1016/j.ejro.2023.100504","DOIUrl":null,"url":null,"abstract":"<div><h3>Purpose</h3><p>The aim of our study is to evaluate artificial intelligence (AI) support in pelvic fracture diagnosis on X-rays, focusing on performance, workflow integration and radiologists’ feedback in a spoke emergency hospital.</p></div><div><h3>Materials and methods</h3><p>Between August and November 2021, a total of 235 sites of fracture or suspected fracture were evaluated and enrolled in the prospective study. Radiologist’s specificity, sensibility accuracy, positive and negative predictive values were compared to AI. Cohen's kappa was used to calculate the agreement between AI and radiologist. We also reviewed the AI workflow integration process, focusing on potential issues and assessed radiologists’ opinion on AI via a survey.</p></div><div><h3>Results</h3><p>The radiologist performance in accuracy, sensitivity and specificity was better than AI but McNemar test demonstrated no statistically significant difference between AI and radiologist’s performance (<em>p</em> = 0.32). Calculated Cohen’s K of 0.64.</p></div><div><h3>Conclusion</h3><p>Contrary to expectations, our preliminary results did not prove a real improvement of patient outcome nor in reporting time but demonstrated AI high NPV (94,62%) and non-inferiority to radiologist performance. Moreover, the commercially available AI algorithm used in our study automatically learn from data and so we expect a progressive performance improvement. AI could be considered as a promising tool to rule-out fractures (especially when used as a “second reader”) and to prioritize positive cases, especially in increasing workload scenarios (ED, nightshifts) but further research is needed to evaluate the real impact on the clinical practice.</p></div>","PeriodicalId":38076,"journal":{"name":"European Journal of Radiology Open","volume":null,"pages":null},"PeriodicalIF":1.8000,"publicationDate":"2023-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10359726/pdf/","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"European Journal of Radiology Open","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2352047723000308","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
引用次数: 1
Abstract
Purpose
The aim of our study is to evaluate artificial intelligence (AI) support in pelvic fracture diagnosis on X-rays, focusing on performance, workflow integration and radiologists’ feedback in a spoke emergency hospital.
Materials and methods
Between August and November 2021, a total of 235 sites of fracture or suspected fracture were evaluated and enrolled in this prospective study. The radiologist's specificity, sensitivity, accuracy, and positive and negative predictive values were compared with those of the AI. Cohen's kappa was used to calculate the agreement between the AI and the radiologist. We also reviewed the AI workflow integration process, focusing on potential issues, and assessed radiologists' opinion of AI via a survey.
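As a minimal illustration of the metrics named above, the following Python sketch computes sensitivity, specificity, accuracy, PPV, NPV, and Cohen's kappa from paired per-site calls. The counts and variable names are hypothetical and do not reflect the study data or the authors' actual analysis pipeline.

```python
# Illustrative sketch only: hypothetical calls, not the study data.
from sklearn.metrics import cohen_kappa_score

# Hypothetical per-site labels (1 = fracture, 0 = no fracture)
truth = [1, 1, 0, 0, 1, 0, 1, 0]
ai = [1, 0, 0, 0, 1, 0, 1, 1]
radiologist = [1, 1, 0, 0, 1, 0, 1, 0]

def diagnostic_metrics(truth, pred):
    # Build the 2x2 confusion counts and derive the standard metrics.
    tp = sum(t == 1 and p == 1 for t, p in zip(truth, pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(truth, pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(truth, pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(truth, pred))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(truth),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

print("AI metrics:", diagnostic_metrics(truth, ai))
# Agreement between AI and radiologist calls
print("Cohen's kappa:", cohen_kappa_score(radiologist, ai))
```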
Results
The radiologist's performance in accuracy, sensitivity and specificity was better than the AI's, but the McNemar test demonstrated no statistically significant difference between the AI's and the radiologist's performance (p = 0.32). The calculated Cohen's kappa was 0.64.
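For readers unfamiliar with the comparison, a minimal sketch of how a McNemar test on paired radiologist/AI calls could be run is shown below; the 2x2 table of concordant and discordant pairs is hypothetical, not the study data.

```python
# Illustrative sketch: hypothetical paired correctness of radiologist vs. AI.
from statsmodels.stats.contingency_tables import mcnemar

# Rows = radiologist correct / incorrect, columns = AI correct / incorrect.
table = [[200, 15],   # radiologist correct
         [10, 10]]    # radiologist incorrect

# McNemar's test uses only the discordant cells (15 and 10 here).
result = mcnemar(table, exact=True)
print(f"statistic = {result.statistic}, p-value = {result.pvalue:.3f}")
```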
Conclusion
Contrary to expectations, our preliminary results did not demonstrate a real improvement in patient outcome or in reporting time, but they did demonstrate the AI's high NPV (94.62%) and its non-inferiority to the radiologist's performance. Moreover, the commercially available AI algorithm used in our study automatically learns from data, so we expect a progressive improvement in performance. AI could be considered a promising tool to rule out fractures (especially when used as a "second reader") and to prioritize positive cases, especially in increasing-workload scenarios (ED, night shifts), but further research is needed to evaluate its real impact on clinical practice.