Comparison between artificial intelligence solution and radiologist for the detection of pelvic, hip and extremity fractures on radiographs in adult using CT as standard of reference.
Maxime Pastor, Djamel Dabli, Raphaël Lonjon, Chris Serrand, Fehmi Snene, Fayssal Trad, Fabien de Oliveira, Jean-Paul Beregi, Joël Greffier
Diagnostic and Interventional Imaging (impact factor 4.9; JCR Q1, Radiology, Nuclear Medicine & Medical Imaging). Published 2024-09-18. DOI: 10.1016/j.diii.2024.09.004 (https://doi.org/10.1016/j.diii.2024.09.004)
Abstract
Purpose: The purpose of this study was to compare the diagnostic performance of an artificial intelligence (AI) solution for the detection of pelvic, proximal femur or extremity fractures in adults with that of radiologist interpretation of radiographs, using standard dose CT examination as the standard of reference.
Materials and methods: This retrospective study included 94 adult patients with suspected bone fractures who underwent a standard dose CT examination and radiographs of the pelvis and/or hip and extremities at our institution between January 2022 and August 2023. For all patients, an AI solution was used retrospectively on the radiographs to detect and localize bone fractures of the pelvis and/or hip and extremities. Results of the AI solution were compared to the reading of each radiograph by a radiologist using the McNemar test. The results of the standard dose CT examination, as interpreted by a senior radiologist, were used as the standard of reference.
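For paired readings of the same radiographs, the McNemar test considers only the discordant pairs, that is, the cases where exactly one of the two readers agrees with the standard of reference. The sketch below implements the exact (binomial) form in Python. The abstract does not say whether the exact or the chi-square form was used, and the discordant counts in the usage line are hypothetical, since the abstract does not report them.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: pairs where only reader A was correct
    c: pairs where only reader B was correct
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Probability of a split at least as lopsided as the observed one (one tail).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)

# Hypothetical discordant counts, for illustration only (not from the study).
p_value = mcnemar_exact(b=2, c=10)
print(f"p = {p_value:.4f}")
```

Concordant pairs (both readers right or both wrong) do not enter the statistic, which is why the test suits a design where the AI solution and the radiologist read the same radiographs.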
Results: A total of 94 patients (63 women; mean age, 56.4 ± 22.5 [standard deviation] years) were included. Forty-seven patients had at least one fracture, and a total of 71 fractures were deemed present using the standard of reference (25 hand/wrist, 16 pelvis, 30 foot/ankle). Using the standard of reference, the analysis of radiographs by the AI solution resulted in 58 true positive, 13 false negative, 33 true negative and 15 false positive findings, yielding 82 % sensitivity (58/71; 95 % confidence interval [CI]: 71-89 %), 69 % specificity (33/48; 95 % CI: 55-80 %), and 76 % accuracy (91/119; 95 % CI: 69-84 %). Using the standard of reference, the reading of the radiologist resulted in 65 true positive, 6 false negative, 42 true negative and 6 false positive findings, yielding 92 % sensitivity (65/71; 95 % CI: 82-96 %), 88 % specificity (42/48; 95 % CI: 75-94 %), and 90 % accuracy (107/119; 95 % CI: 85-95 %). The radiologist outperformed the AI solution in terms of sensitivity (P = 0.045), specificity (P = 0.016), and accuracy (P < 0.001).
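The reported percentages follow directly from the true/false positive and negative counts. As a minimal sketch (the function names are illustrative and not from the study), the following Python recomputes sensitivity, specificity and accuracy from those counts. The abstract does not state how the confidence intervals were computed; the Wilson score interval used here is an assumption and may not reproduce every reported bound.

```python
import math

def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and accuracy as fractions."""
    return {
        "sensitivity": tp / (tp + fn),            # fractures correctly detected
        "specificity": tn / (tn + fp),            # non-fractures correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, not confirmed by the source)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half

# Counts taken from the Results paragraph of the abstract.
ai = diagnostic_metrics(tp=58, fn=13, tn=33, fp=15)
radiologist = diagnostic_metrics(tp=65, fn=6, tn=42, fp=6)

for name, metrics in (("AI", ai), ("Radiologist", radiologist)):
    print(name, {key: f"{value:.1%}" for key, value in metrics.items()})

low, high = wilson_interval(58, 71)  # AI sensitivity, 58 of 71 fractures
print(f"AI sensitivity 95% CI (Wilson): {low:.1%} to {high:.1%}")
```

Note that the denominators differ by metric: sensitivity uses the 71 fractures, specificity the 48 fracture-free findings, and accuracy all 119 findings.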
Conclusion: In this study, the radiologist outperformed the AI solution for the diagnosis of pelvic, hip and extremity fractures using radiographs. This raises the question of whether a strong standard of reference for evaluating AI solutions should be used in future studies comparing AI and human reading in fracture detection using radiographs.
About the journal:
Diagnostic and Interventional Imaging accepts publications originating from any part of the world based only on their scientific merit. The Journal focuses on illustrated articles with strong iconographic content and aims to help sharpen clinical decision-making skills and to follow high-level research topics. All articles are published in English.
Diagnostic and Interventional Imaging publishes editorials, technical notes, letters, original and review articles on abdominal, breast, cancer, cardiac, emergency, forensic medicine, head and neck, musculoskeletal, gastrointestinal, genitourinary, interventional, obstetric, pediatric, thoracic and vascular imaging, neuroradiology, nuclear medicine, as well as contrast material, computer developments, health policies and practice, and medical physics relevant to imaging.