Comparison of radiological interpretation made by veterinary radiologists and state-of-the-art commercial AI software for canine and feline radiographic studies.

IF 2.6 · CAS Tier 2 (Agricultural & Forestry Sciences) · JCR Q1 VETERINARY SCIENCES · Frontiers in Veterinary Science · Pub Date: 2025-02-21 · eCollection Date: 2025-01-01 · DOI: 10.3389/fvets.2025.1502790
Yero S Ndiaye, Peter Cramton, Chavdar Chernev, Axel Ockenfels, Tobias Schwarz

Abstract

Introduction: As human medical diagnostic expertise is scarce, especially in veterinary care, artificial intelligence (AI) has increasingly been used as a remedy. AI's promise lies in improving human diagnostics or providing good diagnostics at lower cost, thereby increasing access. This study analyzed the diagnostic performance of widely used AI radiology software vs. veterinary radiologists in interpreting canine and feline radiographs. We aimed to establish whether the performance of commonly used AI matches that of a typical radiologist and thus can be relied upon. Second, we sought to identify the cases in which AI is effective.

Methods: Fifty canine and feline radiographic studies in DICOM format were anonymized, reported by 11 board-certified veterinary radiologists (ECVDI or ACVR), and processed with widely used commercial AI software dedicated to small animal radiography (SignalRAY®, SignalPET®, Dallas, TX, USA). The AI software used a deep-learning algorithm and returned a coded abnormal or normal diagnosis for each finding in the study. The radiologists provided a written report in English. All reports' findings were coded into categories matching the codes from the AI software and classified as normal or abnormal. The sensitivity, specificity, and accuracy of each radiologist and of the AI software were calculated. The variance in agreement between each radiologist and the AI software was measured to quantify the ambiguity of each radiological finding.
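The metrics described above can be sketched in a few lines. The following is a minimal illustration with hypothetical toy data (the study's actual coding scheme and data are not reproduced here): per-reader sensitivity, specificity, and accuracy computed from binary normal/abnormal calls, and a per-finding "ambiguity" score taken as the variance in agreement between the radiologists' calls and the AI output.

```python
# Toy sketch of the abstract's metrics; all data below are hypothetical.
from statistics import pvariance

# 1 = abnormal, 0 = normal; reference labels for six findings
truth  = [1, 1, 0, 0, 1, 0]
reader = [1, 0, 0, 0, 1, 1]  # one reader's calls on the same findings

def metrics(pred, truth):
    """Sensitivity, specificity, and accuracy from binary calls."""
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))
    tn = sum(p == 0 and t == 0 for p, t in zip(pred, truth))
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, truth))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(truth),
    }

# Ambiguity of one finding: variance of per-radiologist agreement with
# the AI's call (0 variance = all readers agree with AI, unambiguous).
ai_call = 1
radiologist_calls = [1, 1, 0, 1, 0]  # five hypothetical readers
agreement = [int(c == ai_call) for c in radiologist_calls]
ambiguity = pvariance(agreement)
```

A finding on which every radiologist matches the AI yields zero variance (low ambiguity); maximal disagreement pushes the variance toward 0.25, marking the finding as high-ambiguity.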

Results: AI matched the best radiologist in accuracy and was more specific but less sensitive than the human radiologists. AI outperformed the median radiologist overall and in both low- and high-ambiguity cases. In high-ambiguity cases, AI's accuracy remained high; it was less effective at detecting abnormalities but better at identifying normal findings. The study confirmed AI's reliability, especially in low-ambiguity scenarios.

Conclusion: Our findings suggest that AI performs almost as well as the best veterinary radiologist across all settings for descriptive radiographic findings. However, its strength lies more in confirming normality than in detecting abnormalities, and it does not provide differential diagnoses. Therefore, broader use of AI could reliably increase diagnostic availability but still requires human input. Given the distinct strengths of human experts and AI, and the differences between sensitivity and specificity and between low- and high-ambiguity settings, AI will likely complement rather than replace human experts.

Source journal: Frontiers in Veterinary Science (Veterinary, General Veterinary)
CiteScore: 4.80
Self-citation rate: 9.40%
Annual article count: 1870
Review time: 14 weeks
Journal description: Frontiers in Veterinary Science is a global, peer-reviewed, Open Access journal that bridges animal and human health, brings a comparative approach to medical and surgical challenges, and advances innovative biotechnology and therapy. Veterinary research today is interdisciplinary, collaborative, and socially relevant, transforming how we understand and investigate animal health and disease. Fundamental research in emerging infectious diseases, predictive genomics, stem cell therapy, and translational modelling is grounded within the integrative social context of public and environmental health, wildlife conservation, novel biomarkers, societal well-being, and cutting-edge clinical practice and specialization. Frontiers in Veterinary Science brings a 21st-century approach, networked, collaborative, and Open Access, to communicate this progress and innovation to both the specialist and the wider audience of readers in the field. Frontiers in Veterinary Science publishes articles on outstanding discoveries across a wide spectrum of translational, foundational, and clinical research. The journal's mission is to bring all relevant veterinary sciences together on a single platform with the goal of improving animal and human health.