An Inherently Interpretable AI model improves Screening Speed and Accuracy for Early Diabetic Retinopathy

Kerol R. Djoumessi Donteu, Ziwei Huang, Laura Kuehlewein, Annekatrin Rickmann, Natalia Simon, Lisa M. Koch, Philipp Berens

medRxiv - Ophthalmology, 2024-06-27. DOI: 10.1101/2024.06.27.24309574
Abstract
Background: Diabetic retinopathy (DR) is a frequent complication of diabetes, affecting millions of people worldwide. Screening for DR from fundus images was among the first successful applications of modern artificial intelligence in medicine. However, current state-of-the-art systems typically rely on black-box models for referral decisions and therefore require post-hoc explanation methods for AI-human interaction.
Methods: In this retrospective reader study, we evaluated an inherently interpretable deep learning model for early DR screening; the model explicitly represents the local evidence of DR as part of its network architecture. We trained the network on 34,350 high-quality fundus images from a publicly available dataset and validated its state-of-the-art performance on ten external datasets. To test whether the model's class evidence maps highlight clinically relevant information, we obtained detailed lesion annotations from ophthalmologists on 65 images. Finally, we assessed the clinical usefulness of the model in a reader study comparing DR screening without AI support to screening with AI support, with and without AI explanations.
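The abstract does not spell out the architecture. A minimal sketch of the underlying idea, a network whose image-level prediction aggregates spatially local class evidence so that the pre-pooling evidence map itself serves as the explanation, might look like the following (PyTorch; the class name LocalEvidenceNet, the layer sizes, and the receptive field are illustrative assumptions, not the authors' implementation):

```python
# Minimal sketch of an inherently interpretable classifier: a
# small-receptive-field CNN produces a spatial map of per-location
# class evidence; average pooling turns the map into the image-level
# logit, so the pre-pooling map is the model's own explanation.
import torch
import torch.nn as nn

class LocalEvidenceNet(nn.Module):  # hypothetical name, not the paper's model
    def __init__(self, n_classes: int = 2):
        super().__init__()
        # Backbone with a deliberately small receptive field, so each
        # spatial position in the output only "sees" a local patch.
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3), nn.ReLU(),
            nn.Conv2d(128, 256, kernel_size=3), nn.ReLU(),
        )
        # 1x1 conv maps local features to per-location class evidence.
        self.evidence = nn.Conv2d(256, n_classes, kernel_size=1)

    def forward(self, x):
        feats = self.features(x)
        evidence_map = self.evidence(feats)      # (B, n_classes, H', W')
        logits = evidence_map.mean(dim=(2, 3))   # global average pooling
        return logits, evidence_map

model = LocalEvidenceNet()
x = torch.randn(1, 3, 512, 512)                  # one fundus image
logits, evidence_map = model(x)
# evidence_map can be upsampled and overlaid on the fundus image: high
# values mark regions the model counts as evidence for the DR class.
```

Because the classification logit is, by construction, the average of the evidence map, the highlighted regions are the decision rather than a post-hoc approximation of it.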
Results: The inherently interpretable deep learning model reached an accuracy of .906 [.900-.913] (95% confidence interval) and an AUC of .904 [.894-.913] on the internal test set, with similar performance on the external datasets. High-evidence regions extracted directly from the model contained clinically relevant lesions such as microaneurysms or haemorrhages with a precision of .960 [.941-.976]. Decision support in which the model highlighted high-evidence regions in the image improved screening accuracy for difficult cases and increased screening speed.
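For illustration, one plausible way to compute such a region-level precision, i.e. the fraction of highlighted regions that overlap at least one annotated lesion, is sketched below (the threshold and the connected-component definition of a "region" are assumptions; the paper's exact matching protocol is not given in the abstract):

```python
# Hedged sketch: score whether high-evidence regions contain annotated
# lesions. Precision = highlighted regions overlapping a lesion,
# divided by all highlighted regions.
import numpy as np
from scipy import ndimage

def region_precision(evidence_map: np.ndarray,
                     lesion_mask: np.ndarray,
                     threshold: float = 0.5) -> float:
    """evidence_map: (H, W) model evidence; lesion_mask: (H, W) binary
    ophthalmologist annotation resampled to the same grid. The 0.5
    threshold is an illustrative choice, not the paper's."""
    highlighted = evidence_map > threshold
    labels, n_regions = ndimage.label(highlighted)  # connected components
    if n_regions == 0:
        return float("nan")                          # nothing highlighted
    hits = sum(
        lesion_mask[labels == region_id].any()
        for region_id in range(1, n_regions + 1)
    )
    return hits / n_regions
```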
Interpretation: Inherently interpretable deep learning models can reach state-of-the-art performance and support screening for early DR by improving human-AI collaboration.