Applying Conformal Prediction to a Deep Learning Model for Intracranial Hemorrhage Detection to Improve Trustworthiness.

Journal: Radiology-Artificial Intelligence (IF 8.1; JCR Q1, Computer Science, Artificial Intelligence). Pub Date: 2024-11-27. DOI: 10.1148/ryai.240032

Cooper Gamble, Shahriar Faghani, Bradley J Erickson

Publication type: Journal Article. Pages: e240032.

Abstract

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content.

Purpose: To apply conformal prediction to a deep learning (DL) model for intracranial hemorrhage (ICH) detection and to evaluate both the model's detection performance and its accuracy in identifying challenging cases.

Materials and Methods: This was a retrospective study (November 2017 through December 2017) of 491 noncontrast head CT volumes from the CQ500 dataset, in which three senior radiologists annotated sections containing ICH. The dataset was split into definite and challenging (uncertain) subsets, where challenging images were defined as those on which the readers disagreed. A DL model was trained on 146 patients (mean age, 45.7 years; 70 females, 76 males) from the definite data (training dataset) to perform ICH localization and classification into five classes. To develop an uncertainty-aware DL model, 1,546 sections of the definite data (calibration dataset) were used for Mondrian conformal prediction (MCP). The uncertainty-aware DL model was tested on 8,401 definite and challenging sections to assess its ability to identify challenging sections. The difference in predictive performance (P value) and the ability to identify challenging sections (accuracy) were reported.

Results: After the MCP procedure, the model achieved an F1 score of 0.920 for ICH classification on the test dataset. Additionally, it correctly identified 6,837 of the 6,856 total challenging sections as challenging (99.7% accuracy). It did not incorrectly label any definite sections as challenging.

Conclusion: The uncertainty-aware MCP-augmented DL model achieved high performance in ICH detection and high accuracy in identifying challenging sections, suggesting its usefulness in automated ICH detection and its potential to increase the trustworthiness of DL models in radiology. ©RSNA, 2024.
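The abstract does not say which nonconformity score, significance level, or decision rule the authors used, so the following is only a minimal sketch of Mondrian (class-conditional) conformal prediction under stated assumptions: the nonconformity score is one minus the model's predicted probability for a class, the significance level `epsilon` is arbitrary, and a section is flagged as challenging whenever its prediction set is not a single class. All function names and the toy two-class data are hypothetical and are not taken from the paper.

```python
import numpy as np


def mondrian_p_values(cal_probs, cal_labels, test_probs):
    """Class-conditional conformal p-values, shape (n_test, n_classes).

    Assumption: nonconformity score = 1 - predicted probability of the class.
    """
    n_classes = cal_probs.shape[1]
    p_values = np.zeros((test_probs.shape[0], n_classes))
    for k in range(n_classes):
        # Scores of calibration examples whose true label is k (the Mondrian category).
        cal_scores = 1.0 - cal_probs[cal_labels == k, k]
        # Score each test example would have if its label were k.
        test_scores = 1.0 - test_probs[:, k]
        # Count calibration scores at least as nonconforming as each test score.
        n_at_least = (cal_scores[None, :] >= test_scores[:, None]).sum(axis=1)
        p_values[:, k] = (n_at_least + 1) / (cal_scores.size + 1)
    return p_values


def prediction_sets(p_values, epsilon):
    """Boolean mask: class k is in the set when its p-value exceeds epsilon."""
    return p_values > epsilon


def flag_uncertain(sets):
    """Assumed rule: flag a case when its set is empty or has several classes."""
    return sets.sum(axis=1) != 1


# Toy calibration data: four examples per class, one hard example in each class.
cal_probs = np.array([
    [0.90, 0.10], [0.40, 0.60], [0.95, 0.05], [0.85, 0.15],  # true class 0
    [0.10, 0.90], [0.70, 0.30], [0.20, 0.80], [0.05, 0.95],  # true class 1
])
cal_labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Two confident test cases and one ambiguous case.
test_probs = np.array([[0.97, 0.03], [0.02, 0.98], [0.50, 0.50]])

p_vals = mondrian_p_values(cal_probs, cal_labels, test_probs)
sets = prediction_sets(p_vals, epsilon=0.25)
flags = flag_uncertain(sets)
print(p_vals)
print(flags)
```

Computing the p-values separately per class is what makes the procedure "Mondrian": each class is calibrated only against calibration examples of that class, which is intended to keep the error guarantee class-wise rather than only on average. With only four calibration examples per class the smallest attainable p-value is 1/5, which is why the toy example uses a large `epsilon`; a calibration set of the size reported in the abstract would permit much smaller values.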

Source journal: Radiology-Artificial Intelligence

CiteScore: 16.20

Self-citation rate: 1.00%

Journal description: Radiology: Artificial Intelligence is a bimonthly publication that focuses on emerging applications of machine learning and artificial intelligence in imaging across various disciplines. The journal is available online and accepts multiple manuscript types, including Original Research, Technical Developments, Data Resources, Review articles, Editorials, Letters to the Editor and Replies, Special Reports, and AI in Brief.