Improving disease classification performance and explainability of deep learning models in radiology with heatmap generators.

Frontiers in Radiology. Pub Date: 2022-10-11. eCollection Date: 2022-01-01. DOI: 10.3389/fradi.2022.991683

Akino Watanabe, Sara Ketabi, Khashayar Namdar, Farzad Khalvati

Frontiers in Radiology, volume 2, article 991683. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10365129/pdf/


Abstract

As deep learning becomes widely used in radiology, the explainability of artificial intelligence (AI) models is increasingly essential for gaining clinicians' trust when the models are used for diagnosis. In this research, three sets of experiments were conducted with a U-Net architecture to improve disease classification performance while also enhancing the heatmaps that correspond to the model's focus, by incorporating heatmap generators during training. All experiments used a dataset containing chest radiographs, a label for each image from one of three conditions ("normal", "congestive heart failure (CHF)", and "pneumonia"), and numerical information on a radiologist's eye-gaze coordinates on the images. The paper that introduced this dataset developed a U-Net model to show how eye-gaze data can be used in multi-modal training for explainability improvement and disease classification; that model was treated as the baseline for this research. To compare classification performance across the three experiment sets and the baseline model, the area under the receiver operating characteristic curve (AUC) was measured together with its 95% confidence interval (CI). The best method achieved an AUC of 0.913 (95% CI [0.860, 0.966]). The "pneumonia" and "CHF" classes, which the baseline model struggled most to classify, showed the greatest improvements, reaching AUCs of 0.859 (95% CI [0.732, 0.957]) and 0.962 (95% CI [0.933, 0.989]), respectively. The decoder of the U-Net in the best-performing proposed method generated heatmaps that highlight the image regions that determine the model's classifications. These predicted heatmaps, which can be used to explain the model, also improved to align well with the radiologist's eye-gaze data. Hence, this work showed that incorporating heatmap generators and eye-gaze information into training can simultaneously improve disease classification and provide explainable visuals that align well with how the radiologist viewed the chest radiographs when making a diagnosis.
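The abstract reports each AUC with a 95% CI but does not say how the intervals were obtained. As a rough illustration only, the sketch below computes a binary (one-vs-rest) AUC and a percentile-bootstrap interval with NumPy. The percentile bootstrap, the function names `auc_score` and `bootstrap_auc_ci`, and the parameters `n_boot`, `alpha`, and `seed` are all assumptions made for this example, not the authors' procedure.

```python
import numpy as np


def auc_score(y_true, y_score):
    """Binary AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true]
    neg = y_score[~y_true]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))


def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Point AUC plus a percentile-bootstrap (1 - alpha) interval.

    Assumption: cases are resampled with replacement; resamples that contain
    only one class are redrawn because AUC is undefined for them.
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    n = y_true.size
    point = auc_score(y_true, y_score)  # also validates that both classes exist
    aucs = []
    while len(aucs) < n_boot:
        idx = rng.integers(0, n, size=n)
        labels = y_true[idx]
        if labels.all() or not labels.any():
            continue  # single-class resample, draw again
        aucs.append(auc_score(labels, y_score[idx]))
    lo, hi = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return point, (float(lo), float(hi))
```

For a three-class problem such as normal / CHF / pneumonia, one option is to call `bootstrap_auc_ci` once per class, passing `labels == class_id` as `y_true` and that class's predicted probability as `y_score`. Other interval methods (for example DeLong's) are equally plausible choices and may give different bounds.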

