Enhancing a deep learning model for pulmonary nodule malignancy risk estimation in chest CT with uncertainty estimation.

European Radiology (IF 4.7; Zone 2, Medicine; JCR Q1, Radiology, Nuclear Medicine & Medical Imaging). Pub Date: 2024-10-01. Epub Date: 2024-03-27. DOI: 10.1007/s00330-024-10714-7

Dré Peeters, Natália Alves, Kiran V Venkadesh, Renate Dinnessen, Zaigham Saghir, Ernst T Scholten, Cornelia Schaefer-Prokop, Rozemarijn Vliegenthart, Mathias Prokop, Colin Jacobs

Pages: 6639-6651. Publication type: Journal Article. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11399205/pdf/

Citations: 0

Abstract

Enhancing a deep learning model for pulmonary nodule malignancy risk estimation in chest CT with uncertainty estimation.

Objective: To investigate the effect of uncertainty estimation on the performance of a deep learning (DL) algorithm for estimating the malignancy risk of pulmonary nodules.

Methods and materials: In this retrospective study, we integrated an uncertainty estimation method into a previously developed DL algorithm for nodule malignancy risk estimation. Uncertainty thresholds were developed using CT data from the Danish Lung Cancer Screening Trial (DLCST), containing 883 nodules (65 malignant) collected between 2004 and 2010. We used thresholds at the 90th and 95th percentiles of the uncertainty score distribution to categorize nodules into certain and uncertain groups. External validation was performed on clinical CT data from a tertiary academic center containing 374 nodules (207 malignant) collected between 2004 and 2012. DL performance was measured using the area under the ROC curve (AUC) for the full set of nodules, for the certain cases, and for the uncertain cases. Additionally, nodule characteristics were compared between the groups to identify characteristics associated with uncertainty.

Results: The DL algorithm performed significantly worse in the uncertain group than in the certain group, both in DLCST (AUC 0.62 (95% CI: 0.49, 0.76) vs 0.93 (95% CI: 0.88, 0.97); p < .001) and in the clinical dataset (AUC 0.62 (95% CI: 0.50, 0.73) vs 0.90 (95% CI: 0.86, 0.94); p < .001). The uncertain group included larger benign nodules as well as more part-solid and non-solid nodules than the certain group.

Conclusion: The integrated uncertainty estimation showed excellent performance for identifying uncertain cases in which the DL-based nodule malignancy risk estimation algorithm had significantly worse performance.

Clinical relevance statement: Deep learning algorithms often lack the ability to gauge and communicate uncertainty. For safe clinical implementation, uncertainty estimation is of pivotal importance to identify cases where the deep learning algorithm is uncertain about its prediction.

Key points:
• Deep learning (DL) algorithms often lack uncertainty estimation, which could reduce the risk of errors and improve safety during clinical adoption of the DL algorithm.
• Uncertainty estimation identifies pulmonary nodules in which the discriminative performance of the DL algorithm is significantly worse.
• Uncertainty estimation can further enhance the benefits of the DL algorithm and improve its safety and trustworthiness.
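The thresholding step described in the Methods can be illustrated with a minimal sketch. It assumes each nodule already has a malignancy risk and a scalar uncertainty score (the abstract does not state how that score is computed), and that nodules whose uncertainty exceeds the chosen percentile form the uncertain group. The helper names (`percentile_threshold`, `split_by_uncertainty`, `auc`) and the toy numbers are hypothetical; this is not the authors' implementation or data.

```python
from statistics import quantiles


def percentile_threshold(scores, pct):
    """Return the pct-th percentile (1..99) of scores, linearly interpolated."""
    if not 1 <= pct <= 99:
        raise ValueError("pct must be an integer between 1 and 99")
    cut_points = quantiles(scores, n=100, method="inclusive")  # 99 cut points
    return cut_points[pct - 1]


def auc(labels, risks):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [r for y, r in zip(labels, risks) if y == 1]
    neg = [r for y, r in zip(labels, risks) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one malignant and one benign case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def split_by_uncertainty(labels, risks, uncertainties, threshold):
    """Split cases into (certain, uncertain); each group is (labels, risks).

    Assumption: a case is 'uncertain' when its score is above the threshold.
    """
    certain, uncertain = ([], []), ([], [])
    for y, r, u in zip(labels, risks, uncertainties):
        group = uncertain if u > threshold else certain
        group[0].append(y)
        group[1].append(r)
    return certain, uncertain


# Hypothetical development-set uncertainty scores (stand-in for DLCST).
dev_uncertainty = [i / 100 for i in range(101)]
t90 = percentile_threshold(dev_uncertainty, 90)
t95 = percentile_threshold(dev_uncertainty, 95)

# Hypothetical external cases: label 1 = malignant, 0 = benign.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
risks = [0.05, 0.20, 0.80, 0.95, 0.70, 0.30, 0.10, 0.60]
uncert = [0.10, 0.20, 0.15, 0.05, 0.95, 0.97, 0.30, 0.99]

# The threshold is fixed on the development set and applied unchanged here.
certain, uncertain = split_by_uncertainty(labels, risks, uncert, t90)
print("90th / 95th percentile thresholds:", t90, t95)
print("AUC, all cases:      ", auc(labels, risks))
print("AUC, certain cases:  ", auc(*certain))
print("AUC, uncertain cases:", auc(*uncertain))
```

Fixing the threshold on the development data before looking at the external set mirrors the design described in the abstract. The confidence intervals and p-values reported there would need an additional procedure that this sketch does not attempt.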

Source journal

European Radiology (category: Medicine, Nuclear Medicine)

CiteScore: 11.60
Self-citation rate: 8.50%
Articles published: 874
Review time: 2-4 weeks

About the journal: European Radiology (ER) continuously updates scientific knowledge in radiology by publishing strong original articles and state-of-the-art reviews written by leading radiologists. A well-balanced combination of review articles, original papers, short communications from European radiological congresses, and information on society matters makes ER an indispensable source of current information in this field. It is the journal of the European Society of Radiology and the official journal of a number of societies. From 2004 to 2008, supplements to European Radiology were published under its companion, European Radiology Supplements, ISSN 1613-3749.