Machine learning models' assessment: trust and performance.

IF 2.6 | Zone 4 (Medicine) | JCR Q2 (Computer Science, Interdisciplinary Applications) | Medical & Biological Engineering & Computing | Pub Date: 2024-11-01 | Epub Date: 2024-06-08 | DOI: 10.1007/s11517-024-03145-5
S Sousa, S Paredes, T Rocha, J Henriques, J Sousa, L Gonçalves
Medical & Biological Engineering & Computing, pp. 3397-3410. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11485107/pdf/

Abstract

The black-box nature common to machine learning models is an obstacle to their application in health care. Their widespread adoption is limited by a significant "lack of trust." The main goal of this work is therefore to develop an evaluation approach that assesses trust and performance simultaneously. Trust assessment is based on (i) model robustness (stability assessment), (ii) confidence (95% CI of the geometric mean), and (iii) interpretability (comparison of each model's feature ranking with clinical evidence). Performance is assessed through the geometric mean. For validation, the approach was applied to patient stratification in cardiovascular risk assessment, using a Portuguese dataset (N=1544). Five models were compared: (i) the GRACE score, the most common risk assessment tool in Portugal for patients with acute coronary syndrome; (ii) logistic regression; (iii) Naïve Bayes; (iv) decision trees; and (v) a rule-based approach previously developed by this team. The results confirm that trust and performance can be assessed simultaneously. The rule-based approach appears to have potential for clinical application: it provides a high level of trust in its operation while outperforming the GRACE model, which should improve acceptance by physicians. This may increase the likelihood of effectively aiding clinical decisions.
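
The abstract names the geometric mean as the performance measure and its 95% CI as the confidence component, but does not spell out either computation. The sketch below is one plausible reading, not the authors' method: it assumes the geometric mean is taken over sensitivity and specificity (a common choice for imbalanced binary outcomes) and that the interval is a percentile bootstrap. The function names, the toy labels, and the resampling settings are all hypothetical.

```python
import math
import random


def geometric_mean_score(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (assumed definition).

    Labels are binary: 1 = event, 0 = no event.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    # Guard against resamples that contain only one class.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the geometric mean (assumed CI method)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample patients with replacement, keeping label/prediction pairs aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(
            geometric_mean_score([y_true[i] for i in idx], [y_pred[i] for i in idx])
        )
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Toy data for illustration only; these are not values from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

gm = geometric_mean_score(y_true, y_pred)
ci_low, ci_high = bootstrap_ci(y_true, y_pred)
print(f"geometric mean = {gm:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Under this reading, a narrower interval would indicate higher confidence in the reported performance, so two models with similar geometric means could still differ on the trust dimension.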

Source journal
Medical & Biological Engineering & Computing (Medicine; Engineering: Biomedical)
CiteScore: 6.00
Self-citation rate: 3.10%
Articles published: 249
Review time: 3.5 months
Journal description: Founded in 1963, Medical & Biological Engineering & Computing (MBEC) continues to serve the biomedical engineering community, covering the entire spectrum of biomedical and clinical engineering. The journal presents exciting and vital experimental and theoretical developments in biomedical science and technology, and reports on advances in computer-based methodologies in these multidisciplinary subjects. The journal also incorporates new and evolving technologies including cellular engineering and molecular imaging. MBEC publishes original research articles as well as reviews and technical notes. Its Rapid Communications category focuses on material of immediate value to the readership, while the Controversies section provides a forum to exchange views on selected issues, stimulating a vigorous and informed debate in this exciting and high-profile field. MBEC is an official journal of the International Federation of Medical and Biological Engineering (IFMBE).