Low responsiveness of machine learning models to critical or deteriorating health conditions.

Communications Medicine | IF 5.4 | Q1 (Medicine, Research & Experimental) | Pub Date: 2025-03-11 | DOI: 10.1038/s43856-025-00775-0
Tanmoy Sarkar Pias, Sharmin Afrose, Moon Das Tuli, Ipsita Hamid Trisha, Xinwei Deng, Charles B Nemeroff, Danfeng Daphne Yao
Communications Medicine, volume 5 (1), page 62. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11897252/pdf/

Abstract

Background: Machine learning (ML) based mortality prediction models can be immensely useful in intensive care units. Such a model should generate warnings to alert physicians when a patient's condition rapidly deteriorates, or their vitals are in highly abnormal ranges. Before clinical deployment, it is important to comprehensively assess a model's ability to recognize critical patient conditions.

Methods: We develop multiple medical ML testing approaches, including a gradient ascent method and neural activation map. We systematically assess these machine learning models' ability to respond to serious medical conditions using additional test cases, some of which are time series. Guided by medical doctors, our evaluation involves multiple machine learning models, resampling techniques, and four datasets for two clinical prediction tasks.
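As a rough illustration of how a gradient ascent search can produce test inputs, the sketch below starts from a baseline feature vector and repeatedly nudges it in the direction that raises a model's predicted risk, while keeping each feature inside a fixed range. The toy logistic model, its weights, the bounds, and the function name `gradient_ascent_case` are all hypothetical; this is not the authors' implementation, and the abstract does not specify the objective they optimize.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def gradient_ascent_case(weights, bias, x_start, lower, upper,
                         step=0.1, n_steps=200):
    """Move an input toward higher predicted risk for a toy logistic model.

    Hypothetical helper: the model is p(x) = sigmoid(weights . x + bias),
    and each step follows the gradient of p with respect to the input x.
    """
    x = np.asarray(x_start, dtype=float).copy()
    for _ in range(n_steps):
        p = sigmoid(weights @ x + bias)
        grad = p * (1.0 - p) * weights          # dp/dx for the logistic model
        x = np.clip(x + step * grad, lower, upper)  # stay inside feature bounds
    return x, float(sigmoid(weights @ x + bias))


# Made-up two-feature example (standardized units, assumed bounds of [-3, 3]).
weights = np.array([0.8, -0.5])
bias = -1.0
x0 = np.zeros(2)
risk_start = float(sigmoid(weights @ x0 + bias))
x_end, risk_end = gradient_ascent_case(weights, bias, x0, -3.0, 3.0)
print(risk_start, risk_end, x_end)
```

For a real model the gradient would come from automatic differentiation rather than a closed form, and the bounds would need to reflect clinically meaningful ranges.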

Results: We identify serious deficiencies in the models' responsiveness, with the models being unable to recognize severely impaired medical conditions or rapidly deteriorating health. For in-hospital mortality prediction, the models tested using our synthesized cases fail to recognize 66% of the injuries. In some instances, the models fail to generate adequate mortality risk scores for all test cases. Our study identifies similar kinds of deficiencies in the responsiveness of 5-year breast and lung cancer prediction models.
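A failure rate of this kind can be read as the share of critical test cases whose predicted risk stays below an alert threshold. The sketch below shows that arithmetic only; the function name `missed_fraction`, the threshold of 0.5, and the scores are made up for illustration and are not taken from the study.

```python
def missed_fraction(risk_scores, threshold=0.5):
    """Fraction of critical test cases scored below the alert threshold.

    Hypothetical helper: every score is assumed to belong to a test case
    that a responsive model should flag as high risk.
    """
    if not risk_scores:
        raise ValueError("risk_scores must not be empty")
    missed = sum(1 for score in risk_scores if score < threshold)
    return missed / len(risk_scores)


# Made-up scores for four synthesized critical cases.
example_scores = [0.9, 0.2, 0.6, 0.1]
example_rate = missed_fraction(example_scores)
print(example_rate)
```

The choice of threshold matters: the same scores can yield very different failure rates depending on where the alert cutoff is set.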

Conclusions: Using generated test cases, we find that statistical machine-learning models trained solely from patient data are grossly insufficient and have many dangerous blind spots. Most of the ML models tested fail to respond adequately to critically ill patients. How to incorporate medical knowledge into clinical machine learning models is an important future research direction.
