Optimal Treatment Strategies for Critical Patients with Deep Reinforcement Learning

IF 6.6 4区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE ACM Transactions on Intelligent Systems and Technology Pub Date : 2024-02-01 DOI:10.1145/3643856

Simi Job, Xiaohui Tao, Lin Li, Haoran Xie, Taotao Cai, Jianming Yong, Qing Li

{"title":"Optimal Treatment Strategies for Critical Patients with Deep Reinforcement Learning","authors":"Simi Job, Xiaohui Tao, Lin Li, Haoran Xie, Taotao Cai, Jianming Yong, Qing Li","doi":"10.1145/3643856","DOIUrl":null,"url":null,"abstract":"Personalized clinical decision support systems are increasingly being adopted due to the emergence of data-driven technologies, with this approach now gaining recognition in critical care. The task of incorporating diverse patient conditions and treatment procedures into critical care decision-making can be challenging due to the heterogeneous nature of medical data. Advances in Artificial Intelligence (AI), particularly Reinforcement Learning (RL) techniques, enables the development of personalized treatment strategies for severe illnesses by using a learning agent to recommend optimal policies. In this study, we propose a Deep Reinforcement Learning (DRL) model with a tailored reward function and an LSTM-GRU-derived state representation to formulate optimal treatment policies for vasopressor administration in stabilizing patient physiological states in critical care settings. Using an ICU dataset and the Medical Information Mart for Intensive Care (MIMIC-III) dataset, we focus on patients with Acute Respiratory Distress Syndrome (ARDS) that has led to Sepsis, to derive optimal policies that can prioritize patient recovery over patient survival. Both the DDQN (RepDRL-DDQN) and Dueling DDQN (RepDRL-DDDQN) versions of the DRL model surpass the baseline performance, with the proposed model’s learning agent achieving an optimal learning process across our performance measuring schemes. The robust state representation served as the foundation for enhancing the model’s performance, ultimately providing an optimal treatment policy focused on rapid patient recovery.","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"24 1","pages":""},"PeriodicalIF":6.6000,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Intelligent Systems and Technology","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3643856","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Personalized clinical decision support systems are increasingly being adopted due to the emergence of data-driven technologies, with this approach now gaining recognition in critical care. The task of incorporating diverse patient conditions and treatment procedures into critical care decision-making can be challenging due to the heterogeneous nature of medical data. Advances in Artificial Intelligence (AI), particularly Reinforcement Learning (RL) techniques, enables the development of personalized treatment strategies for severe illnesses by using a learning agent to recommend optimal policies. In this study, we propose a Deep Reinforcement Learning (DRL) model with a tailored reward function and an LSTM-GRU-derived state representation to formulate optimal treatment policies for vasopressor administration in stabilizing patient physiological states in critical care settings. Using an ICU dataset and the Medical Information Mart for Intensive Care (MIMIC-III) dataset, we focus on patients with Acute Respiratory Distress Syndrome (ARDS) that has led to Sepsis, to derive optimal policies that can prioritize patient recovery over patient survival. Both the DDQN (RepDRL-DDQN) and Dueling DDQN (RepDRL-DDDQN) versions of the DRL model surpass the baseline performance, with the proposed model’s learning agent achieving an optimal learning process across our performance measuring schemes. The robust state representation served as the foundation for enhancing the model’s performance, ultimately providing an optimal treatment policy focused on rapid patient recovery.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

利用深度强化学习优化危重病人的治疗策略

由于数据驱动技术的出现，个性化临床决策支持系统正被越来越多地采用，这种方法现在在重症监护领域也得到了认可。由于医疗数据的异质性，将不同的患者病情和治疗程序纳入重症监护决策是一项具有挑战性的任务。人工智能（AI）的进步，尤其是强化学习（RL）技术的进步，使得通过使用学习代理推荐最佳策略来制定重症个性化治疗策略成为可能。在本研究中，我们提出了一种具有定制奖励函数和 LSTM-GRU 衍生状态表示的深度强化学习（DRL）模型，用于制定最佳治疗策略，在重症监护环境中稳定患者的生理状态。利用重症监护室数据集和重症监护医疗信息市场（MIMIC-III）数据集，我们重点研究了导致败血症的急性呼吸窘迫综合征（ARDS）患者，得出了优先考虑患者康复而不是患者生存的最佳策略。DRL模型的DDQN（RepDRL-DDQN）和Dueling DDQN（RepDRL-DDDQN）版本都超过了基线性能，所提出模型的学习代理在我们的性能测量方案中实现了最佳学习过程。稳健的状态表示是提高模型性能的基础，最终提供了以患者快速康复为重点的最佳治疗策略。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

ACM Transactions on Intelligent Systems and Technology COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-COMPUTER SCIENCE, INFORMATION SYSTEMS

CiteScore

9.30

自引率

2.00%

发文量

131

期刊介绍： ACM Transactions on Intelligent Systems and Technology is a scholarly journal that publishes the highest quality papers on intelligent systems, applicable algorithms and technology with a multi-disciplinary perspective. An intelligent system is one that uses artificial intelligence (AI) techniques to offer important services (e.g., as a component of a larger system) to allow integrated systems to perceive, reason, learn, and act intelligently in the real world. ACM TIST is published quarterly (six issues a year). Each issue has 8-11 regular papers, with around 20 published journal pages or 10,000 words per paper. Additional references, proofs, graphs or detailed experiment results can be submitted as a separate appendix, while excessively lengthy papers will be rejected automatically. Authors can include online-only appendices for additional content of their published papers and are encouraged to share their code and/or data with other readers.