Enhancing military medical evacuation dispatching with armed escort management: comparing model-based reinforcement learning approaches

Andrew G. Gelbard, Phillip R. Jenkins, Matthew J. Robbins
The Journal of Defense Modeling and Simulation: Applications, Methodology, Technology. Published 2024-04-10. DOI: 10.1177/15485129241229762 (https://doi.org/10.1177/15485129241229762)

Abstract

The military medical evacuation (MEDEVAC) dispatching problem involves determining optimal policies for evacuating combat casualties to maximize patient survivability during military operations. This study explores a variation of the MEDEVAC dispatching problem, focusing on controlling armed escorts using a Markov decision process (MDP) model and model-based reinforcement learning (RL) approaches. A discounted, continuous-time MDP model over an infinite horizon is developed to maximize the expected total discounted reward of the system. Two model-based RL solution approaches are proposed: one utilizing semi-gradient descent Q-learning and another employing semi-gradient descent SARSA. A computational example, set in western and central Africa during contingency operations, assesses the performance of the RL-generated policies against the myopic policy, which military medical planners currently employ. Solution quality is derived from expected response time, a crucial determinant of life-saving potential in MEDEVAC operations. The research also explores sensitivity analysis and excursion scenarios to evaluate the RL-generated policies further. By explicitly controlling armed escort assets, dispatching authorities can better manage the location and allocation of these resources throughout combat operations. The findings of this study have the potential to inform military medical planning, operations, and tactics, ultimately leading to improved MEDEVAC system performance and higher patient survivability rates.
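
The two update rules named in the abstract can be illustrated with a minimal sketch, assuming the textbook form of semi-gradient Q-learning and semi-gradient SARSA with a linear action-value approximation. The abstract does not specify the paper's state features, reward function, step sizes, or how the model-based component is used, so every name below (`q_value`, `discount_over`, `semi_gradient_step`, `q_learning_update`, `sarsa_update`) and every number is hypothetical and is not the authors' implementation. The `discount_over` helper assumes the usual continuous-time convention of discounting by exp(-rate * t) over the time elapsed between decision epochs.

```python
import math
from typing import List, Sequence

Vector = List[float]


def q_value(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear approximation q(s, a) = w . phi(s, a) (assumed form)."""
    return sum(w * x for w, x in zip(weights, features))


def discount_over(sojourn_time: float, rate: float) -> float:
    """Assumed continuous-time discount exp(-rate * t) between decisions."""
    return math.exp(-rate * sojourn_time)


def semi_gradient_step(weights: Sequence[float], features: Sequence[float],
                       target: float, step_size: float) -> Vector:
    """Move the weights toward a target that is held constant.

    Only the gradient of q(s, a) is used, not the gradient of the
    bootstrapped target, which is what makes the method semi-gradient.
    For a linear q, that gradient is the feature vector itself.
    """
    td_error = target - q_value(weights, features)
    return [w + step_size * td_error * x for w, x in zip(weights, features)]


def q_learning_update(weights: Sequence[float], features: Sequence[float],
                      reward: float,
                      next_features_by_action: Sequence[Sequence[float]],
                      discount: float, step_size: float) -> Vector:
    """Off-policy target: bootstrap from the best available next action."""
    best_next = max(
        (q_value(weights, f) for f in next_features_by_action), default=0.0
    )
    target = reward + discount * best_next
    return semi_gradient_step(weights, features, target, step_size)


def sarsa_update(weights: Sequence[float], features: Sequence[float],
                 reward: float, next_features: Sequence[float],
                 discount: float, step_size: float) -> Vector:
    """On-policy target: bootstrap from the next action actually taken."""
    target = reward + discount * q_value(weights, next_features)
    return semi_gradient_step(weights, features, target, step_size)


# Illustrative numbers only: two features, two candidate next actions.
w = [1.0, 2.0]
phi = [1.0, 0.0]
w_q = q_learning_update(w, phi, 1.0, [[1.0, 0.0], [0.0, 1.0]], 0.5, 0.5)
w_s = sarsa_update(w, phi, 1.0, [1.0, 0.0], 0.5, 0.5)
```

The only difference between the two rules is the bootstrapped target: Q-learning takes the maximum over the next actions, while SARSA uses the value of the action the current policy actually selects next. In the toy numbers above, that difference alone produces different weight updates from the same transition.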