Occupation Measure Heuristics to Solve Stochastic Shortest Path with Dead Ends

Milton Condori Fernández, Leliane Nunes de Barros, Karina Valdivia Delgado
Published in: 2018 7th Brazilian Conference on Intelligent Systems (BRACIS), 2018-10-01
DOI: 10.1109/bracis.2018.00096 (https://doi.org/10.1109/bracis.2018.00096)

Abstract

The most efficient approach to solving probabilistic planning problems models them as stochastic shortest path (SSP) problems and uses heuristic search to find a policy that minimizes the expected accumulated cost to reach the goal (the MINCOST criterion). However, this approach can solve problems with dead ends (states from which the goal cannot be reached) only if an efficient dead-end detection heuristic is used. An alternative is to plan in two phases: first maximize the probability of reaching the goal (MAXPROB), then minimize the expected cost (MINCOST). While several heuristics exist for MINCOST, no efficient heuristics are known for MAXPROB. Recent work proposed the first heuristic that takes probabilities into account, called h_pom, which solves a relaxed version of an SSP as a linear program in the dual space. However, to solve large problems with dead ends, h_pom must be augmented with a dead-end detection heuristic (e.g., h_pom combined with h_max). In this work, we propose two new heuristics based on h_pom. The first, h^pe_pom(s), estimates the minimal cost of reaching the goal from state s; it includes new variables and constraints for dead ends and adds an expected penalty for reaching them. The second, h^p_pom(s), estimates the maximum probability of reaching the goal from state s and is used to solve MAXPROB problems by ignoring action costs. We claim that h^p_pom(s) is the first heuristic for MAXPROB. Empirical results show that h^pe_pom can solve larger planning instances than h_pom combined with h_max.
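To make the two criteria concrete, here is a minimal sketch on a hypothetical four-state SSP with a single dead end. It uses plain value iteration, not the linear-programming heuristics proposed in the paper; the state names, costs, probabilities, and the `penalty` parameter are all assumptions chosen for illustration.

```python
# Toy SSP (hypothetical): TRANSITIONS[state][action] = (cost, [(prob, next_state), ...])
TRANSITIONS = {
    "s0": {
        "risky": (1.0, [(0.5, "g"), (0.5, "d")]),
        "safe": (1.0, [(1.0, "s1")]),
    },
    "s1": {"go": (3.0, [(0.9, "g"), (0.1, "d")])},
}
GOAL = "g"
DEAD_END = "d"  # no applicable actions, so the goal is unreachable from here


def max_prob(transitions, goal, dead_end, sweeps=100):
    """MAXPROB criterion: maximize the probability of reaching the goal.

    Action costs are ignored; only transition probabilities matter.
    """
    p = {s: 0.0 for s in transitions}
    p[goal], p[dead_end] = 1.0, 0.0
    for _ in range(sweeps):
        for s, actions in transitions.items():
            p[s] = max(
                sum(pr * p[t] for pr, t in outcomes)
                for _cost, outcomes in actions.values()
            )
    return p


def penalized_cost(transitions, goal, dead_end, penalty, sweeps=100):
    """MINCOST criterion with a fixed penalty charged on entering the dead end."""
    v = {s: 0.0 for s in transitions}
    v[goal], v[dead_end] = 0.0, penalty
    for _ in range(sweeps):
        for s, actions in transitions.items():
            v[s] = min(
                cost + sum(pr * v[t] for pr, t in outcomes)
                for cost, outcomes in actions.values()
            )
    return v


if __name__ == "__main__":
    print(max_prob(TRANSITIONS, GOAL, DEAD_END))
    print(penalized_cost(TRANSITIONS, GOAL, DEAD_END, penalty=100.0))
    print(penalized_cost(TRANSITIONS, GOAL, DEAD_END, penalty=5.0))
```

In this toy model the best achievable goal probability from s0 is 0.9, obtained by the "safe" action. With a large penalty the cost-minimizing choice at s0 agrees with that, while with a small penalty the cheaper "risky" action wins. This shows how the size of the dead-end penalty trades expected cost against the probability of reaching the goal.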