{"title":"基于线性POMDP代价的活动状态估计联合熵最小化","authors":"Timothy L. Molloy, G. Nair","doi":"10.23919/ACC53348.2022.9867569","DOIUrl":null,"url":null,"abstract":"Active state estimation is the problem of controlling a partially observed Markov decision process (POMDP) to minimize the uncertainty associated with its latent states. Selecting meaningful, yet tractable, measures of uncertainty to optimize is a key challenge in active state estimation, with the vast majority of popular uncertainty measures leading to POMDP costs that are nonlinear in the belief state, which makes them difficult (and often impossible) to optimize directly using standard POMDP solvers. To address this challenge, in this paper we propose the joint entropy of the state, observation, and control trajectories of POMDPs as a novel tractable uncertainty measure for active state estimation. By expressing the joint entropy in stage-additive form, we show that joint-entropy-minimization (JEM) problems can be reformulated as standard POMDPs with cost functions that are linear in the belief state. Linearity of the costs is of considerable practical significance since it enables the solution of our JEM problems directly using standard POMDP solvers. We illustrate JEM in simulations where it reduces the probability of error in state trajectory estimates whilst being more computationally efficient than competing active state estimation formulations.","PeriodicalId":366299,"journal":{"name":"2022 American Control Conference (ACC)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"JEM: Joint Entropy Minimization for Active State Estimation with Linear POMDP Costs\",\"authors\":\"Timothy L. Molloy, G. Nair\",\"doi\":\"10.23919/ACC53348.2022.9867569\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Active state estimation is the problem of controlling a partially observed Markov decision process (POMDP) to minimize the uncertainty associated with its latent states. Selecting meaningful, yet tractable, measures of uncertainty to optimize is a key challenge in active state estimation, with the vast majority of popular uncertainty measures leading to POMDP costs that are nonlinear in the belief state, which makes them difficult (and often impossible) to optimize directly using standard POMDP solvers. To address this challenge, in this paper we propose the joint entropy of the state, observation, and control trajectories of POMDPs as a novel tractable uncertainty measure for active state estimation. By expressing the joint entropy in stage-additive form, we show that joint-entropy-minimization (JEM) problems can be reformulated as standard POMDPs with cost functions that are linear in the belief state. Linearity of the costs is of considerable practical significance since it enables the solution of our JEM problems directly using standard POMDP solvers. 
We illustrate JEM in simulations where it reduces the probability of error in state trajectory estimates whilst being more computationally efficient than competing active state estimation formulations.\",\"PeriodicalId\":366299,\"journal\":{\"name\":\"2022 American Control Conference (ACC)\",\"volume\":\"6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 American Control Conference (ACC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.23919/ACC53348.2022.9867569\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 American Control Conference (ACC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/ACC53348.2022.9867569","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
JEM: Joint Entropy Minimization for Active State Estimation with Linear POMDP Costs
Active state estimation is the problem of controlling a partially observed Markov decision process (POMDP) to minimize the uncertainty associated with its latent states. Selecting meaningful, yet tractable, measures of uncertainty to optimize is a key challenge in active state estimation: the vast majority of popular uncertainty measures lead to POMDP costs that are nonlinear in the belief state, and are therefore difficult (and often impossible) to optimize directly with standard POMDP solvers. To address this challenge, in this paper we propose the joint entropy of the state, observation, and control trajectories of a POMDP as a novel, tractable uncertainty measure for active state estimation. By expressing this joint entropy in stage-additive form, we show that joint-entropy-minimization (JEM) problems can be reformulated as standard POMDPs with cost functions that are linear in the belief state. Linearity of the costs is of considerable practical significance because it allows JEM problems to be solved directly with standard POMDP solvers. We illustrate JEM in simulations, where it reduces the probability of error in state trajectory estimates whilst being more computationally efficient than competing active state estimation formulations.
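To make the stage-additive reformulation concrete, the following is a minimal sketch of the standard chain-rule argument that underlies it; the notation (states x_k, observations y_k, controls u_k, beliefs b_k) and the assumption of a deterministic policy are ours, and the exact per-stage cost used in the paper may differ. By the chain rule of entropy, the joint entropy of the trajectories decomposes over stages as

\[
H(X_{0:T}, Y_{0:T}, U_{0:T-1}) = H(X_0, Y_0) + \sum_{k=0}^{T-1} H\big(X_{k+1}, Y_{k+1}, U_k \mid X_{0:k}, Y_{0:k}, U_{0:k-1}\big).
\]

If the control u_k is a deterministic function of the past observations and controls, it contributes no entropy given the conditioning, and the Markov transition and measurement structure p(x_{k+1} | x_k, u_k), p(y_{k+1} | x_{k+1}) reduce each stage term to the expectation of a per-state cost

\[
c(x, u) = -\sum_{x'} \sum_{y'} p(x' \mid x, u)\, p(y' \mid x') \log\!\big( p(x' \mid x, u)\, p(y' \mid x') \big)
\]

evaluated at (x_k, u_k). Taking the expectation over x_k given the information available at time k yields a stage cost of \(\sum_{x} b_k(x)\, c(x, u_k)\), which is linear in the belief b_k; the leading term H(X_0, Y_0) depends only on the prior and does not affect the choice of policy. A stage cost of this linear form is exactly what standard POMDP solvers accept.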