Identifying tractable decentralized control problems on the basis of information structure

2008 46th Annual Allerton Conference on Communication, Control, and Computing. Pub Date: 2008-09-01. DOI: 10.1109/ALLERTON.2008.4797732

Aditya Mahajan, A. Nayyar, D. Teneketzis


Citations: 46

Abstract



Identifying tractable decentralized control problems on the basis of information structure

Sequential decomposition of two general models of decentralized systems with non-classical information structures is presented. In model A, all agents have two observations at each step: a common observation that all agents observe and a private observation of their own. The control actions of each agent are based on all past common observations, the current private observation, and the contents of its memory. At each step, each agent also updates the contents of its memory. A cost function, which depends on the state of the plant and the control actions of all agents, is given. The objective is to choose control and memory update functions for all agents to minimize either a total expected cost over a finite horizon or a discounted cost over an infinite horizon. In model B, the agents do not have any common observation; the rest is the same as in model A. The key idea of our solution methodology is the following: from the point of view of a fictitious agent that observes all common observations, the system can be viewed as a centralized system with partial observations. This allows us to identify information states and obtain a sequential decomposition. When the system variables take values in finite sets, the optimality equations of the sequential decomposition are similar to those of partially observable Markov decision processes (POMDPs) with finite state and action spaces. For such systems, we can use algorithms for POMDPs to compute optimal designs for models A and B.
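The abstract states that, for finite-valued system variables, the optimality equations resemble those of a finite POMDP. As a rough illustration of the basic object such equations are built on, the sketch below implements the textbook Bayes-filter belief update for a finite POMDP. It is not the paper's information state or algorithm, which the abstract does not spell out. The function name `belief_update`, the array layouts `T` and `O`, and the toy numbers are all assumptions made for this example.

```python
from typing import List

Matrix = List[List[float]]


def belief_update(
    belief: List[float],
    T: List[Matrix],
    O: List[Matrix],
    action: int,
    obs: int,
) -> List[float]:
    """One step of the standard finite-POMDP Bayes filter (illustrative only).

    belief[x]    : P(state = x | history so far)
    T[a][x][x2]  : P(next state = x2 | state = x, action = a)
    O[a][x2][y]  : P(observation = y | next state = x2, action = a)
    """
    n = len(belief)
    # Prediction: push the current belief through the transition kernel.
    predicted = [
        sum(belief[x] * T[action][x][x2] for x in range(n)) for x2 in range(n)
    ]
    # Correction: weight each predicted state by the observation likelihood.
    unnormalized = [O[action][x2][obs] * predicted[x2] for x2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Hypothetical two-state, one-action, two-observation toy model.
T_toy = [[[0.9, 0.1], [0.2, 0.8]]]
O_toy = [[[0.8, 0.2], [0.3, 0.7]]]
posterior = belief_update([0.5, 0.5], T_toy, O_toy, action=0, obs=0)
```

Because the belief lives on a finite set, it is a finite-dimensional probability vector, which is what makes standard POMDP solvers applicable. How the plant state, agent memories, and common observations map onto `T`, `O`, and `obs` depends on the paper's construction and is not shown here.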
