Safe Sequential Optimization in Switching Environments

2021 National Conference on Communications (NCC) Pub Date: 2021-07-27 DOI: 10.1109/NCC52529.2021.9530041

Durgesh Kalwar, V. Sukumaran


Citations: 0

Abstract

We consider the problem of designing a sequential decision-making agent to maximize an unknown time-varying function that switches over time. At each step, the agent receives an observation of the function's value at a point chosen by the agent. The observation may be corrupted by noise. The agent is also constrained to make safe decisions with high probability, i.e., the chosen points should have a function value greater than a threshold. For this switching environment, we propose a policy called Adaptive-SafeOpt and evaluate its performance via simulations. The policy combines Bayesian optimization and change point detection for the safe sequential optimization problem. We observe that a major challenge in adapting to a switch is identifying safe decisions once the change point is detected while avoiding attraction to local optima.
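To make the ingredients named in the abstract concrete, the following is a minimal sketch of how a safety-constrained Bayesian optimization step and a change point check might fit together. It is not the authors' Adaptive-SafeOpt. It assumes a one-dimensional grid, a zero-mean Gaussian process with a unit-variance squared-exponential kernel, a simplified "safe upper confidence bound" selection rule, and a residual-based change test that restarts the model from the latest observation. All names and constants (`LENGTH_SCALE`, `NOISE_STD`, `beta`, `k`, `run_episode`) are hypothetical choices made for illustration.

```python
import numpy as np

LENGTH_SCALE = 0.5   # assumed kernel length scale
NOISE_STD = 0.05     # assumed observation-noise standard deviation

def rbf_kernel(a, b):
    """Unit-variance squared-exponential kernel between 1-D point sets."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / LENGTH_SCALE) ** 2)

def gp_posterior(x_obs, y_obs, x_query):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    K = rbf_kernel(x_obs, x_obs) + NOISE_STD**2 * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = np.clip(1.0 - np.sum(v**2, axis=0), 1e-12, None)
    return mean, np.sqrt(var)

def safe_ucb_step(x_obs, y_obs, x_grid, threshold, beta=2.0):
    """Among grid points whose lower confidence bound clears the threshold,
    return the one with the highest upper confidence bound (None if no
    point can be certified safe)."""
    mean, std = gp_posterior(x_obs, y_obs, x_grid)
    lower, upper = mean - beta * std, mean + beta * std
    safe_idx = np.flatnonzero(lower >= threshold)
    if safe_idx.size == 0:
        return None
    return float(x_grid[safe_idx[np.argmax(upper[safe_idx])]])

def change_detected(y, pred_mean, pred_std, k=4.0):
    """Flag a change when the observation lies more than k predictive
    standard deviations away from the GP prediction."""
    return abs(y - pred_mean) > k * np.sqrt(pred_std**2 + NOISE_STD**2)

def run_episode(f_before, f_after, switch_time, x_seed, threshold,
                steps=40, seed=0):
    """Run the sketch on a function that switches once at switch_time."""
    rng = np.random.default_rng(seed)
    x_grid = np.linspace(-1.0, 1.0, 41)
    f = f_before
    x_obs = np.array([x_seed])
    y_obs = np.array([f(x_seed) + NOISE_STD * rng.standard_normal()])
    history = []
    for t in range(steps):
        if t == switch_time:
            f = f_after  # the environment switches; the agent is not told
        x_next = safe_ucb_step(x_obs, y_obs, x_grid, threshold)
        if x_next is None:
            x_next = float(x_obs[-1])  # assumption: repeat the last decision
        mean, std = gp_posterior(x_obs, y_obs, np.array([x_next]))
        y = f(x_next) + NOISE_STD * rng.standard_normal()
        if change_detected(y, mean[0], std[0]):
            # Assumption: discard stale data and restart from this sample.
            x_obs, y_obs = np.array([x_next]), np.array([y])
        else:
            x_obs, y_obs = np.append(x_obs, x_next), np.append(y_obs, y)
        history.append((x_next, float(y)))
    return history
```

Calling, for example, `run_episode(lambda x: 1 - x**2, lambda x: 1 - (x - 0.5)**2, switch_time=20, x_seed=0.0, threshold=0.0)` returns the list of `(decision, observation)` pairs. The restart rule in this sketch illustrates the difficulty the abstract points to: immediately after a detected change the model holds a single observation, so very few points can be certified safe, and the search may stay near the restart point.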
