Safe Sequential Optimization in Switching Environments

2021 National Conference on Communications (NCC) Pub Date : 2021-07-27 DOI:10.1109/NCC52529.2021.9530041

Durgesh Kalwar, V. Sukumaran

Citations: 0

Abstract

We consider the problem of designing a sequential decision-making agent that maximizes an unknown function which switches over time. At each step, the agent chooses a point and receives an observation of the function's value there, possibly corrupted by noise. The agent is also constrained to take safe decisions with high probability, i.e., the function value at each chosen point should exceed a threshold. For this switching environment, we propose a policy called Adaptive-SafeOpt and evaluate its performance via simulations. The policy combines Bayesian optimization with change point detection to address the safe sequential optimization problem. We observe that a major challenge in adapting to a switch is to identify safe decisions once a change point is detected and to prevent attraction to local optima.
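The abstract does not spell out how Adaptive-SafeOpt combines its two ingredients. As a rough illustration only, the sketch below pairs a Gaussian-process lower confidence bound (the kind of safe-set rule used in SafeOpt-style methods) with a simple residual test standing in for change point detection. The kernel, the confidence multipliers `beta`, the noise level, the toy data, and all function names are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between 1-D point arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise_std=0.05):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    K = rbf_kernel(x_obs, x_obs) + noise_std**2 * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    prior_var = rbf_kernel(x_query, x_query).diagonal()
    var = prior_var - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def safe_set(mean, std, threshold, beta=2.0):
    """Mark points whose lower confidence bound clears the safety threshold."""
    return mean - beta * std >= threshold

def change_detected(y_new, mean_at_x, std_at_x, noise_std=0.05, beta=3.0):
    """Hypothetical detector: flag a switch when a new observation
    falls outside the GP's predictive band at the sampled point."""
    band = beta * np.sqrt(std_at_x**2 + noise_std**2)
    return abs(y_new - mean_at_x) > band

# Toy data (assumed): three safe observations near x = 0.5.
grid = np.linspace(0.0, 1.0, 101)
x_obs = np.array([0.4, 0.5, 0.6])
y_obs = np.array([1.0, 1.0, 1.0])
threshold = 0.5

mean, std = gp_posterior(x_obs, y_obs, grid)
safe = safe_set(mean, std, threshold)

# Among safe points, sample where the upper confidence bound is largest.
ucb = mean + 2.0 * std
x_next = grid[safe][np.argmax(ucb[safe])]
```

In this sketch, a flagged change would be the cue to discard stale observations and rebuild the safe set from points still believed to be safe. How that set is identified after a switch is the difficulty the abstract highlights, and the sketch does not attempt to resolve it.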


