Safe Pricing Mechanisms for Distributed Resource Allocation With Bandit Feedback

IEEE Transactions on Control of Network Systems · Impact factor: 5.0 · Zone 3 (Computer Science) · JCR Q2 (Automation & Control Systems) · Pub Date: 2024-03-01 · DOI: 10.1109/TCNS.2024.3372143

Spencer Hutchinson;Berkay Turan;Mahnoosh Alizadeh

IEEE Transactions on Control of Network Systems, vol. 11, no. 4, pp. 2010-2021. Publication type: Journal Article. Source: https://ieeexplore.ieee.org/document/10457043/

Citations: 0

Abstract



In societal-scale infrastructures, such as electric grids or transportation networks, pricing mechanisms are often used as a way to shape users' demand in order to lower operating costs and improve reliability. Existing approaches to pricing design for safety-critical networks often require that users are queried beforehand to negotiate prices, which has proven to be challenging to implement in the real world. To offer a more practical alternative, we develop learning-based pricing mechanisms that require no input from the users. These pricing mechanisms aim to maximize the utility of the users' consumption by gradually estimating the users' price response over a span of $T$ time steps (e.g., days) while ensuring that the infrastructure network's safety constraints that limit the users' demand are satisfied at all time steps. We propose two different algorithms for the two different scenarios when: the utility function is chosen by the central coordinator to achieve a social objective, and the utility function is defined by the price response under the assumption that the users are self-interested agents. We prove that both algorithms enjoy $\tilde{\mathcal{O}}(T^{2/3})$ regret with high probability. We then apply these algorithms to demand response pricing for the smart grid and numerically demonstrate their effectiveness.
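To make the setting in the abstract concrete, here is a minimal toy sketch in Python of a safe explore-then-commit pricing loop. It is not the authors' algorithm. Everything in it is an assumption made for illustration: a single user, a linear price response `a - b * p` with bounded noise, a price `p_safe` known in advance to be safe, a hand-picked safety `margin` (rather than one derived from a confidence bound), and an exploration length of about $T^{2/3}$, which is the standard explore-then-commit choice and is used here only to echo the rate quoted above.

```python
import random


def safe_explore_then_commit(T, a=10.0, b=2.0, d_max=4.0, p_safe=4.5,
                             margin=0.5, noise=0.1, seed=0):
    """Toy single-user pricing loop (assumes T >= 2). Returns (prices, demands).

    All parameters are hypothetical; a and b are hidden from the coordinator.
    """
    rng = random.Random(seed)

    def respond(p):
        # Hidden price response: the coordinator only observes the demand.
        return a - b * p + rng.uniform(-noise, noise)

    # Exploration length on the order of T^(2/3).
    T0 = max(2, round(T ** (2 / 3)))
    prices, demands = [], []

    # Explore with two prices at or above the known-safe price, so expected
    # demand never exceeds its level at p_safe (assumes demand falls as price rises).
    for t in range(T0):
        p = p_safe + (0.5 if t % 2 else 0.0)
        prices.append(p)
        demands.append(respond(p))

    # Ordinary least-squares fit of demand = a_hat - b_hat * price.
    mean_p = sum(prices) / T0
    mean_d = sum(demands) / T0
    sxx = sum((p - mean_p) ** 2 for p in prices)
    sxy = sum((p - mean_p) * (d - mean_d) for p, d in zip(prices, demands))
    b_hat = -sxy / sxx
    a_hat = mean_d + b_hat * mean_p

    # Commit to the price whose predicted demand sits `margin` below the limit.
    p_commit = (a_hat - (d_max - margin)) / b_hat
    for _ in range(T - T0):
        prices.append(p_commit)
        demands.append(respond(p_commit))
    return prices, demands


if __name__ == "__main__":
    prices, demands = safe_explore_then_commit(1000)
    print("committed price:", round(prices[-1], 3))
    print("max demand:", round(max(demands), 3), "limit: 4.0")
```

In this toy, a larger `margin` buys more protection against estimation error at the cost of pricing further from the constraint boundary, which is the tension between safety and regret that the abstract describes.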


Source journal

IEEE Transactions on Control of Network Systems (Mathematics: Control and Optimization)

CiteScore: 7.80

Self-citation rate: 7.10%

Articles published: 169

About the journal: The IEEE Transactions on Control of Network Systems is committed to the timely publication of high-impact papers at the intersection of control systems and network science. In particular, the journal addresses research on the analysis, design and implementation of networked control systems, as well as control over networks. Relevant work includes the full spectrum from basic research on control systems to the design of engineering solutions for automatic control of, and over, networks. The topics covered by this journal include: coordinated control and estimation over networks; control and computation over sensor networks; control under communication constraints; control and performance analysis issues that arise in the dynamics of networks used in application areas such as communications, computers, transportation, manufacturing, Web ranking and aggregation, social networks, biology, power systems, and economics; synchronization of activities across a controlled network; stability analysis of controlled networks; and analysis of networks as hybrid dynamical systems.