Adversarial Group Linear Bandits and Its Application to Collaborative Edge Inference

Yin-Hae Huang, Letian Zhang, J. Xu

IEEE INFOCOM 2023 - IEEE Conference on Computer Communications, 2023-05-17
DOI: 10.1109/INFOCOM53939.2023.10228900
Citations: 1

Abstract

The multi-armed bandit is a classical problem of sequential decision-making under uncertainty. Most existing work studies bandit problems in either the stochastic reward regime or the adversarial reward regime, while the intersection of the two has received much less attention. In this paper, we study a new bandit problem, called adversarial group linear bandits (AGLB), in which rewards are generated jointly by a stochastic process and by adversarial behavior. In particular, the reward that the learner receives is a noisy linear function of the arm the learner selects within a group, and it also depends on the adversary's group-level attack decision. Such problems arise in many real-world applications, e.g., collaborative edge inference and multi-site online ad placement. To handle the uncertainty in the coupled stochastic and adversarial rewards, we develop a new bandit algorithm, called EXPUCB, which combines the classical LinUCB and EXP3 algorithms, and we prove that it achieves sublinear regret. We apply EXPUCB to the collaborative edge inference problem and evaluate its performance. Extensive simulation results verify the superior learning ability of EXPUCB under coupled stochastic and adversarial rewards.
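The abstract does not spell out how EXPUCB couples its two components, so the following is only a minimal illustrative sketch of one way an EXP3 layer (choosing a group) could be combined with a per-group LinUCB learner (choosing an arm inside the group). It is not the authors' algorithm. The class names, the parameters `gamma` and `alpha`, the toy adversary, the rule that an attack zeroes the reward, and the choice to update the linear model only on unattacked rounds are all assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(M, v):
    return [dot(row, v) for row in M]


class LinUCB:
    """Ridge-regression UCB for one group; A^{-1} is kept via Sherman-Morrison."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha  # exploration width (assumed value)
        self.A_inv = [[float(i == j) for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def select(self, arms):
        theta = mat_vec(self.A_inv, self.b)  # current ridge estimate

        def ucb(x):
            bonus = math.sqrt(dot(x, mat_vec(self.A_inv, x)))
            return dot(theta, x) + self.alpha * bonus

        return max(range(len(arms)), key=lambda i: ucb(arms[i]))

    def update(self, x, reward):
        # (A + x x^T)^{-1} = A^{-1} - (A^{-1} x)(A^{-1} x)^T / (1 + x^T A^{-1} x)
        Ax = mat_vec(self.A_inv, x)
        denom = 1.0 + dot(x, Ax)
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


class Exp3:
    """EXP3 over groups; rewards are expected to lie in [0, 1]."""

    def __init__(self, n, gamma=0.1):
        self.gamma = gamma  # uniform-exploration rate (assumed value)
        self.weights = [1.0] * n

    def probs(self):
        total, n = sum(self.weights), len(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / n for w in self.weights]

    def sample(self, rng):
        p = self.probs()
        g = rng.choices(range(len(p)), weights=p)[0]
        return g, p[g]

    def update(self, g, p_g, reward):
        # Importance-weighted reward estimate for the chosen group only.
        self.weights[g] *= math.exp(self.gamma * (reward / p_g) / len(self.weights))
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]  # rescale to avoid overflow


def run(rounds=300, seed=0):
    """Toy simulation: two groups, two arms each, hypothetical parameters."""
    rng = random.Random(seed)
    arms = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
    theta = [[0.8, 0.2], [0.3, 0.6]]  # hidden per-group parameters (made up)
    exp3 = Exp3(len(arms))
    learners = [LinUCB(2) for _ in arms]
    total = 0.0
    for t in range(rounds):
        g, p_g = exp3.sample(rng)
        x = arms[g][learners[g].select(arms[g])]
        attacked = g == 0 and t % 2 == 0  # toy adversary: hits group 0 on even rounds
        noisy = dot(theta[g], x) + rng.gauss(0.0, 0.05)
        reward = 0.0 if attacked else min(1.0, max(0.0, noisy))
        exp3.update(g, p_g, reward)
        if not attacked:  # assumption: the learner can tell when an attack occurred
            learners[g].update(x, reward)
        total += reward
    return total


if __name__ == "__main__":
    print(run())
```

The split mirrors the structure described in the abstract: the adversarial, group-level part of the reward is handled by a method designed for adversarial feedback, while the stochastic linear part within a group is handled by an optimism-based estimator.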