Title: Adversarial Group Linear Bandits and Its Application to Collaborative Edge Inference
Authors: Yin-Hae Huang, Letian Zhang, J. Xu
Venue: IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Publication date: 2023-05-17
DOI: 10.1109/INFOCOM53939.2023.10228900 (https://doi.org/10.1109/INFOCOM53939.2023.10228900)
Citations: 1
Abstract
The multi-armed bandit problem is a classical model of sequential decision-making under uncertainty. Most existing work studies bandit problems in either the stochastic reward regime or the adversarial reward regime; the intersection of the two regimes is much less investigated. In this paper, we study a new bandit problem, called adversarial group linear bandits (AGLB), in which rewards are generated jointly by a stochastic process and adversarial behavior. In particular, the reward the learner receives is not only a noisy linear function of the arm the learner selects within a group but also depends on the adversary's group-level attack decision. Such problems arise in many real-world applications, e.g., collaborative edge inference and multi-site online ad placement. To combat the uncertainty in the coupled stochastic and adversarial rewards, we develop a new bandit algorithm, called EXPUCB, which marries the classical LinUCB and EXP3 algorithms, and we prove that it achieves sublinear regret. We apply EXPUCB to the collaborative edge inference problem and evaluate its performance. Extensive simulation results verify the superior learning ability of EXPUCB under coupled stochastic and adversarial rewards.
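To make the coupled reward structure concrete, below is a minimal, hypothetical sketch of the kind of two-level scheme the abstract describes: an EXP3 layer chooses among groups under adversarial group-level attacks, and a LinUCB layer chooses an arm within the selected group under noisy linear rewards. All parameter values, the toy adversary, and the specific update rules are illustrative assumptions, not the paper's actual EXPUCB algorithm.

```python
import numpy as np

def expucb_sketch(T=2000, n_groups=3, n_arms=5, d=4, seed=0):
    """Toy sketch (not the paper's exact algorithm): EXP3 over groups,
    LinUCB over arms within the chosen group. The reward is a noisy
    linear function of the arm's feature vector, zeroed out when the
    (assumed) adversary attacks the chosen group."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=d)
    theta /= np.linalg.norm(theta)               # unknown linear parameter
    X = rng.normal(size=(n_groups, n_arms, d))   # arm features per group

    # EXP3 state over groups (illustrative constants)
    gamma, eta = 0.1, 0.05
    w = np.ones(n_groups)

    # LinUCB state per group: A = I + sum x x^T, b = sum r x
    A = np.stack([np.eye(d) for _ in range(n_groups)])
    b = np.zeros((n_groups, d))
    alpha = 1.0
    total = 0.0

    for t in range(T):
        # EXP3: mixture of exponential weights and uniform exploration
        p = (1 - gamma) * w / w.sum() + gamma / n_groups
        g = rng.choice(n_groups, p=p)

        # LinUCB within group g: optimistic estimate per arm
        Ainv = np.linalg.inv(A[g])
        th_hat = Ainv @ b[g]
        bonus = np.sqrt(np.einsum('ad,de,ae->a', X[g], Ainv, X[g]))
        a = int(np.argmax(X[g] @ th_hat + alpha * bonus))
        x = X[g, a]

        # Toy adversary: attacks group 0 with probability 0.2 (assumption)
        attacked = rng.random() < 0.2 * (g == 0)
        r = 0.0 if attacked else float(x @ theta + 0.1 * rng.normal())

        # LinUCB update for the chosen group
        A[g] += np.outer(x, x)
        b[g] += r * x

        # EXP3 update with reward clipped to [0, 1] and importance weighting
        r01 = (np.clip(r, -1.0, 1.0) + 1.0) / 2.0
        w[g] *= np.exp(eta * r01 / p[g])
        w /= w.max()                              # rescale to avoid overflow
        total += r

    return total / T
```

The two layers operate on different feedback: the inner LinUCB estimate uses the realized (possibly attacked) reward, while the outer EXP3 weights absorb the adversarial group-level component, which is the division of labor the abstract attributes to combining LinUCB with EXP3.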