Bridging Rested and Restless Bandits with Graph-Triggering: Rising and Rotting

Gianmarco Genalti, Marco Mussi, Nicola Gatti, Marcello Restelli, Matteo Castiglioni, Alberto Maria Metelli

arXiv:2409.05980 [stat.ML], September 9, 2024
Abstract
Rested and restless bandits are two well-known bandit settings, useful for modeling real-world sequential decision-making problems in which the expected reward of an arm evolves over time, either due to the actions we perform or due to the nature of the environment. In this work, we propose Graph-Triggered Bandits (GTBs), a unifying framework that generalizes and extends rested and restless bandits. In this setting, the evolution of the arms' expected rewards is governed by a graph defined over the arms: an edge connecting a pair of arms $(i,j)$ represents the fact that a pull of arm $i$ triggers the evolution of arm $j$, and vice versa. Interestingly, rested and restless bandits are both special cases of our model for suitable (degenerate) graphs. As relevant case studies for this setting, we focus on two specific types of monotonic bandits: rising, where the expected reward of an arm grows as the number of triggers increases, and rotting, where the opposite behavior occurs. For these cases, we study the optimal policies. We provide suitable algorithms for all scenarios and discuss their theoretical guarantees, highlighting the complexity of the learning problem with respect to instance-dependent terms that encode specific properties of the underlying graph structure.
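To make the triggering mechanism concrete, here is a minimal Python sketch of a GTB environment (our own illustration; the class and all names are hypothetical, not the paper's code). It shows how the two degenerate graphs mentioned in the abstract recover the classical settings: a self-loop-only graph gives rested dynamics, a complete graph gives restless dynamics.

```python
import numpy as np

class GraphTriggeredBandit:
    """Toy Graph-Triggered Bandit environment (illustrative only).

    trigger[i, j] = True means that pulling arm i triggers the
    evolution (i.e., advances the trigger count) of arm j.
    """

    def __init__(self, reward_fns, trigger, noise_std=0.1):
        # reward_fns[j](n): expected reward of arm j after n triggers;
        # non-decreasing for a rising arm, non-increasing for a rotting one.
        self.reward_fns = reward_fns
        self.trigger = np.asarray(trigger, dtype=bool)
        self.counts = np.zeros(len(reward_fns), dtype=int)
        self.noise_std = noise_std

    def pull(self, i, rng):
        # Observe a noisy reward at arm i's current trigger count,
        # then let the pull trigger every neighbor of i (possibly i itself).
        reward = self.reward_fns[i](self.counts[i]) + rng.normal(0.0, self.noise_std)
        self.counts[self.trigger[i]] += 1
        return reward

K = 3
rested = np.eye(K, dtype=bool)          # self-loops only: an arm evolves only when pulled
restless = np.ones((K, K), dtype=bool)  # complete graph: every pull triggers every arm

# Rising arms: expected reward increases with the trigger count, saturating at 1.
rising = [lambda n, c=c: 1.0 - (1.0 - 0.2 * c) * 0.9 ** n for c in range(K)]

rng = np.random.default_rng(0)
env = GraphTriggeredBandit(rising, rested)
print([round(env.pull(0, rng), 3) for _ in range(5)])  # arm 0's mean rises with each pull
```

Under the rested matrix, a pull of arm $i$ advances only arm $i$'s own count; swapping in the restless matrix makes every pull advance all arms' counts. Intermediate graphs interpolate between the two extremes, which is the regime the paper studies.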