Online learning in decentralized multi-user spectrum access with synchronized explorations
Authors: Cem Tekin, M. Liu
Published in: MILCOM 2012 - 2012 IEEE Military Communications Conference, pp. 1-6 (2012-10-01)
DOI: 10.1109/MILCOM.2012.6415693 (https://doi.org/10.1109/MILCOM.2012.6415693)
Citations: 19
Abstract
In this paper, we consider decentralized multi-user online learning of unused spectrum bands, formulated as an opportunistic spectrum access (OSA) problem. A set of M secondary users exploits spectrum opportunities across K channels. We develop a distributed algorithm that lets the secondary users learn the optimal allocation with logarithmic regret. Since logarithmic regret is the best achievable order, the algorithm attains the fastest possible convergence rate to the optimal allocation. More generally, our algorithm gives an order-optimal solution to the decentralized multi-player multi-armed bandit problem with general reward functions.
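
To make "optimal allocation" and "regret" concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes known mean channel availabilities (`means`) and a hypothetical per-channel reward factor `share(n)` that depends on how many users occupy the channel. All function names and the collision model in the usage note are illustrative assumptions. The sketch finds the best assignment of M users to K channels by brute force and measures the cumulative regret of a sequence of allocations against it.

```python
from collections import Counter
from itertools import product


def allocation_value(allocation, means, share):
    """Expected total reward when user i occupies channel allocation[i].

    `share(n)` is an assumed reward factor for a channel used by n users;
    it stands in for a "general reward function" and is not from the paper.
    """
    counts = Counter(allocation)
    return sum(means[k] * share(n) for k, n in counts.items())


def optimal_allocation(means, num_users, share):
    """Brute-force search over all K**M assignments of users to channels."""
    channels = range(len(means))
    return max(
        product(channels, repeat=num_users),
        key=lambda a: allocation_value(a, means, share),
    )


def cumulative_regret(history, means, share):
    """Sum over time of (optimal value - value of the allocation played)."""
    best = optimal_allocation(means, len(history[0]), share)
    best_value = allocation_value(best, means, share)
    return sum(best_value - allocation_value(a, means, share) for a in history)


def collision(n):
    """Assumed collision model: a channel pays off only with exactly one user."""
    return 1.0 if n == 1 else 0.0
```

As a usage example under these assumptions, with `means = [0.9, 0.5, 0.2]` and two users, `optimal_allocation(means, 2, collision)` places the users on the two best channels, and `cumulative_regret([(0, 0), (0, 2), (0, 1)], means, collision)` adds up the loss from the collision in the first round and the suboptimal channel in the second. A learning algorithm with logarithmic regret keeps this sum growing only on the order of log T over T rounds, even though it must estimate `means` from observations.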