Online learning in decentralized multi-user spectrum access with synchronized explorations
Authors: Cem Tekin, M. Liu
Venue: MILCOM 2012 - 2012 IEEE Military Communications Conference, pp. 1-6
Published: 2012-10-01
DOI: 10.1109/MILCOM.2012.6415693 (https://doi.org/10.1109/MILCOM.2012.6415693)
Citations (Semantic Scholar): 19
This paper treats decentralized multi-user online learning of unused spectrum bands as an opportunistic spectrum access (OSA) problem. A set of M secondary users exploits the spectrum opportunities in K channels. We develop a distributed algorithm with which the secondary users learn the optimal allocation with logarithmic regret. Because logarithmic regret is order optimal, the algorithm converges to the optimal allocation at the fastest possible rate. In a more general framework, the algorithm gives an order-optimal solution to the decentralized multi-player multi-armed bandit problem with general reward functions.
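To make "regret" concrete, the following minimal sketch measures how much total reward a sequence of allocations loses relative to always playing the optimal one. It is not the algorithm from the paper. The collision model (a channel used by more than one user yields nothing), the known mean rewards `theta`, and the brute-force search are all illustrative assumptions, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def allocation_value(allocation, theta):
    """Expected total reward of an allocation (one channel index per user).

    Assumed reward model: a channel contributes its mean reward theta[k]
    only if exactly one user selects it; collisions yield zero.
    """
    counts = Counter(allocation)
    return sum(theta[k] for k, n in counts.items() if n == 1)


def optimal_allocation(theta, num_users):
    """Brute-force search over all K^M allocations (only viable for small K, M)."""
    channels = range(len(theta))
    return max(
        product(channels, repeat=num_users),
        key=lambda a: allocation_value(a, theta),
    )


def cumulative_regret(allocations, theta, num_users):
    """Sum over time of (optimal value - value of the allocation actually played)."""
    best = allocation_value(optimal_allocation(theta, num_users), theta)
    return sum(best - allocation_value(a, theta) for a in allocations)


if __name__ == "__main__":
    theta = [0.9, 0.5, 0.2]              # hypothetical mean rewards for K = 3 channels
    history = [(0, 0), (0, 2), (0, 1)]   # allocations played by M = 2 users over 3 slots
    print(optimal_allocation(theta, 2))
    print(cumulative_regret(history, theta, 2))
```

In these terms, logarithmic regret means that `cumulative_regret` over the first T time slots grows on the order of log T, so the average per-slot loss vanishes as T increases.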