A Fast Algorithm for the Real-Valued Combinatorial Pure Exploration of the Multi-Armed Bandit

Neural Computation, pp. 294-310 | IF 2.7 | Zone 4 (Computer Science) | JCR Q3 (Computer Science, Artificial Intelligence) | Pub Date: 2025-01-21 | DOI: 10.1162/neco_a_01728

Shintaro Nakamura, Masashi Sugiyama


Citations: 0

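Abstract

We study the real-valued combinatorial pure exploration problem in the stochastic multi-armed bandit (R-CPE-MAB), focusing on the case where the size of the action set is polynomial in the number of arms. In this case, R-CPE-MAB can be seen as a special case of the so-called transductive linear bandits. We introduce the combinatorial gap-based exploration (CombGapE) algorithm, whose sample complexity upper bound matches the lower bound up to a problem-dependent constant factor. Numerical experiments show that CombGapE significantly outperforms existing methods on both synthetic and real-world data sets.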
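To make the setting concrete, here is a minimal sketch of a gap-driven sampling loop on a toy instance. It assumes the usual formulation of the problem: d arms with unknown means mu, a small finite action set of real-valued vectors, and the goal of identifying the action maximizing its inner product with mu. Pulling arm i then plays the role of a measurement along the i-th standard basis vector, which is the link to transductive linear bandits. This is not the CombGapE algorithm from the paper: the function name `gap_based_sketch`, the fixed budget, the Gaussian noise model, and the greedy variance-reduction rule are all assumptions made for illustration.

```python
import numpy as np


def gap_based_sketch(mu, actions, budget=3000, noise_std=1.0, seed=0):
    """Illustrative fixed-budget, gap-driven sampler (NOT the paper's CombGapE).

    mu      : true arm means, used only to simulate noisy rewards (assumption)
    actions : array of shape (K, d), one real-valued action vector per row (K >= 2)
    Returns the index of the empirically best action and the per-arm pull counts.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    actions = np.asarray(actions, dtype=float)
    d = mu.size

    # Initialisation: pull every arm once.
    counts = np.ones(d)
    sums = mu + noise_std * rng.standard_normal(d)

    for _ in range(budget - d):
        mu_hat = sums / counts
        values = actions @ mu_hat
        order = np.argsort(values)
        best, challenger = order[-1], order[-2]

        # Direction separating the empirical best action from its closest rival.
        y = actions[best] - actions[challenger]

        # Var(y^T mu_hat) is proportional to sum_i y_i^2 / counts_i, so one more
        # pull of arm i reduces it by y_i^2 / (counts_i * (counts_i + 1)).
        gain = y**2 / (counts * (counts + 1))
        arm = int(np.argmax(gain))

        sums[arm] += mu[arm] + noise_std * rng.standard_normal()
        counts[arm] += 1

    mu_hat = sums / counts
    return int(np.argmax(actions @ mu_hat)), counts


if __name__ == "__main__":
    toy_mu = [1.0, 0.5, 0.2]
    toy_actions = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 0.3, 0.7]]
    best, counts = gap_based_sketch(toy_mu, toy_actions)
    print("estimated best action:", best, "pulls per arm:", counts)
```

In this toy instance the true action values are 1.0, 0.75, and 0.29, so action 0 is optimal. The sketch spends most of its budget on the arms that distinguish the two leading actions, which is the intuition behind gap-based exploration; it carries none of the sample-complexity guarantees stated in the abstract.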




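Source journal: Neural Computation (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore: 6.30

Self-citation rate: 3.40%

Articles published:

Review time: 3.0 months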

About the journal: Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.