{"title":"Batched Online Contextual Sparse Bandits with Sequential Inclusion of Features","authors":"Rowan Swiers, Subash Prabanantham, Andrew Maher","doi":"arxiv-2409.09199","DOIUrl":null,"url":null,"abstract":"Multi-armed Bandits (MABs) are increasingly employed in online platforms and\ne-commerce to optimize decision making for personalized user experiences. In\nthis work, we focus on the Contextual Bandit problem with linear rewards, under\nconditions of sparsity and batched data. We address the challenge of fairness\nby excluding irrelevant features from decision-making processes using a novel\nalgorithm, Online Batched Sequential Inclusion (OBSI), which sequentially\nincludes features as confidence in their impact on the reward increases. Our\nexperiments on synthetic data show the superior performance of OBSI compared to\nother algorithms in terms of regret, relevance of features used, and compute.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"118 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.09199","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Multi-armed Bandits (MABs) are increasingly employed in online platforms and e-commerce to optimize decision making for personalized user experiences. In this work, we focus on the Contextual Bandit problem with linear rewards, under conditions of sparsity and batched data. We address the challenge of fairness by excluding irrelevant features from decision-making processes using a novel algorithm, Online Batched Sequential Inclusion (OBSI), which sequentially includes features as confidence in their impact on the reward increases. Our experiments on synthetic data show the superior performance of OBSI compared to other algorithms in terms of regret, relevance of features used, and compute.
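
The abstract's one-line description of OBSI, including a feature only once there is sufficient confidence in its effect on the reward, can be made concrete with a small sketch. The code below is an illustrative reconstruction under assumed design choices, not the authors' implementation: the batch-boundary update, ridge estimator, greedy arm selection, the z_threshold inclusion rule, and names such as run_sequential_inclusion_bandit and rewards_fn are all assumptions for exposition. The property it mirrors is that decisions only ever use features whose coefficients have been confidently separated from zero, which is what keeps irrelevant features out of the policy.

```python
# Minimal sketch of a batched contextual linear bandit with sequential feature
# inclusion. Illustrative only -- not the authors' OBSI implementation; the
# batch size, ridge penalty, and confidence rule below are assumptions.
import numpy as np


def run_sequential_inclusion_bandit(
    contexts,          # array of shape (T, n_arms, d): per-round, per-arm features
    rewards_fn,        # callable(t, arm) -> observed reward (simulator, assumed)
    batch_size=100,    # hypothetical batch size
    ridge_lambda=1.0,  # hypothetical ridge regularization strength
    z_threshold=2.0,   # include a feature once |coef| > z_threshold * its std. error
):
    T, n_arms, d = contexts.shape
    active = np.zeros(d, dtype=bool)   # features currently allowed to influence decisions
    theta = np.zeros(d)                # current coefficient estimate
    X_hist, r_hist = [], []            # data observed so far
    chosen_rewards = np.zeros(T)

    for t in range(T):
        if not active.any():
            arm = np.random.randint(n_arms)   # no trusted features yet: explore uniformly
        else:
            scores = contexts[t][:, active] @ theta[active]
            arm = int(np.argmax(scores))      # greedy on the active-feature model
        x, r = contexts[t, arm], rewards_fn(t, arm)
        X_hist.append(x)
        r_hist.append(r)
        chosen_rewards[t] = r

        # Update the model and the active set only at batch boundaries.
        if (t + 1) % batch_size == 0:
            X = np.asarray(X_hist)
            y = np.asarray(r_hist)
            A = X.T @ X + ridge_lambda * np.eye(d)
            A_inv = np.linalg.inv(A)
            theta = A_inv @ X.T @ y
            # Approximate standard errors from the ridge covariance; a feature is
            # included once its coefficient is confidently nonzero, and stays included.
            resid_var = np.var(y - X @ theta) + 1e-12
            std_err = np.sqrt(resid_var * np.diag(A_inv))
            active |= np.abs(theta) > z_threshold * std_err

    return chosen_rewards, active
```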