Title: Minimax strategies for Bernoulli two-armed bandit on a moderate control horizon
Authors: A. Kolnogorov, Denis Grunev
Journal: Communications in Statistics Case Studies Data Analysis and Applications, volume 24 1, pages 536-544
Publication date: 2021-10-02
Publication type: Journal Article
DOI: 10.1080/23737484.2021.1986170 (https://doi.org/10.1080/23737484.2021.1986170)
Citations: 0
Abstract
We consider a Bernoulli two-armed bandit problem on a moderate control horizon, applied to optimizing the processing of moderate amounts of data when two processing methods are available with different, a priori unknown efficiencies. One has to determine the more effective method and ensure its predominant use. In contrast to big data processing, for which several approaches have been developed, including batch processing, the optimization of moderate data processing is currently not well understood. We take the minimax approach and search for the minimax strategy and minimax risk as the Bayesian ones corresponding to the worst-case prior distribution, for which the Bayesian risk attains its maximal value. A prior distribution close to the worst-case one and the corresponding Bayesian risk are obtained by numerical methods. Calculations show that the resulting strategy yields a maximal regret close to the computed Bayesian risk and hence is approximately minimax. The results can be applied to big data processing if the data arrive in batches of moderate size with approximately uniform properties.
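
To make the "Bayesian risk for a given prior" step concrete, here is a minimal sketch of how such a risk could be computed by backward induction over success/failure counts. It is an illustration under stated assumptions, not the authors' procedure: the prior is assumed to sit on finitely many parameter pairs `(p1, p2)`, the function names `bayes_risk` and `worst_symmetric_gap` are hypothetical, and the search over symmetric two-point priors is only a crude stand-in for the numerical search for a near-worst-case prior mentioned above.

```python
from functools import lru_cache


def bayes_risk(prior, horizon):
    """Minimal Bayesian expected regret over `horizon` pulls.

    prior: sequence of (weight, p1, p2) with weights summing to 1
           (a finite-support prior; an assumption of this sketch).
    """
    prior = tuple(prior)

    @lru_cache(maxsize=None)
    def risk(s1, f1, s2, f2):
        # s_i / f_i: successes / failures observed so far on arm i.
        if s1 + f1 + s2 + f2 == horizon:
            return 0.0
        # Unnormalized posterior weights of each parameter pair.
        post = [w * p1**s1 * (1 - p1)**f1 * p2**s2 * (1 - p2)**f2
                for w, p1, p2 in prior]
        # Expected one-step regret of pulling arm 1 or arm 2.
        loss1 = sum(u * (max(p1, p2) - p1)
                    for u, (_, p1, p2) in zip(post, prior))
        loss2 = sum(u * (max(p1, p2) - p2)
                    for u, (_, p1, p2) in zip(post, prior))
        # The unnormalized weights already carry the success/failure
        # probabilities, so the continuation risks are simply added.
        r1 = loss1 + risk(s1 + 1, f1, s2, f2) + risk(s1, f1 + 1, s2, f2)
        r2 = loss2 + risk(s1, f1, s2 + 1, f2) + risk(s1, f1, s2, f2 + 1)
        return min(r1, r2)

    return risk(0, 0, 0, 0)


def worst_symmetric_gap(horizon, gaps):
    """Among priors putting mass 1/2 on (0.5+d, 0.5-d) and 1/2 on
    (0.5-d, 0.5+d), return the (d, risk) pair with the largest risk.
    This prior family is an illustrative assumption."""
    best_gap, best_risk = None, -1.0
    for d in gaps:
        prior = [(0.5, 0.5 + d, 0.5 - d), (0.5, 0.5 - d, 0.5 + d)]
        r = bayes_risk(prior, horizon)
        if r > best_risk:
            best_gap, best_risk = d, r
    return best_gap, best_risk
```

For example, `worst_symmetric_gap(20, [0.05 * k for k in range(1, 10)])` scans nine candidate gaps for a horizon of 20 pulls. The state space grows roughly with the fourth power of the horizon, which is one reason a direct computation of this kind is only practical for moderate horizons.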