Tractable sampling strategies for quantile-based ordinal optimization
Dongwook Shin, M. Broadie, A. Zeevi. 2016 Winter Simulation Conference (WSC), 2016-12-11. DOI: 10.1109/WSC.2016.7822147
This paper describes and analyzes the problem of selecting the best of several alternatives (“systems”) when the systems are compared on quantiles of their performance. The quantiles cannot be evaluated analytically, but each system can be sampled sequentially. The objective is to dynamically allocate a finite sampling budget so as to minimize the probability of falsely selecting a non-best system. To formulate this problem in a tractable form, we use large deviations theory to introduce an objective associated with the probability of false selection, and leverage it to design well-performing dynamic sampling policies. We first propose a naive policy that optimizes this objective when the sampling budget is sufficiently large. We then introduce two variants of the naive policy aimed at improving finite-time performance; in some cases these variants retain the asymptotic performance of the naive policy while dramatically improving its finite-time performance.
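To make the problem setting concrete, the following is a minimal Python sketch of quantile-based selection under a fixed sampling budget. It is not one of the policies proposed in the paper: it uses a static equal allocation as a baseline, assumes that "best" means the largest p-quantile, and estimates the probability of false selection by repeated simulation. The function names, the normal test systems, and the budget values are all hypothetical choices made for illustration.

```python
import math
import random


def sample_quantile(samples, p):
    """Empirical p-quantile: the ceil(p * n)-th order statistic."""
    ordered = sorted(samples)
    k = max(1, math.ceil(p * len(ordered)))
    return ordered[k - 1]


def select_best(systems, budget, p, rng):
    """Equal-allocation baseline (an assumption, not the paper's policy).

    Splits the budget evenly across systems and returns the index of the
    system with the largest sample p-quantile.
    """
    per_system = budget // len(systems)
    estimates = []
    for draw in systems:
        samples = [draw(rng) for _ in range(per_system)]
        estimates.append(sample_quantile(samples, p))
    return max(range(len(systems)), key=lambda i: estimates[i])


def false_selection_rate(systems, best, budget, p, trials, seed=0):
    """Monte Carlo estimate of the probability of false selection."""
    rng = random.Random(seed)
    wrong = sum(
        select_best(systems, budget, p, rng) != best for _ in range(trials)
    )
    return wrong / trials


# Hypothetical systems: unit-variance normals whose medians are 0, 0.5, 1.
SYSTEMS = [
    lambda rng: rng.gauss(0.0, 1.0),
    lambda rng: rng.gauss(0.5, 1.0),
    lambda rng: rng.gauss(1.0, 1.0),
]

if __name__ == "__main__":
    for budget in (30, 600):
        rate = false_selection_rate(SYSTEMS, best=2, budget=budget, p=0.5, trials=200)
        print(f"budget={budget}: estimated false-selection rate = {rate:.3f}")
```

Under these assumptions the estimated false-selection rate falls as the budget grows. A dynamic policy of the kind the abstract describes would instead decide, sample by sample, which system to draw from next.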