A regret lower bound for assortment optimization under the capacitated MNL model with arbitrary revenue parameters
Authors: Yannik Peeters, Arnoud V. den Boer
Journal: Probability in the Engineering and Informational Sciences, volume 30 1, pages 1266-1274
Publication date: 2021-09-01
Publication type: Journal Article
DOI: 10.1017/S0269964821000395 (https://doi.org/10.1017/S0269964821000395)
Impact factor: 0.7; JCR: Q4 (Engineering, Industrial)
Abstract: In this note, we consider dynamic assortment optimization with incomplete information under the capacitated multinomial logit (MNL) choice model. It was recently shown that the regret (the cumulative expected revenue loss caused by offering suboptimal assortments) of any decision policy is bounded from below by a constant times $\sqrt{NT}$, where $N$ denotes the number of products and $T$ the time horizon. That result was derived under the assumption that the product revenues are constant, which leaves open the question of whether a lower regret rate can be achieved for nonconstant revenue parameters. We show that this is not the case: for any vector of product revenues, there is a positive constant such that the regret of any policy is bounded from below by this constant times $\sqrt{NT}$. Consequently, policies that achieve $\mathcal{O}(\sqrt{NT})$ regret are asymptotically optimal for all product revenue parameters.
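To make the regret notion in the abstract concrete, here is a minimal Python sketch. It is not the authors' construction. It assumes the usual MNL parameterization in which product $i$ has a preference weight `v[i]` and the no-purchase option has weight 1, and it finds the best assortment of at most `K` products by brute force, which is only feasible for small $N$. The names `expected_revenue`, `best_assortment`, `regret`, `v`, `r`, and `K` are illustrative and do not come from the paper.

```python
from itertools import combinations


def expected_revenue(assortment, v, r):
    """Expected revenue of one customer under MNL (assumed no-purchase weight 1)."""
    denom = 1.0 + sum(v[i] for i in assortment)
    return sum(r[i] * v[i] for i in assortment) / denom


def best_assortment(v, r, K):
    """Brute-force search over all nonempty assortments with at most K products."""
    n = len(v)
    candidates = (
        S for k in range(1, K + 1) for S in combinations(range(n), k)
    )
    return max(candidates, key=lambda S: expected_revenue(S, v, r))


def regret(offered, v, r, K):
    """Cumulative expected revenue loss of a sequence of offered assortments."""
    opt = expected_revenue(best_assortment(v, r, K), v, r)
    return sum(opt - expected_revenue(S, v, r) for S in offered)
```

With hypothetical values `v = [1.0, 2.0, 1.0]`, `r = [1.0, 0.8, 0.2]`, and `K = 2`, the optimal assortment is `(0, 1)`. A policy that offers `(0,)`, then `(0, 1)`, then `(2,)` accumulates regret only in the first and third periods. The lower bound in the abstract says that no policy, learning `v` from observed purchases, can keep this cumulative quantity below a constant times $\sqrt{NT}$ over all instances.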
About the journal:
The primary focus of the journal is on stochastic modelling in the physical and engineering sciences, with particular emphasis on queueing theory, reliability theory, inventory theory, simulation, mathematical finance and probabilistic networks and graphs. Papers on analytic properties and related disciplines are also considered, as well as more general papers on applied and computational probability, if appropriate. Readers include academics working in statistics, operations research, computer science, engineering, management science and physical sciences as well as industrial practitioners engaged in telecommunications, computer science, financial engineering, operations research and management science.