Stochastic approximation for uncapacitated assortment optimization under the multinomial logit model
Yannik Peeters, Arnoud V. den Boer
Naval Research Logistics (NRL), pp. 927-938, published 2022-05-23
DOI: 10.1002/nav.22068 (https://doi.org/10.1002/nav.22068)
Citations: 1
Abstract
We consider dynamic assortment optimization with incomplete information under the uncapacitated multinomial logit choice model. We propose an anytime stochastic approximation policy and prove that the regret (the cumulative expected revenue loss caused by offering suboptimal assortments) after $T$ time periods is bounded by $\sqrt{T}$ times a constant that is independent of the number of products. In addition, we prove a matching lower bound on the regret for any policy that is valid for arbitrary model parameters, slightly generalizing a recent regret lower bound derived for specific revenue parameters. Numerical illustrations suggest that our policy outperforms alternatives by a significant margin when $T$ and the number of products $N$ are not too small.
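To make the regret notion in the abstract concrete, the following is a minimal Python sketch of expected revenue under a multinomial logit model and of the cumulative regret of a sequence of offered assortments. It is not the authors' stochastic approximation policy, which the abstract does not spell out. The function names, the normalization of the no-purchase weight to 1, and the brute-force search for the best assortment are all assumptions made for illustration.

```python
from itertools import combinations


def mnl_expected_revenue(assortment, revenues, preferences):
    """Expected one-period revenue of an assortment under an MNL model.

    Assumption: product i in the assortment is bought with probability
    v_i / (1 + sum of v_j over the assortment), where the 1 is the
    no-purchase weight.
    """
    denom = 1.0 + sum(preferences[i] for i in assortment)
    return sum(revenues[i] * preferences[i] for i in assortment) / denom


def best_assortment(revenues, preferences):
    """Brute-force search over all non-empty subsets (small N only)."""
    n = len(revenues)
    best, best_rev = (), 0.0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            rev = mnl_expected_revenue(subset, revenues, preferences)
            if rev > best_rev:
                best, best_rev = subset, rev
    return best, best_rev


def cumulative_regret(offered, revenues, preferences):
    """Sum over periods of the expected revenue lost versus the best assortment."""
    _, optimal_rev = best_assortment(revenues, preferences)
    return sum(
        optimal_rev - mnl_expected_revenue(s, revenues, preferences)
        for s in offered
    )


if __name__ == "__main__":
    # Hypothetical two-product instance: revenues w and preference weights v.
    w = [1.0, 0.4]
    v = [1.0, 1.0]
    print(best_assortment(w, v))
    print(cumulative_regret([(0, 1), (0,)], w, v))
```

In this hypothetical instance, offering both products in the first period loses some expected revenue relative to offering only product 0, and that loss is the only contribution to the cumulative regret. A learning policy would not know `v` in advance, which is what makes the problem one of incomplete information.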