Multi-Modal Dynamic Pricing

Yining Wang, Boxiao Chen, D. Simchi-Levi
DOI: 10.2139/ssrn.3489355 (https://doi.org/10.2139/ssrn.3489355)
Journal: Other Topics Engineering Research eJournal
Publication date: 2019-11-18
Citations: 6

Abstract

We consider a stylized question of dynamic pricing of a single product with demand learning. The candidate prices belong to a wide price interval, and the demand functions are modeled nonparametrically, with only smoothness regularity conditions imposed. An important aspect of our model is that the expected reward function may be non-convex and indeed multi-modal, which leads to many conceptual and technical challenges. Our proposed algorithm is inspired by both the Upper-Confidence-Bound (UCB) algorithm for multi-armed bandits and the Optimism-in-Face-of-Uncertainty (OFU) principle arising from linear contextual bandits. Through rigorous regret analysis, we demonstrate that our proposed algorithm achieves optimal worst-case regret over a wide range of smooth function classes. More specifically, for k-times smooth functions and T selling periods, the regret of our proposed algorithm is O(T^{(k+1)/(2k+1)}), which is shown to be optimal via information-theoretic lower bounds. We also show that in special cases such as strongly concave or infinitely smooth reward functions, our algorithm achieves an O(\sqrt{T}) regret, matching the optimal regret established in previous works. Finally, we present numerical simulations that verify the effectiveness of our method.
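To make the setting concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It runs plain UCB1 over a fixed grid of prices, a simpler stand-in for the UCB/OFU-inspired method the abstract describes, on a hypothetical bimodal reward curve. The names `expected_reward`, `ucb1_grid_pricing`, and `regret_exponent`, the reward curve itself, the Bernoulli sales model, and the grid size are all assumptions made for illustration. The reported regret is measured against the best price on the grid, not the continuous optimum.

```python
import math
import random
from fractions import Fraction


def regret_exponent(k: int) -> Fraction:
    """Exponent (k+1)/(2k+1) of T in the abstract's regret bound."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return Fraction(k + 1, 2 * k + 1)


def expected_reward(price: float) -> float:
    """Hypothetical bimodal expected reward on [0, 1] (illustrative only).

    Two bumps: a lower one near price 0.25 and a higher one near 0.75.
    """
    low_peak = 0.5 * math.exp(-((price - 0.25) ** 2) / 0.01)
    high_peak = 0.8 * math.exp(-((price - 0.75) ** 2) / 0.01)
    return low_peak + high_peak


def ucb1_grid_pricing(num_periods: int, num_prices: int, seed: int = 0):
    """Run UCB1 on a uniform price grid; return (prices, counts, regret).

    Assumption: the observed reward in each period is Bernoulli with
    mean expected_reward(price). Regret is the cumulative gap to the
    best grid price, not to the best price in the whole interval.
    """
    rng = random.Random(seed)
    prices = [(i + 0.5) / num_prices for i in range(num_prices)]
    counts = [0] * num_prices
    reward_sums = [0.0] * num_prices
    best_mean = max(expected_reward(p) for p in prices)
    regret = 0.0

    for t in range(1, num_periods + 1):
        if t <= num_prices:
            arm = t - 1  # try every grid price once before using the index
        else:
            # Optimistic index: empirical mean plus a confidence bonus.
            arm = max(
                range(num_prices),
                key=lambda i: reward_sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        mean = expected_reward(prices[arm])
        reward = 1.0 if rng.random() < mean else 0.0
        counts[arm] += 1
        reward_sums[arm] += reward
        regret += best_mean - mean

    return prices, counts, regret


if __name__ == "__main__":
    for k in (1, 2, 5):
        print(f"k={k}: regret exponent {regret_exponent(k)}")
    prices, counts, regret = ucb1_grid_pricing(num_periods=5000, num_prices=10)
    most_played = prices[counts.index(max(counts))]
    print(f"most-played price: {most_played:.2f}, grid regret: {regret:.1f}")
```

The exponent helper shows how the stated bound behaves: it is 2/3 for k = 1, 3/5 for k = 2, and it decreases toward 1/2 as k grows, which is consistent with the O(\sqrt{T}) rate the abstract reports for infinitely smooth reward functions. A fixed grid is used here only because it is easy to reason about; it does not exploit smoothness the way the abstract's method is said to.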