We consider the inventory control problem of a multiwarehouse, multistore system over a time horizon when the warehouses receive no external replenishment. This problem is prevalent in retail settings, and it is referred to in the work of [Jackson PL (1988) Stock allocation in a two-echelon distribution system or “what to do until your ship comes in.” Management Sci. 34(7):880–895] as the problem of “what to do until your (external) shipment comes in.” The warehouses are stocked with initial inventories, and the stores are dynamically replenished from the warehouses in each period of the planning horizon. Excess demand in each period at a store is lost. The optimal policy for this problem is complex and state dependent, and because of the curse of dimensionality, computing the optimal policy using standard dynamic programming is numerically intractable. Static Lagrangian base-stock (LaBS) policies have been developed for this problem [Miao S, Jasin S, Chao X (2022) Asymptotically optimal Lagrangian policies for one-warehouse multi-store system with lost sales. Oper. Res. 70(1):141–159] and shown to be asymptotically optimal. In this paper, we develop adaptive policies that dynamically adjust the control parameters of a vanilla static LaBS policy using realized historical demands. We show, both theoretically and numerically, that adaptive policies significantly improve the performance of the LaBS policy, with the magnitude of improvement characterized by the number of policy adjustments. In particular, when the number of adjustments is logarithmic in the length of the time horizon, the policy is rate optimal in the sense that the rate of the loss (in terms of the dependency on the length of the time horizon) matches that of the theoretical lower bound. Among other insights, our results also highlight the benefit of incorporating the “pooling effect” in designing a dynamic adjustment scheme.
Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.0668.
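As background for the base-stock primitive that a LaBS policy builds on, the following is a minimal lost-sales simulation sketch. It is illustrative only: the single store, ample warehouse stock, and uniform demand are assumptions, and this is not the authors' Lagrangian policy, just the order-up-to mechanic with lost excess demand.

```python
import random

def simulate_base_stock(S, demand_sampler, periods, seed=0):
    """Simulate one store under a static base-stock (order-up-to-S) policy
    with lost sales. Each period the store is replenished up to level S
    (assuming the warehouse can always supply), then random demand arrives;
    unmet demand is lost. Returns (total sales, total lost sales)."""
    rng = random.Random(seed)
    sales = lost = 0
    for _ in range(periods):
        inventory = S                    # order-up-to replenishment
        d = demand_sampler(rng)
        served = min(d, inventory)       # excess demand is lost
        sales += served
        lost += d - served
    return sales, lost

sales, lost = simulate_base_stock(
    S=5, demand_sampler=lambda rng: rng.randint(0, 10), periods=10_000)
print(sales, lost)
```

Adaptive policies of the kind studied in the paper would re-tune the level S at a small number of adjustment epochs using realized demands, rather than keeping it static.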
“Adaptive Lagrangian Policies for a Multiwarehouse, Multistore Inventory System with Lost Sales,” by Xiuli Chao, Stefanus Jasin, and Sentao Miao. Operations Research, published April 16, 2024. https://doi.org/10.1287/opre.2022.0668
This paper makes progress toward learning Nash equilibria in two-player, zero-sum Markov games from offline data. Specifically, consider a γ-discounted, infinite-horizon Markov game with S states, in which the max-player has A actions and the min-player has B actions. We propose a pessimistic model–based algorithm with Bernstein-style lower confidence bounds—called the value iteration with lower confidence bounds for zero-sum Markov games—that provably finds an ε-approximate Nash equilibrium with a sample complexity no larger than C⋆_clipped S(A + B) / ((1 − γ)^3 ε^2) (up to some log factor). Here, C⋆_clipped is some unilateral clipped concentrability coefficient that reflects the coverage and distribution shift of the available data (vis-à-vis the target data), and the target accuracy ε can be any value within (0, 1/(1 − γ)]. Our sample complexity bound strengthens prior art by a factor of min{A, B}, achieving minimax optimality for a broad regime of interest. An appealing feature of our result lies in its algorithmic simplicity, which reveals the unnecessity of variance reduction and sample splitting in achieving sample optimality.
Funding: Y. Yan is supported in part by the Charlotte Elizabeth Procter Honorific Fellowship from Princeton University and the Norbert Wiener Postdoctoral Fellowship from MIT. Y. Chen is supported in part by the Alfred P. Sloan Research Fellowship, the Google Research Scholar Award, the Air Force Office of Scientific Research [Grant FA9550-22-1-0198], the Office of Naval Research [Grant N00014-22-1-2354], and the National Science Foundation [Grants CCF-2221009, CCF-1907661, IIS-2218713, DMS-2014279, and IIS-2218773]. J. Fan is supported in part by the National Science Foundation [Grants DMS-1712591, DMS-2052926, DMS-2053832, and DMS-2210833] and the Office of Naval Research [Grant N00014-22-1-2340].
Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.0342.
“Model-Based Reinforcement Learning for Offline Zero-Sum Markov Games,” by Yuling Yan, Gen Li, Yuxin Chen, and Jianqing Fan. Operations Research, published April 2, 2024. https://doi.org/10.1287/opre.2022.0342
“Projected Inventory-Level Policies for Lost Sales Inventory Systems: Asymptotic Optimality in Two Regimes,” by Willem van Jaarsveld and Joachim Arts. Operations Research, Ahead of Print, April 1, 2024. https://doi.org/10.1287/opre.2021.0032
Malleable scheduling is a model that captures the possibility of parallelization to expedite the completion of time-critical tasks. A malleable job can be allocated and processed simultaneously on multiple machines, occupying the same time interval on all these machines. We study a general version of this setting, in which the functions determining the joint processing speed of machines for a given job follow different discrete concavity assumptions (subadditivity, fractional subadditivity, submodularity, and matroid ranks). We show that under these assumptions, the problem of scheduling malleable jobs at minimum makespan can be approximated by a considerably simpler assignment problem. Moreover, we provide efficient approximation algorithms for both the scheduling and the assignment problem, with increasingly stronger guarantees for increasingly stronger concavity assumptions, including a logarithmic approximation factor for the case of submodular processing speeds and a constant approximation factor when processing speeds are determined by matroid rank functions. Computational experiments indicate that our algorithms outperform the theoretical worst-case guarantees.
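To make the "matroid rank" processing-speed assumption concrete, here is a minimal sketch using a uniform-matroid rank (an assumed example, not the paper's algorithm): speed grows one-for-one with the number of distinct machines allocated, then saturates at a capacity, capturing the diminishing returns that submodularity formalizes.

```python
def rank_speed(machines, capacity=3):
    """Uniform-matroid rank as a processing speed: adding distinct machines
    helps linearly up to `capacity`, after which extra machines add nothing.
    (Uniform-matroid ranks are a special case of matroid rank functions.)"""
    return min(len(set(machines)), capacity)

def completion_time(work, machines):
    """Time for a malleable job of size `work` processed jointly on the
    given machine set, all occupying the same time interval."""
    speed = rank_speed(machines)
    return float("inf") if speed == 0 else work / speed
```

For example, a job of size 6 finishes in 3 time units on two machines but cannot go below 2 time units no matter how many machines it receives, since the rank caps the joint speed at 3.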
Funding: D. Fotakis received financial support from the Hellenic Foundation for Research and Innovation (H.F.R.I.) [“First Call for H.F.R.I. Research Projects to Support Faculty Members and Researchers and the Procurement of High-Cost Research Equipment Grant,” Project BALSAM, HFRI-FM17-1424]. J. Matuschke received financial support from the Fonds Wetenschappelijk Onderzoek – Vlaanderen [Research Project G072520N “Optimization and Analytics for Stochastic and Robust Project Scheduling”]. O. Papadigenopoulos received financial support from the National Science Foundation Institute for Machine Learning [Award 2019844].
Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.0168.
“Assigning and Scheduling Generalized Malleable Jobs Under Subadditive or Submodular Processing Speeds,” by Dimitris Fotakis, Jannik Matuschke, and Orestis Papadigenopoulos. Operations Research, published March 28, 2024. https://doi.org/10.1287/opre.2022.0168
Jinhui Han, Xiaolong Li, Suresh P. Sethi, Chi Chung Siu, S. Yam
Analyzing Production-Inventory Systems with General Demand: Cost Minimization and Risk Analytics

Frequent production rate changes are prohibitive because of high setup costs or setup times in producing such items as sugar, glass, computer displays, and cell-free proteins. Thus, constant production rates are deployed for producing these items even when their demands are random. In “Production Management with General Demands and Lost Sales,” Han, Li, Sethi, Siu, and Yam obtain the optimal constant production rate for a production-inventory system with Lévy demand for long-run average and expected discounted cost objectives, explicitly in some cases and numerically in general with a Fourier-cosine scheme they develop. This scheme can help in computing risk analytics of the inventory system, such as stockout probability and expected shortfall. These measures are particularly significant for assessing supply resilience, especially for emergency products or services like medicines and healthcare equipment. This study’s analytical and numerical findings contribute to enhancing efficiency and decision making in production management.
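The paper computes risk analytics such as stockout probability with a Fourier-cosine scheme; a far cruder alternative, sketched here under assumed parameters, is plain Monte Carlo with compound Poisson demand (a simple example of a Lévy demand process) against a constant production rate. This is an illustration of the quantity being estimated, not the authors' method.

```python
import random

def stockout_probability(rate, horizon, x0, lam, mean_jump,
                         n_paths=2000, dt=0.1, seed=1):
    """Monte Carlo estimate of the probability that inventory hits zero
    within `horizon`, when production runs at a constant `rate` starting
    from inventory `x0`, and demand is a compound Poisson process:
    jumps arrive at intensity `lam` with exponential sizes of mean
    `mean_jump`. Euler discretization with step `dt` (an approximation)."""
    rng = random.Random(seed)
    hits = 0
    steps = int(horizon / dt)
    for _ in range(n_paths):
        x = x0
        for _ in range(steps):
            x += rate * dt                       # continuous production
            if rng.random() < lam * dt:          # a demand jump this step
                x -= rng.expovariate(1.0 / mean_jump)
            if x <= 0:                           # stockout: demand is lost
                hits += 1
                break
    return hits / n_paths
```

Sweeping `rate` over a grid and weighing stockout risk against holding cost is the kind of trade-off the paper's constant-rate optimization formalizes.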
“Technical Note—Production Management with General Demands and Lost Sales,” by Jinhui Han, Xiaolong Li, Suresh P. Sethi, Chi Chung Siu, and S. Yam. Operations Research, published March 28, 2024. https://doi.org/10.1287/opre.2022.0191
Saeed Alaei, Alexandre Belloni, Ali Makhdoumi, Azarakhsh Malekian
Consider a mechanism run by an auctioneer who can use both payment and inspection instruments to incentivize agents. The timeline of the events is as follows. Based on a prespecified allocation rule and the reported values of agents, the auctioneer allocates the item and secures the reported values as deposits. The auctioneer then inspects the values of agents and, using a prespecified reward rule, rewards the ones who have reported truthfully. Using techniques from convex analysis and calculus of variations, for any distribution of values, we fully characterize the optimal mechanism for a single agent. Using Border’s theorem and duality, we find conditions under which our characterization extends to multiple agents. Interestingly, the optimal allocation function, unlike the classic settings without inspection, is not a threshold strategy and instead is an increasing and continuous function of the types. We also present an implementation of our optimal auction and show that it achieves a higher revenue than auctions in classic settings without inspection. This is because the inspection enables the auctioneer to charge payments closer to the agents’ true values without creating incentives for them to deviate to lower types.
Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2020.0651.
“Optimal Auction Design with Deferred Inspection and Reward,” by Saeed Alaei, Alexandre Belloni, Ali Makhdoumi, and Azarakhsh Malekian. Operations Research, published March 28, 2024. https://doi.org/10.1287/opre.2020.0651
John R. Birge, Hongfan (Kevin) Chen, N. Bora Keskin, Amy Ward
Operations Research, Ahead of Print.
“To Interfere or Not To Interfere: Information Revelation and Price-Setting Incentives in a Multiagent Learning Environment,” by John R. Birge, Hongfan (Kevin) Chen, N. Bora Keskin, and Amy Ward. Operations Research, Ahead of Print, March 27, 2024. https://doi.org/10.1287/opre.2023.0363
Dimitris Bertsimas, Vassilis Digalakis, Michael Lingzhi Li, Omar Skali Lami
We introduce the framework of slowly varying regression under sparsity, which allows sparse regression models to vary slowly and sparsely. We formulate the problem of parameter estimation as a mixed-integer optimization problem and demonstrate that it can be reformulated exactly as a binary convex optimization problem through a novel relaxation. The relaxation utilizes a new equality on Moore-Penrose inverses that convexifies the nonconvex objective function while coinciding with the original objective on all feasible binary points. This allows us to solve the problem significantly more efficiently and to provable optimality using a cutting plane–type algorithm. We develop a highly optimized implementation of this algorithm, which substantially improves upon the asymptotic computational complexity of a straightforward implementation. We further develop a fast heuristic method that is guaranteed to produce a feasible solution and, as we empirically illustrate, generates high-quality warm-start solutions for the binary optimization problem. To tune the framework’s hyperparameters, we propose a practical procedure relying on binary search that, under certain assumptions, is guaranteed to recover the true model parameters. We show, on both synthetic and real-world data sets, that the resulting algorithm outperforms competing formulations in comparable times across a variety of metrics, including estimation accuracy, predictive power, and computational time, and is highly scalable, enabling us to train models with tens of thousands of parameters.
Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.0330.
“Slowly Varying Regression Under Sparsity,” by Dimitris Bertsimas, Vassilis Digalakis, Michael Lingzhi Li, and Omar Skali Lami. Operations Research, published March 27, 2024. https://doi.org/10.1287/opre.2022.0330
This study proposes two new dynamic assignment algorithms to match refugees and asylum seekers to geographic localities within a host country. The first, currently implemented in a multiyear randomized control trial in Switzerland, seeks to maximize the average predicted employment level (or any measured outcome of interest) of refugees through a minimum-discord online assignment algorithm. The performance of this algorithm is tested on real refugee resettlement data from both the United States and Switzerland, where we find that it is able to achieve near-optimal expected employment, compared with the hindsight-optimal solution, and is able to improve upon the status quo procedure by 40%–50%. However, pure outcome maximization can result in a periodically imbalanced allocation to the localities over time, leading to implementation difficulties and an undesirable workflow for resettlement resources and agents. To address these problems, the second algorithm balances the goal of improving refugee outcomes with the desire for an even allocation over time. We find that this algorithm can achieve near-perfect balance over time with only a small loss in expected employment compared with the employment-maximizing algorithm. In addition, the allocation balancing algorithm offers a number of ancillary benefits compared with pure outcome maximization, including robustness to unknown arrival flows and greater exploration.
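The tension between outcome maximization and balanced allocation can be sketched with a stylized pacing heuristic (an illustrative toy, not the paper's algorithm; the score function, penalty weight, and locality names are all assumptions): each arrival goes to the locality with the best predicted outcome, discounted by how far that locality is ahead of its proportional pace.

```python
def balanced_assign(arrivals, capacities, penalty=0.5):
    """Stylized online assignment with allocation balancing.

    `arrivals` is a sequence of dicts mapping locality -> predicted outcome
    (e.g., employment probability); `capacities` maps locality -> capacity.
    Each arrival is assigned greedily to the locality with the highest score,
    penalized by its overage relative to a proportional pacing target.
    Returns the list of chosen localities, in arrival order."""
    total = sum(capacities.values())
    counts = {loc: 0 for loc in capacities}
    choices = []
    for t, scores in enumerate(arrivals, start=1):
        def adjusted(loc):
            pace = t * capacities[loc] / total   # proportional target so far
            overage = max(0, counts[loc] + 1 - pace)
            return scores[loc] - penalty * overage
        best = max(capacities, key=adjusted)
        counts[best] += 1
        choices.append(best)
    return choices
```

With a large penalty the allocation tracks the pacing targets almost exactly; with penalty zero it reduces to pure outcome maximization, mirroring the two regimes the abstract contrasts.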
Funding: Financial support from the Charles Koch Foundation, Stanford Impact Labs, the Rockefeller Foundation, Google.org, Schmidt Futures, the Stanford Institute for Human-Centered Artificial Intelligence, and Stanford University is gratefully acknowledged.
Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.0445.
“Outcome-Driven Dynamic Refugee Assignment with Allocation Balancing,” by Kirk Bansak and Elisabeth Paulson. Operations Research, published March 25, 2024. https://doi.org/10.1287/opre.2022.0445
“A Random Consideration Set Model for Demand Estimation, Assortment Optimization, and Pricing,” by Guillermo Gallego and Anran Li. Operations Research, Ahead of Print, March 25, 2024. https://doi.org/10.1287/opre.2019.0333