
Operations Research: Latest Articles

Drone-Delivery Network for Opioid Overdose: Nonlinear Integer Queueing-Optimization Models and Methods
IF 2.7 | Zone 3, Management | Q3 MANAGEMENT | Pub Date: 2024-05-07 | DOI: 10.1287/opre.2022.0489
Miguel A. Lejeune, Wenbo Ma

We propose a new stochastic emergency network design model that uses a fleet of drones to quickly deliver naloxone in response to opioid overdoses. The network is represented as a collection of M/G/K queueing systems in which the capacity K of each system is a decision variable and the service time is modeled as a decision-dependent random variable. The model is a queueing-based optimization problem that locates fixed servers (drone bases) and mobile servers (drones) and determines the drone dispatching decisions; it takes the form of a nonlinear integer problem that is intractable in its original form. We develop an efficient reformulation and algorithmic framework. Our approach reformulates the multiple nonlinearities (fractional, polynomial, exponential, and factorial terms) to give a mixed-integer linear programming (MILP) formulation. We demonstrate its generalizability and show that the problem of minimizing the average response time of a collection of M/G/K queueing systems with unknown capacity K is always MILP-representable. We design an outer approximation branch-and-cut algorithmic framework that is computationally efficient and scales well. The analysis based on real-life data reveals that, in Virginia Beach, drones can (1) decrease the response time by 82%, (2) increase the survival chance by more than 273%, (3) save up to 33 additional lives per year, and (4) provide up to 279 additional quality-adjusted life years annually.
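
To give a feel for why the number of servers K enters the response time through fractional, exponential, and factorial terms, the following is a minimal sketch that evaluates a textbook approximation of the mean queueing delay in an M/G/K system. It is not the authors' model: it assumes the Erlang C formula for the M/M/K delay probability and the Allen-Cunneen scaling by the squared coefficient of variation of the service time, and the function names and numeric inputs are hypothetical.

```python
from math import factorial


def erlang_c(k: int, offered_load: float) -> float:
    """Probability that an arrival has to wait in an M/M/k queue (Erlang C).

    offered_load is arrival_rate * mean_service_time; stability needs offered_load < k.
    """
    utilization = offered_load / k
    if utilization >= 1.0:
        raise ValueError("queue is unstable: offered load must be below k")
    # Term for "all k servers busy" (factorial and fractional in k).
    tail = offered_load**k / (factorial(k) * (1.0 - utilization))
    # Terms for "fewer than k servers busy".
    head = sum(offered_load**n / factorial(n) for n in range(k))
    return tail / (head + tail)


def mgk_mean_wait(k: int, arrival_rate: float, mean_service: float, scv_service: float) -> float:
    """Approximate mean waiting time in queue for M/G/k (Allen-Cunneen approximation).

    scv_service is the squared coefficient of variation of the service time
    (1.0 for exponential service, 0.0 for deterministic service).
    """
    offered_load = arrival_rate * mean_service
    # Exact M/M/k mean wait: C(k, a) / (k * mu - lambda) = C(k, a) * mean_service / (k - a).
    wait_mmk = erlang_c(k, offered_load) * mean_service / (k - offered_load)
    return 0.5 * (1.0 + scv_service) * wait_mmk


if __name__ == "__main__":
    # Hypothetical inputs: 1.5 requests per hour, 1 hour mean service, moderate variability.
    for servers in range(2, 6):
        print(servers, mgk_mean_wait(servers, arrival_rate=1.5, mean_service=1.0, scv_service=0.5))
```

Because K appears inside a factorial, a power, and a ratio, choosing it jointly with location and dispatching decisions gives a nonlinear integer problem, which is the kind of structure the abstract says is reformulated as an MILP.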

Funding: M. A. Lejeune acknowledges the support of the National Science Foundation [Grant ECCS-2114100] and the Office of Naval Research [Grant N00014-22-1-2649].

Supplemental Material: The online appendices are available at https://doi.org/10.1287/opre.2022.0489.

Citations: 0
Pricing and Positioning of Horizontally Differentiated Products with Incomplete Demand Information
IF 2.7 | Zone 3, Management | Q3 MANAGEMENT | Pub Date: 2024-04-29 | DOI: 10.1287/opre.2021.0093
Arnoud V. den Boer, Boxiao Chen, Yining Wang

We consider the problem of determining the optimal prices and product configurations of horizontally differentiated products when customers purchase according to a locational (Hotelling) choice model and the problem parameters are initially unknown to the decision maker. For both the single-product and multiple-product settings, we propose a data-driven algorithm that learns the optimal prices and product configurations from accumulating sales data, and we show that its regret (the expected cumulative loss caused by not using optimal decisions) after $T$ time periods is $O(T^{1/2+o(1)})$. We accompany this result by showing that, even in the single-product setting, the regret of any algorithm is bounded from below by a constant times $T^{1/2}$, implying that our algorithms are asymptotically near optimal. In an extension, we show how our algorithm can be adapted for the case of fixed locations. A numerical study that compares our algorithms with three benchmarks shows that our algorithm is also competitive on a finite time horizon.

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2021.0093.

Citations: 0
Market Entry and Competition Under Network Effects
IF 2.7 | Zone 3, Management | Q3 MANAGEMENT | Pub Date: 2024-04-29 | DOI: 10.1287/opre.2022.0275
Yinbo Feng, Ming Hu

We consider a three-stage game in which, first, a large number of potential firms make entry decisions, then those who choose to stay in the market decide on the investment (quality) level in each product, and last, customers with heterogeneous preferences arrive sequentially to make (random) purchase decisions based on product quality and historical sales under the network effect according to a discrete choice model. We characterize such a random purchase process and show that a growing network effect always contributes to more sales concentration ex post on a small number of products. Perhaps surprisingly, we further show several phase-changing phenomena regarding equilibrium outcomes with respect to the network effect’s strength. In particular, the equilibrium product variety (respectively, quality investment) first decreases (respectively, increases) and then increases (respectively, decreases) as the network effect grows. Specifically, when the strength of the network effect is below a threshold, an increasing network effect would shift more sales toward those products with higher quality, preventing more products from entering the market ex ante and inducing firms to adopt the high-budget equilibrium strategy by making a small number of high-quality products, which is consistent with the blockbuster phenomenon. When the strength of the network effect is above the threshold, the network effect would easily cause the market to be concentrated on a few products ex post; even some low-quality products may have a chance to become a “hit.” Interestingly, in this case, when the network effect is growing, the ex ante equilibrium product variety will be wider, and firms adopt the low-budget equilibrium strategy by making a (relatively) large number of low-quality products, a finding consistent with the long tail theory. We then establish the robustness of the previous main insights by accounting for endogenized pricing and multiproducts carried by each firm.

Funding: Y. Feng was financially supported by the Major Program of National Natural Science Foundation of China [Grants 72192830 and 7219283X], Fundamental Research Funds for the Central Universities, and Program for Innovative Research of Shanghai University of Finance and Economics. M. Hu was supported by the Natural Sciences and Engineering Research Council of Canada [Grants RGPIN-2015-06757 and RGPIN-2021-04295].

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.0275.

Citations: 0
Adaptive Lagrangian Policies for a Multiwarehouse, Multistore Inventory System with Lost Sales
IF 2.7 | Zone 3, Management | Q3 MANAGEMENT | Pub Date: 2024-04-16 | DOI: 10.1287/opre.2022.0668
Xiuli Chao, Stefanus Jasin, Sentao Miao

We consider the inventory control problem of a multiwarehouse, multistore system over a time horizon when the warehouses receive no external replenishment. This problem is prevalent in retail settings, and it is referred to in the work of [Jackson PL (1988) Stock allocation in a two-echelon distribution system or “what to do until your ship comes in.” Management Sci. 34(7):880–895] as the problem of “what to do until your (external) shipment comes in.” The warehouses are stocked with initial inventories, and the stores are dynamically replenished from the warehouses in each period of the planning horizon. Excess demand in each period at a store is lost. The optimal policy for this problem is complex and state dependent, and because of the curse of dimensionality, computing the optimal policy using standard dynamic programming is numerically intractable. Static Lagrangian base-stock (LaBS) policies have been developed for this problem [Miao S, Jasin S, Chao X (2022) Asymptotically optimal Lagrangian policies for one-warehouse multi-store system with lost sales. Oper. Res. 70(1):141–159] and shown to be asymptotically optimal. In this paper, we develop adaptive policies that dynamically adjust the control parameters of a vanilla static LaBS policy using realized historical demands. We show, both theoretically and numerically, that adaptive policies significantly improve the performance of the LaBS policy, with the magnitude of improvement characterized by the number of policy adjustments. In particular, when the number of adjustments is a logarithm of the length of time horizon, the policy is rate optimal in the sense that the rate of the loss (in terms of the dependency on the length of the time horizon) matches that of the theoretical lower bound. Among other insights, our results also highlight the benefit of incorporating the “pooling effect” in designing a dynamic adjustment scheme.
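
The setting described above (stores replenished from a warehouse that receives no external shipments, with unmet demand lost) can be sketched in a few lines. The following is only an illustration under simplifying assumptions that are not taken from the paper: a single warehouse, zero shipping lead time, stores served in index order, and fixed base-stock levels chosen by hand rather than derived from a Lagrangian relaxation. The function name and the numbers are hypothetical.

```python
def simulate_static_base_stock(warehouse_stock, base_stock_levels, demands):
    """Simulate a static base-stock policy with lost sales and no external replenishment.

    warehouse_stock: initial units held at the warehouse.
    base_stock_levels: order-up-to level for each store.
    demands: one list per period, giving the demand realized at each store.
    Returns (units sold, units of demand lost, units left at the warehouse).
    """
    on_hand = [0] * len(base_stock_levels)
    sold_total = 0
    lost_total = 0
    for period_demand in demands:
        # Replenish each store up to its base-stock level, limited by warehouse stock.
        for store, level in enumerate(base_stock_levels):
            shipment = min(max(level - on_hand[store], 0), warehouse_stock)
            on_hand[store] += shipment
            warehouse_stock -= shipment
        # Demand is served from store inventory; any excess is lost.
        for store, demand in enumerate(period_demand):
            sold = min(demand, on_hand[store])
            on_hand[store] -= sold
            sold_total += sold
            lost_total += demand - sold
    return sold_total, lost_total, warehouse_stock


if __name__ == "__main__":
    result = simulate_static_base_stock(
        warehouse_stock=10,
        base_stock_levels=[3, 2],
        demands=[[2, 3], [4, 1], [3, 3]],
    )
    print(result)
```

In this sketch the base-stock levels never change. The adaptive policies in the abstract differ precisely in that the control parameters are re-tuned from realized demand a limited number of times over the horizon.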

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.0668.

Citations: 0
Model-Based Reinforcement Learning for Offline Zero-Sum Markov Games
IF 2.7 | Zone 3, Management | Q3 MANAGEMENT | Pub Date: 2024-04-02 | DOI: 10.1287/opre.2022.0342
Yuling Yan, Gen Li, Yuxin Chen, Jianqing Fan

This paper makes progress toward learning Nash equilibria in two-player, zero-sum Markov games from offline data. Specifically, consider a $\gamma$-discounted, infinite-horizon Markov game with $S$ states, in which the max-player has $A$ actions and the min-player has $B$ actions. We propose a pessimistic model-based algorithm with Bernstein-style lower confidence bounds, called the value iteration with lower confidence bounds for zero-sum Markov games, that provably finds an $\varepsilon$-approximate Nash equilibrium with a sample complexity no larger than $\frac{C^{\star}_{\mathsf{clipped}} S(A+B)}{(1-\gamma)^{3}\varepsilon^{2}}$ (up to some log factor). Here, $C^{\star}_{\mathsf{clipped}}$ is some unilateral clipped concentrability coefficient that reflects the coverage and distribution shift of the available data (vis-à-vis the target data), and the target accuracy $\varepsilon$ can be any value within $\left(0, \frac{1}{1-\gamma}\right]$. Our sample complexity bound strengthens prior art by a factor of $\min\{A, B\}$, achieving minimax optimality for a broad regime of interest. An appealing feature of our result lies in its algorithmic simplicity, which shows that variance reduction and sample splitting are unnecessary for achieving sample optimality.

Funding: Y. Yan is supported in part by the Charlotte Elizabeth Procter Honorific Fellowship from Princeton University and the Norbert Wiener Postdoctoral Fellowship from MIT. Y. Chen is supported in part by the Alfred P. Sloan Research Fellowship, the Google Research Scholar Award, the Air Force Office of Scientific Research [Grant FA9550-22-1-0198], the Office of Naval Research [Grant N00014-22-1-2354], and the National Science Foundation [Grants CCF-2221009, CCF-1907661, IIS-2218713, DMS-2014279, and IIS-2218773]. J. Fan is supported in part by the National Science Foundation [Grants DMS-1712591, DMS-2052926, DMS-2053832, and DMS-2210833] and the Office of Naval Research [Grant N00014-22-1-2340].

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.0342.
Citations: 0
Projected Inventory-Level Policies for Lost Sales Inventory Systems: Asymptotic Optimality in Two Regimes
IF 2.7 | Zone 3, Management | Q3 MANAGEMENT | Pub Date: 2024-04-01 | DOI: 10.1287/opre.2021.0032
Willem van Jaarsveld, Joachim Arts
Operations Research, Ahead of Print.
Citations: 0
Assigning and Scheduling Generalized Malleable Jobs Under Subadditive or Submodular Processing Speeds
IF 2.7 Zone 3 Management Q3 MANAGEMENT Pub Date: 2024-03-28 DOI: 10.1287/opre.2022.0168
Dimitris Fotakis, Jannik Matuschke, Orestis Papadigenopoulos

Malleable scheduling is a model that captures the possibility of parallelization to expedite the completion of time-critical tasks. A malleable job can be allocated and processed simultaneously on multiple machines, occupying the same time interval on all these machines. We study a general version of this setting, in which the functions determining the joint processing speed of machines for a given job follow different discrete concavity assumptions (subadditivity, fractional subadditivity, submodularity, and matroid ranks). We show that under these assumptions, the problem of scheduling malleable jobs at minimum makespan can be approximated by a considerably simpler assignment problem. Moreover, we provide efficient approximation algorithms for both the scheduling and the assignment problem, with increasingly stronger guarantees for increasingly stronger concavity assumptions, including a logarithmic approximation factor for the case of submodular processing speeds and a constant approximation factor when processing speeds are determined by matroid rank functions. Computational experiments indicate that our algorithms outperform the theoretical worst-case guarantees.

Funding: D. Fotakis received financial support from the Hellenic Foundation for Research and Innovation (H.F.R.I.) [“First Call for H.F.R.I. Research Projects to Support Faculty Members and Researchers and the Procurement of High-Cost Research Equipment Grant,” Project BALSAM, HFRI-FM17-1424]. J. Matuschke received financial support from the Fonds Wetenschappelijk Onderzoek-Vlanderen [Research Project G072520N “Optimization and Analytics for Stochastic and Robust Project Scheduling”]. O. Papadigenopoulos received financial support from the National Science Foundation Institute for Machine Learning [Award 2019844].

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.0168.
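
The concavity assumptions named in the abstract can be illustrated with a small brute-force check. The sketch below is not taken from the paper: the function names, the four-machine ground set, and the two example speed functions are all hypothetical. It only tests the standard textbook definitions of subadditivity, g(A ∪ B) ≤ g(A) + g(B), and submodularity, g(A ∪ B) + g(A ∩ B) ≤ g(A) + g(B), on every pair of machine subsets.

```python
from itertools import combinations


def subsets(ground):
    """Yield every subset of the ground set as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_subadditive(g, ground, tol=1e-12):
    """Check g(A | B) <= g(A) + g(B) for all pairs of subsets."""
    all_sets = list(subsets(ground))
    return all(g(a | b) <= g(a) + g(b) + tol for a in all_sets for b in all_sets)


def is_submodular(g, ground, tol=1e-12):
    """Check g(A | B) + g(A & B) <= g(A) + g(B) for all pairs of subsets."""
    all_sets = list(subsets(ground))
    return all(
        g(a | b) + g(a & b) <= g(a) + g(b) + tol
        for a in all_sets
        for b in all_sets
    )


# Hypothetical ground set of four machines.
machines = frozenset({"m1", "m2", "m3", "m4"})


def uniform_matroid_rank(s, k=2):
    """Rank function of a uniform matroid: at most k machines contribute speed."""
    return min(len(s), k)


def superadditive_speed(s):
    """A speed function with increasing returns; it violates both assumptions."""
    return len(s) ** 2


print(is_submodular(uniform_matroid_rank, machines))
print(is_subadditive(uniform_matroid_rank, machines))
print(is_submodular(superadditive_speed, machines))
```

The enumeration is exponential in the number of machines, so it is only meant for toy instances; it shows what the assumptions mean rather than how the paper's algorithms work.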

Citations: 0
Optimal Auction Design with Deferred Inspection and Reward
IF 2.7 Zone 3 Management Q3 MANAGEMENT Pub Date: 2024-03-28 DOI: 10.1287/opre.2020.0651
Saeed Alaei, Alexandre Belloni, Ali Makhdoumi, Azarakhsh Malekian

Consider a mechanism run by an auctioneer who can use both payment and inspection instruments to incentivize agents. The timeline of the events is as follows. Based on a prespecified allocation rule and the reported values of agents, the auctioneer allocates the item and secures the reported values as deposits. The auctioneer then inspects the values of agents and, using a prespecified reward rule, rewards the ones who have reported truthfully. Using techniques from convex analysis and calculus of variations, for any distribution of values, we fully characterize the optimal mechanism for a single agent. Using Border’s theorem and duality, we find conditions under which our characterization extends to multiple agents. Interestingly, the optimal allocation function, unlike the classic settings without inspection, is not a threshold strategy and instead is an increasing and continuous function of the types. We also present an implementation of our optimal auction and show that it achieves a higher revenue than auctions in classic settings without inspection. This is because the inspection enables the auctioneer to charge payments closer to the agents’ true values without creating incentives for them to deviate to lower types.

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2020.0651.

Citations: 0
To Interfere or Not To Interfere: Information Revelation and Price-Setting Incentives in a Multiagent Learning Environment
IF 2.7 Zone 3 Management Q3 MANAGEMENT Pub Date: 2024-03-27 DOI: 10.1287/opre.2023.0363
John R. Birge, Hongfan (Kevin) Chen, N. Bora Keskin, Amy Ward
Operations Research, Ahead of Print.
Citations: 0
Slowly Varying Regression Under Sparsity
IF 2.7 Zone 3 Management Q3 MANAGEMENT Pub Date: 2024-03-27 DOI: 10.1287/opre.2022.0330
Dimitris Bertsimas, Vassilis Digalakis, Michael Lingzhi Li, Omar Skali Lami

We introduce the framework of slowly varying regression under sparsity, which allows sparse regression models to vary slowly and sparsely. We formulate the problem of parameter estimation as a mixed-integer optimization problem and demonstrate that it can be reformulated exactly as a binary convex optimization problem through a novel relaxation. The relaxation utilizes a new equality on Moore-Penrose inverses that convexifies the nonconvex objective function while coinciding with the original objective on all feasible binary points. This allows us to solve the problem significantly more efficiently and to provable optimality using a cutting plane–type algorithm. We develop a highly optimized implementation of such algorithm, which substantially improves upon the asymptotic computational complexity of a straightforward implementation. We further develop a fast heuristic method that is guaranteed to produce a feasible solution and, as we empirically illustrate, generates high-quality warm-start solutions for the binary optimization problem. To tune the framework’s hyperparameters, we propose a practical procedure relying on binary search that, under certain assumptions, is guaranteed to recover the true model parameters. We show, on both synthetic and real-world data sets, that the resulting algorithm outperforms competing formulations in comparable times across a variety of metrics, including estimation accuracy, predictive power, and computational time, and is highly scalable, enabling us to train models with tens of thousands of parameters.

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.0330.

Citations: 0