A robust-to-dynamics optimization (RDO) problem is an optimization problem specified by two pieces of input: (i) a mathematical program (an objective function [Formula: see text] and a feasible set [Formula: see text]) and (ii) a dynamical system (a map [Formula: see text]). Its goal is to minimize f over the set [Formula: see text] of initial conditions that forever remain in [Formula: see text] under g. The focus of this paper is on the case where the mathematical program is a linear program and where the dynamical system is either a known linear map or an uncertain linear map that can change over time. In both cases, we study a converging sequence of polyhedral outer approximations and (lifted) spectrahedral inner approximations to [Formula: see text]. Our inner approximations are optimized with respect to the objective function f, and their semidefinite characterization, which has a semidefinite constraint of fixed size, is obtained by applying polar duality to convex sets that are invariant under (multiple) linear maps. We characterize three barriers that can stop convergence of the outer approximations to [Formula: see text] from being finite. We prove that once these barriers are removed, our inner and outer approximating procedures find an optimal solution and a certificate of optimality for the RDO problem in a finite number of steps. Moreover, in the case where the dynamics are linear, we show that this phenomenon occurs in a number of steps that can be computed in time polynomial in the bit size of the input data. Our analysis also leads to a polynomial-time algorithm for RDO instances where the spectral radius of the linear map is bounded above by any constant less than one. Finally, in our concluding section, we propose a broader research agenda for studying optimization problems with dynamical systems constraints, of which RDO is a special case.
Funding: O. Günlük was partially supported by the Office of Naval Research [Grant N00014-21-1-2575]. This work was partially funded by the Alfred P. Sloan Foundation, the Air Force Office of Scientific Research, Defense Advanced Research Projects Agency [Young Faculty Award], the National Science Foundation [Faculty Early Career Development Program Award], and Google [Faculty Award].
{"title":"Robust-to-Dynamics Optimization","authors":"Amir Ali Ahmadi, Oktay Günlük","doi":"10.1287/moor.2023.0116","DOIUrl":"https://doi.org/10.1287/moor.2023.0116","journal":{"name":"Mathematics of Operations Research","volume":"85 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","isOpenAccess":false,"RegionNum":3,"RegionCategory":"Mathematics"}
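The polyhedral outer approximations mentioned in the robust-to-dynamics abstract above can be sketched roughly as follows, assuming the feasible set is {x : Ax ≤ b} and the dynamics are the known linear map x ↦ Gx. The names `A`, `b`, `G`, `r` and both helper functions are illustrative assumptions, not notation or code from the paper. Keeping only the first r + 1 points of the trajectory gives the constraints A G^k x ≤ b for k = 0, ..., r, a polyhedron that contains every initial condition that stays feasible forever.

```python
def mat_mul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def outer_rows(A, b, G, r):
    """Stack the constraints A G^k x <= b for k = 0..r (hypothetical helper)."""
    rows, rhs = [], []
    AGk = [row[:] for row in A]  # starts as A G^0 = A
    for _ in range(r + 1):
        rows.extend(row[:] for row in AGk)
        rhs.extend(b)
        AGk = mat_mul(AGk, G)  # advance to A G^(k+1)
    return rows, rhs


def satisfies(rows, rhs, x, tol=1e-12):
    """Check whether x satisfies every stacked inequality."""
    return all(
        sum(a * xi for a, xi in zip(row, x)) <= bi + tol
        for row, bi in zip(rows, rhs)
    )
```

Minimizing the linear objective over the stacked system with any LP solver then gives a lower bound on the RDO optimum, since the stacked polyhedron is a relaxation of the true feasible set.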
This paper deals with composite optimization problems having the objective function formed as the sum of two terms; one has a Lipschitz continuous gradient along random subspaces and may be nonconvex, and the second term is simple and differentiable but possibly nonconvex and nonseparable. Under these settings, we design a stochastic coordinate proximal gradient method that takes into account the nonseparable composite form of the objective function. This algorithm achieves scalability by constructing at each iteration a local approximation model of the whole nonseparable objective function along a random subspace with user-determined dimension. We outline efficient techniques for selecting the random subspace, yielding an implementation that has a low cost per iteration while achieving fast convergence rates. We present a probabilistic worst-case complexity analysis for our stochastic coordinate proximal gradient method in convex and nonconvex settings; in particular, we prove high-probability bounds on the number of iterations before a given optimality level is achieved. Extensive numerical results also confirm the efficiency of our algorithm.
Funding: This work was supported by Norway Grants 2014-2021 [Grant ELO-Hyp 24/2020]; Unitatea Executiva pentru Finantarea Invatamantului Superior, a Cercetarii, Dezvoltarii si Inovarii [Grants PN-III-P4-PCE-2021-0720, L2O-MOC, nr 70/2022]; and the ITN-ETN project TraDE-OPT funded by the European Union’s Horizon 2020 Research and Innovation Programme under the Marie Skłodowska-Curie grant agreement [Grant 861137].
{"title":"Efficiency of Stochastic Coordinate Proximal Gradient Methods on Nonseparable Composite Optimization","authors":"Ion Necoara, Flavia Chorobura","doi":"10.1287/moor.2023.0044","DOIUrl":"https://doi.org/10.1287/moor.2023.0044","journal":{"name":"Mathematics of Operations Research","volume":"63 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","isOpenAccess":false,"RegionNum":3,"RegionCategory":"Mathematics"}
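To give a feel for the "random subspace" idea in the coordinate proximal gradient abstract above, here is a deliberately simplified stand-in: plain random coordinate-block gradient descent on a smooth nonseparable quadratic. It is not the proximal method analyzed in the paper. The objective, the step size, and every function name are assumptions chosen only for illustration.

```python
import random


def objective(x):
    """F(x) = 0.5*||x||^2 + 0.5*(sum x)^2; the second term is nonseparable."""
    s = sum(x)
    return 0.5 * sum(v * v for v in x) + 0.5 * s * s


def partial_gradient(x, coords):
    """Gradient of F restricted to the chosen coordinates."""
    s = sum(x)
    return {i: x[i] + s for i in coords}


def random_block_descent(x0, block_size, step, iters, seed=0):
    """Update a random block of coordinates per iteration (illustrative only)."""
    rng = random.Random(seed)
    x = list(x0)
    history = [objective(x)]
    for _ in range(iters):
        coords = rng.sample(range(len(x)), block_size)  # random coordinate subspace
        g = partial_gradient(x, coords)
        for i in coords:
            x[i] -= step * g[i]
        history.append(objective(x))
    return x, history
```

For this quadratic in dimension n, the Hessian has largest eigenvalue 1 + n, so a step of 1/(1 + n) cannot increase the objective whichever block is drawn. Each iteration touches only `block_size` coordinates, which is the source of the low per-iteration cost.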
The traveling tournament problem (TTP) is a hard but interesting sports scheduling problem inspired by Major League Baseball, which is to design a double round-robin schedule such that each pair of teams plays one game in each other’s home venue, minimizing the total distance traveled by all n teams (n is even). In this paper, we consider TTP-2 (i.e., TTP under the constraint that at most two consecutive home games or away games are allowed for each team). We propose practical algorithms for TTP-2 with improved approximation ratios. Because of the different structural properties of the problem, all known algorithms for TTP-2 are different for n/2 being odd and even, and our algorithms are also different for these two cases. For even n/2, our approximation ratio is [Formula: see text], improving the previous result of [Formula: see text]. For odd n/2, our approximation ratio is [Formula: see text], improving the previous result of [Formula: see text]. In practice, our algorithms are easy to implement. Experiments on well-known benchmark sets show that our algorithms beat previously known solutions for all instances with an average improvement of 5.66%.
Funding: This work was supported by the National Natural Science Foundation of China [Grants 62372095 and 62172077] and the Sichuan Natural Science Foundation [Grant 2023NSFSC0059].
{"title":"Practical Algorithms with Guaranteed Approximation Ratio for Traveling Tournament Problem with Maximum Tour Length 2","authors":"Jingyang Zhao, Mingyu Xiao","doi":"10.1287/moor.2022.0356","DOIUrl":"https://doi.org/10.1287/moor.2022.0356","journal":{"name":"Mathematics of Operations Research","volume":"36 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","isOpenAccess":false,"RegionNum":3,"RegionCategory":"Mathematics"}
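The two ingredients of the TTP-2 abstract above, the limit of two consecutive home or away games and the travel distance of a team, can be illustrated for a single team's schedule. This is a small sketch and not the paper's algorithm. The function names and the encoding of a schedule as a list of venue indices are assumptions made here, and the sketch follows the usual TTP convention that a team starts at home and returns home after its last game.

```python
def home_away_pattern(home, venues):
    """Encode a team's schedule as a string of 'H' (home) and 'A' (away)."""
    return "".join("H" if v == home else "A" for v in venues)


def longest_run(pattern):
    """Length of the longest block of identical consecutive symbols."""
    best = run = 0
    prev = None
    for ch in pattern:
        run = run + 1 if ch == prev else 1
        best = max(best, run)
        prev = ch
    return best


def respects_ttp2(home, venues):
    """True if the team never plays more than two consecutive home or away games."""
    return longest_run(home_away_pattern(home, venues)) <= 2


def travel_distance(home, venues, dist):
    """Total distance when the team starts at home and returns home at the end."""
    route = [home] + list(venues) + [home]
    return sum(dist[a][b] for a, b in zip(route, route[1:]))
```

A full schedule checker would also have to verify the double round-robin structure across all teams; the sketch covers only the per-team constraint and cost.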
Diminishing returns (DR)–submodular functions encompass a broad class of functions that are generally nonconvex and nonconcave. We study the problem of minimizing any DR-submodular function with continuous and general integer variables under box constraints and, possibly, additional monotonicity constraints. We propose valid linear inequalities for the epigraph of any DR-submodular function under the constraints. We further provide the complete convex hull of such an epigraph, which, surprisingly, turns out to be polyhedral. We propose a polynomial-time exact separation algorithm for our proposed valid inequalities with which we first establish the polynomial-time solvability of this class of mixed-integer nonlinear optimization problems.
Funding: This work was supported by the Office of Naval Research Global [Grant N00014-22-1-2602].
{"title":"On Constrained Mixed-Integer DR-Submodular Minimization","authors":"Qimeng Yu, Simge Küçükyavuz","doi":"10.1287/moor.2022.0320","DOIUrl":"https://doi.org/10.1287/moor.2022.0320","journal":{"name":"Mathematics of Operations Research","volume":"58 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","isOpenAccess":false,"RegionNum":3,"RegionCategory":"Mathematics"}
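As background for the DR-submodular abstract above, the diminishing-returns property on an integer box can be checked by brute force: for all x ≤ y (componentwise) and every coordinate i that can still be incremented at y, the gain f(x + e_i) − f(x) must be at least f(y + e_i) − f(y). The sketch below uses that standard definition restricted to integer points. The function name, the `upper` argument, and the tolerance are assumptions for illustration, and the enumeration is exponential, so it only suits tiny examples.

```python
from itertools import product


def is_dr_submodular(f, upper, tol=1e-9):
    """Brute-force check of diminishing returns on the box {0..upper[0]} x ..."""
    n = len(upper)
    points = list(product(*(range(u + 1) for u in upper)))
    for x in points:
        for y in points:
            if any(xi > yi for xi, yi in zip(x, y)):
                continue  # only pairs with x <= y componentwise matter
            for i in range(n):
                if y[i] + 1 > upper[i]:
                    continue  # y + e_i would leave the box
                x_up = x[:i] + (x[i] + 1,) + x[i + 1:]
                y_up = y[:i] + (y[i] + 1,) + y[i + 1:]
                if f(x_up) - f(x) < f(y_up) - f(y) - tol:
                    return False
    return True
```

For instance, f(x) = −x₀x₁ passes the check even though it is neither convex nor concave, which matches the abstract's remark that such functions are generally nonconvex and nonconcave.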
Nicolas Fraiman, Tzu-Chi Lin, Mariana Olvera-Cravioto
We propose and analyze a mathematical model for the evolution of opinions on directed complex networks. Our model generalizes the popular DeGroot and Friedkin-Johnsen models by allowing vertices to have attributes that may influence the opinion dynamics. We start by establishing sufficient conditions for the existence of a stationary opinion distribution on any fixed graph, and then provide an increasingly detailed characterization of its behavior by considering a sequence of directed random graphs having a local weak limit. Our most explicit results are obtained for graph sequences whose local weak limit is a marked Galton-Watson tree, in which case our model can be used to explain a variety of phenomena, for example, conditions under which consensus can be achieved, mechanisms in which opinions can become polarized, and the effect of disruptive stubborn agents on the formation of opinions.
Funding: This work was supported by the National Science Foundation [Grants NSF-DMS-1929298 and CMMI-2243261].
{"title":"Opinion Dynamics on Directed Complex Networks","authors":"Nicolas Fraiman, Tzu-Chi Lin, Mariana Olvera-Cravioto","doi":"10.1287/moor.2022.0250","DOIUrl":"https://doi.org/10.1287/moor.2022.0250","journal":{"name":"Mathematics of Operations Research","volume":"14 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-04-05","publicationTypes":"Journal Article","isOpenAccess":false,"RegionNum":3,"RegionCategory":"Mathematics"}
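The two classical models named in the opinion-dynamics abstract above can be written as one short iteration. In the Friedkin-Johnsen update, agent i moves to λ_i times the weighted average of its neighbors' opinions plus (1 − λ_i) times its own initial opinion; setting every λ_i = 1 recovers DeGroot averaging. The sketch covers only these classical models, not the generalized model of the paper, and the names `W`, `susceptibility`, and `friedkin_johnsen` are assumptions.

```python
def friedkin_johnsen(W, x0, susceptibility, steps):
    """Iterate x(t+1) = Lambda W x(t) + (I - Lambda) x(0).

    W is a row-stochastic weight matrix (list of rows), x0 the initial
    opinions, and susceptibility[i] in [0, 1] the weight agent i puts on
    its neighbors rather than on its own initial opinion.
    """
    x = list(x0)
    for _ in range(steps):
        x = [
            lam * sum(w * xj for w, xj in zip(row, x)) + (1.0 - lam) * x0_i
            for row, lam, x0_i in zip(W, susceptibility, x0)
        ]
    return x
```

With two agents who weight each other equally, full susceptibility gives consensus after one step, zero susceptibility (fully stubborn agents) freezes the initial opinions, and intermediate values leave a persistent disagreement.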
We consider the first-come-first-serve (FCFS) [Formula: see text] queue and prove the first simple and explicit bounds that scale as [Formula: see text] under only the assumption that interarrival times have finite second moment, and service times have finite [Formula: see text] moment for some [Formula: see text]. Here, ρ denotes the corresponding traffic intensity. Conceptually, our results can be viewed as a multiserver analogue of Kingman’s bound. Our main results are bounds for the tail of the steady-state queue length and the steady-state probability of delay. The strength of our bounds (e.g., in the form of tail decay rate) is a function of how many moments of the service distribution are assumed finite. Our bounds scale gracefully, even when the number of servers grows large and the traffic intensity converges to unity simultaneously, as in the Halfin-Whitt scaling regime. Some of our bounds scale better than [Formula: see text] in certain asymptotic regimes. In these same asymptotic regimes, we also prove bounds for the tail of the steady-state number in service. Our main proofs proceed by explicitly analyzing the bounding process that arises in the stochastic comparison bounds of Gamarnik and Goldberg for multiserver queues. Along the way, we derive several novel results for suprema of random walks and pooled renewal processes, which may be of independent interest. We also prove several additional bounds using drift arguments (which have much smaller prefactors) and point out a conjecture that would imply further related bounds and generalizations. We also show that when all moments of the service distribution are finite and satisfy a mild growth rate assumption, our bounds can be strengthened to yield explicit tail estimates decaying as [Formula: see text], with [Formula: see text], depending on the growth rate of these moments.
Funding: Financial support from the National Science Foundation [Grant 1333457] is gratefully acknowledged.
Supplemental Material: The supplemental appendix is available at https://doi.org/10.1287/moor.2022.0131.
{"title":"Simple and Explicit Bounds for Multiserver Queues with 1/(1−ρ) Scaling","authors":"Yuan Li, David A. Goldberg","doi":"10.1287/moor.2022.0131","DOIUrl":"https://doi.org/10.1287/moor.2022.0131","journal":{"name":"Mathematics of Operations Research","volume":"32 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","isOpenAccess":false,"RegionNum":3,"RegionCategory":"Mathematics"}
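For context on the phrase "multiserver analogue of Kingman's bound" in the abstract above, the classical single-server bound can be stated as follows. This is the textbook GI/GI/1 result, not a formula from the paper, and the symbols are assumptions introduced here: λ is the arrival rate, S a service time, σ_a² and σ_s² the variances of interarrival and service times, and W_q the steady-state waiting time in queue.

```latex
\mathbb{E}[W_q] \;\le\; \frac{\lambda\,(\sigma_a^2 + \sigma_s^2)}{2\,(1-\rho)},
\qquad \rho = \lambda\,\mathbb{E}[S] < 1 .
```

The right-hand side grows like 1/(1−ρ) as the traffic intensity approaches one, which is the scaling the title refers to.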
In this paper, we develop a stochastic algorithm based on the Euler–Maruyama scheme to approximate the invariant measure of the limiting multidimensional diffusion of [Formula: see text] queues in the Halfin–Whitt regime. Specifically, we prove a nonasymptotic error bound between the invariant measures of the approximate model from the algorithm and the limiting diffusion. To establish the error bound, we employ the recently developed Stein’s method for multidimensional diffusions, in which the regularity of Stein’s equation obtained by the partial differential equation (PDE) theory plays a crucial role. We further prove the central limit theorem (CLT) and the moderate deviation principle (MDP) for the occupation measures of the limiting diffusion of [Formula: see text] queues and its Euler–Maruyama scheme. In particular, the variances in the CLT and MDP associated with the limiting diffusion are determined by Stein’s equation and Malliavin calculus, in which properties of a mollified diffusion and an associated weighted occupation time play a crucial role.
Funding: X. Jin is supported in part by the Fundamental Research Funds for the Central Universities [Grants JZ2022HGQA0148 and JZ2023HGTA0170]. G. Pang is supported in part by the U.S. National Science Foundation [Grants DMS-1715875 and DMS-2216765]. L. Xu is supported in part by the National Natural Science Foundation of China [Grant 12071499], Macao Special Administrative Region [Grant FDCT 0090/2019/A2], and the University of Macau [Grant MYRG2018-00133-FST]. This work was supported by U.S. National Science Foundation [Grant DMS-2108683].
{"title":"An Approximation to the Invariant Measure of the Limiting Diffusion of G/Ph/n + GI Queues in the Halfin–Whitt Regime and Related Asymptotics","authors":"Xinghu Jin, Guodong Pang, Lihu Xu, Xin Xu","doi":"10.1287/moor.2021.0241","DOIUrl":"https://doi.org/10.1287/moor.2021.0241","journal":{"name":"Mathematics of Operations Research","volume":"42 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-03-29","publicationTypes":"Journal Article","isOpenAccess":false,"RegionNum":3,"RegionCategory":"Mathematics"}
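The general idea behind the Euler–Maruyama abstract above, simulating a discretized diffusion and averaging a test function along the path to approximate an integral against the invariant measure, can be sketched in one dimension. The sketch assumes a generic scalar diffusion dX = drift(X) dt + σ dW, not the multidimensional queueing diffusion studied in the paper, and all names and parameters are hypothetical.

```python
import math
import random


def euler_maruyama_average(drift, sigma, x0, step, n_steps, test_fn, seed=0):
    """Average test_fn along an Euler-Maruyama path (empirical occupation measure).

    Scheme: X_{k+1} = X_k + drift(X_k) * step + sigma * sqrt(step) * xi_k,
    with xi_k independent standard normal draws.
    """
    rng = random.Random(seed)
    x, total = x0, 0.0
    for _ in range(n_steps):
        total += test_fn(x)
        x = x + drift(x) * step + sigma * math.sqrt(step) * rng.gauss(0.0, 1.0)
    return total / n_steps
```

For the Ornstein-Uhlenbeck choice drift(x) = −x and σ = √2, the invariant law of the continuous diffusion is standard normal, so averaging x² over a long path should give a value near one, with a small bias that depends on the step size.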
Mihail Bazhba, Jose Blanchet, Chang-Han Rhee, Bert Zwart
We prove a sample-path large deviation principle (LDP) with sublinear speed for unbounded functionals of certain Markov chains induced by the Lindley recursion. The LDP holds in the Skorokhod space [Formula: see text] equipped with the [Formula: see text] topology. Our technique hinges on a suitable decomposition of the Markov chain in terms of regeneration cycles. Each regeneration cycle denotes the area accumulated during the busy period of the reflected random walk. We prove a large deviation principle for the area under the busy period of the Markov random walk, and we show that it exhibits a heavy-tailed behavior.
Funding: The research of B. Zwart and M. Bazhba is supported by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek [Grant 639.033.413]. The research of J. Blanchet is supported by the National Science Foundation (NSF) [Grants 1915967, 1820942, and 1838576] as well as the Defense Advanced Research Projects Agency [Grant N660011824028]. The research of C.-H. Rhee is supported by the NSF [Grant CMMI-2146530].
{"title":"Sample-Path Large Deviations for Unbounded Additive Functionals of the Reflected Random Walk","authors":"Mihail Bazhba, Jose Blanchet, Chang-Han Rhee, Bert Zwart","doi":"10.1287/moor.2020.0094","DOIUrl":"https://doi.org/10.1287/moor.2020.0094","journal":{"name":"Mathematics of Operations Research","volume":"1 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","isOpenAccess":false,"RegionNum":3,"RegionCategory":"Mathematics"}
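The objects named in the large-deviations abstract above, the Lindley recursion and the area accumulated during each busy period of the reflected random walk, are easy to compute on a sample path. The sketch below assumes the standard recursion W_{n+1} = max(W_n + X_{n+1}, 0) and treats a busy period as a maximal run of strictly positive values. The function names are illustrative assumptions.

```python
def lindley_path(increments, w0=0.0):
    """Reflected random walk: W_{n+1} = max(W_n + X_{n+1}, 0)."""
    w, path = w0, [w0]
    for x in increments:
        w = max(w + x, 0.0)
        path.append(w)
    return path


def busy_period_areas(path):
    """Sum of W over each maximal run of strictly positive values."""
    areas, current = [], 0.0
    for w in path:
        if w > 0:
            current += w
        elif current > 0:
            areas.append(current)  # walk returned to zero: the cycle is complete
            current = 0.0
    if current > 0:
        areas.append(current)  # busy period still open when the path ends
    return areas
```

Each returned area corresponds to one regeneration cycle in the sense of the abstract: the walk leaves zero, accumulates area while positive, and regenerates when it hits zero again.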
For a subset T of nodes of an undirected graph G, a T-Steiner cut is a cut [Formula: see text] with [Formula: see text] and [Formula: see text]. The T-Steiner cut dominant of G is the dominant [Formula: see text] of the convex hull of the incidence vectors of the T-Steiner cuts of G. For [Formula: see text], this is the well-understood s-t-cut dominant. Choosing T as the set of all nodes of G, we obtain the cut dominant for which an outer description in the space of the original variables is still not known. We prove that for each integer τ, there is a finite set of inequalities such that for every pair (G, T) with [Formula: see text], the nontrivial facet-defining inequalities of [Formula: see text] are the inequalities that can be obtained via iterated applications of two simple operations, starting from that set. In particular, the absolute values of the coefficients and of the right-hand sides in a description of [Formula: see text] by integral inequalities can be bounded from above by a function of [Formula: see text]. For all [Formula: see text], we provide descriptions of [Formula: see text] by facet-defining inequalities, extending the known descriptions of s-t-cut dominants.
{"title":"Steiner Cut Dominants","authors":"Michele Conforti, Volker Kaibel","doi":"10.1287/moor.2022.0280","DOIUrl":"https://doi.org/10.1287/moor.2022.0280","url":null,"abstract":"For a subset T of nodes of an undirected graph G, a T-Steiner cut is a cut [Formula: see text] with [Formula: see text] and [Formula: see text]. The T-Steiner cut dominant of G is the dominant [Formula: see text] of the convex hull of the incidence vectors of the T-Steiner cuts of G. For [Formula: see text], this is the well-understood s-t-cut dominant. Choosing T as the set of all nodes of G, we obtain the cut dominant for which an outer description in the space of the original variables is still not known. We prove that for each integer τ, there is a finite set of inequalities such that for every pair (G, T) with [Formula: see text], the nontrivial facet-defining inequalities of [Formula: see text] are the inequalities that can be obtained via iterated applications of two simple operations, starting from that set. In particular, the absolute values of the coefficients and of the right-hand sides in a description of [Formula: see text] by integral inequalities can be bounded from above by a function of [Formula: see text]. For all [Formula: see text], we provide descriptions of [Formula: see text] by facet-defining inequalities, extending the known descriptions of s-t-cut dominants.","PeriodicalId":49852,"journal":{"name":"Mathematics of Operations Research","volume":"6 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140582186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We study the regret of offline reinforcement learning in an infinite-horizon discounted Markov decision process (MDP). While existing analyses of common approaches, such as fitted Q-iteration (FQI), suggest root-n convergence for regret, empirical behavior exhibits much faster convergence. In this paper, we present a finer regret analysis that exactly characterizes this phenomenon by providing fast rates for the regret convergence. First, we show that given any estimate for the optimal quality function, the regret of the policy it defines converges at a rate given by the exponentiation of the estimate’s pointwise convergence rate, thus speeding up the rate. The level of exponentiation depends on the level of noise in the decision-making problem, rather than the estimation problem. We establish such noise levels for linear and tabular MDPs as examples. Second, we provide new analyses of FQI and Bellman residual minimization to establish the correct pointwise convergence guarantees. As specific cases, our results imply one-over-n rates in linear cases and exponential-in-n rates in tabular cases. We extend our findings to general function approximation by extending our results to regret guarantees based on Lp-convergence rates for estimating the optimal quality function rather than pointwise rates, where L2 guarantees for nonparametric estimation can be ensured under mild conditions. Funding: This work was supported by the Division of Information and Intelligent Systems, National Science Foundation [Grant 1846210].
{"title":"Fast Rates for the Regret of Offline Reinforcement Learning","authors":"Yichun Hu, Nathan Kallus, Masatoshi Uehara","doi":"10.1287/moor.2021.0167","DOIUrl":"https://doi.org/10.1287/moor.2021.0167","url":null,"abstract":"We study the regret of offline reinforcement learning in an infinite-horizon discounted Markov decision process (MDP). While existing analyses of common approaches, such as fitted Q-iteration (FQI), suggest root-n convergence for regret, empirical behavior exhibits much faster convergence. In this paper, we present a finer regret analysis that exactly characterizes this phenomenon by providing fast rates for the regret convergence. First, we show that given any estimate for the optimal quality function, the regret of the policy it defines converges at a rate given by the exponentiation of the estimate’s pointwise convergence rate, thus speeding up the rate. The level of exponentiation depends on the level of noise in the decision-making problem, rather than the estimation problem. We establish such noise levels for linear and tabular MDPs as examples. Second, we provide new analyses of FQI and Bellman residual minimization to establish the correct pointwise convergence guarantees. As specific cases, our results imply one-over-n rates in linear cases and exponential-in-n rates in tabular cases. We extend our findings to general function approximation by extending our results to regret guarantees based on Lp-convergence rates for estimating the optimal quality function rather than pointwise rates, where L2 guarantees for nonparametric estimation can be ensured under mild conditions. Funding: This work was supported by the Division of Information and Intelligent Systems, National Science Foundation [Grant 1846210].","PeriodicalId":49852,"journal":{"name":"Mathematics of Operations Research","volume":"47 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140297959","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
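For readers unfamiliar with fitted Q-iteration, the abstract above can be made concrete with a minimal tabular sketch in Python. It assumes an offline dataset of (state, action, reward, next_state) tuples; in the tabular case the least-squares regression step reduces to averaging the Bellman targets for each (state, action) pair. The names `fitted_q_iteration` and `greedy_policy`, the discount factor, and the toy data are hypothetical, and this is not the estimator analyzed in the paper.

```python
from collections import defaultdict


def fitted_q_iteration(transitions, n_actions, gamma=0.9, n_iters=50):
    """Tabular FQI on offline data given as (s, a, r, s_next) tuples.

    Each iteration regresses the targets r + gamma * max_b Q(s_next, b)
    onto (s, a); with a tabular function class that regression is the
    per-pair sample mean. Pairs never seen in the data keep value 0.
    """
    q = defaultdict(float)  # Q(s, a), initialised to zero
    for _ in range(n_iters):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for s, a, r, s_next in transitions:
            target = r + gamma * max(q[(s_next, b)] for b in range(n_actions))
            sums[(s, a)] += target
            counts[(s, a)] += 1
        q = defaultdict(float, {k: sums[k] / counts[k] for k in sums})
    return q


def greedy_policy(q, states, n_actions):
    """Policy defined by an estimate of the optimal quality function."""
    return {s: max(range(n_actions), key=lambda a: q[(s, a)]) for s in states}


if __name__ == "__main__":
    # Toy one-state problem: action 1 pays reward 1, action 0 pays nothing.
    data = [(0, 0, 0.0, 0), (0, 1, 1.0, 0)]
    q_hat = fitted_q_iteration(data, n_actions=2, gamma=0.5)
    print(q_hat[(0, 0)], q_hat[(0, 1)], greedy_policy(q_hat, [0], 2))
```

The regret discussed in the abstract is the value lost by following `greedy_policy` built from the estimate instead of from the true optimal quality function. In the toy problem the greedy action is already correct once the estimate ranks the two actions properly, even if its values are still inexact.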