Pub Date: 2024-06-20; DOI: 10.1007/s10107-024-02108-w
Aris Daniilidis, David Salas, Sebastián Tapia-García
A classical result of variational analysis, known as the Attouch theorem, establishes an equivalence between epigraphical convergence of a sequence of proper convex lower semicontinuous functions and graphical convergence of the corresponding subdifferential maps, up to a normalization condition that fixes the integration constant. In this work, we show that in finite dimensions and under a mild boundedness assumption, we can replace subdifferentials (sets of vectors) by slopes (scalars, corresponding to the distance of the subdifferentials to zero) and still obtain the same characterization: namely, the epigraphical convergence of functions is equivalent to the epigraphical convergence of their slopes. This surprising result is in line with recent developments on slope determination (Boulmezaoud et al. in SIAM J Optim 28(3):2049–2066, 2018; Pérez-Aros et al. in Math Program 190(1–2):561–583, 2021) and slope sensitivity (Daniilidis and Drusvyatskiy in Proc Am Math Soc 151(11):4751–4756, 2023) for convex functions.
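For reference, the slope described in the abstract (the distance of the subdifferential to zero) can be written out as below. The symbol \(s_f\) and the convention for an empty subdifferential are assumptions of this note and may differ from the paper's notation.

```latex
% Slope of a proper convex lower semicontinuous function f at x:
% the distance from the origin to the subdifferential of f at x.
\[
  s_f(x) \;=\; \operatorname{dist}\bigl(0, \partial f(x)\bigr)
         \;=\; \inf\{\, \|v\| : v \in \partial f(x) \,\},
  \qquad s_f(x) = +\infty \ \text{ if } \partial f(x) = \emptyset .
\]
% Example: f(x) = |x| on the real line has \partial f(0) = [-1, 1], so s_f(0) = 0,
% and \partial f(x) = \{\operatorname{sign}(x)\} for x \neq 0, so s_f(x) = 1 there.
```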
{"title":"A slope generalization of Attouch theorem","authors":"Aris Daniilidis, David Salas, Sebastián Tapia-García","doi":"10.1007/s10107-024-02108-w","DOIUrl":"https://doi.org/10.1007/s10107-024-02108-w","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"21 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141503532","RegionNum":2,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2024-06-20; DOI: 10.1007/s10107-024-02101-3
Zhongzhu Chen, Marcia Fampa, Jon Lee
The best practical techniques for exact solution of instances of the constrained maximum-entropy sampling problem, a discrete-optimization problem arising in the design of experiments, are via a branch-and-bound framework, working with a variety of concave continuous relaxations of the objective function. A standard and computationally-important bound-enhancement technique in this context is (ordinary) scaling, via a single positive parameter. Scaling adjusts the shape of continuous relaxations to reduce the gaps between the upper bounds and the optimal value. We extend this technique to generalized scaling, employing a positive vector of parameters, which allows much more flexibility and thus potentially reduces the gaps further. We give mathematical results aimed at supporting algorithmic methods for computing optimal generalized scalings, and we give computational results demonstrating the performance of generalized scaling on benchmark problem instances.
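A small sketch may help show what ordinary scaling does to the objective. It assumes the usual formulation of maximum-entropy sampling, in which one maximizes the log-determinant of a principal submatrix of a covariance matrix over index sets of a fixed size; the abstract does not spell this out, the side constraints are omitted, and the matrix `C` and all function names are hypothetical.

```python
import math
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def ldet_sub(C, S):
    """log det of the principal submatrix of C indexed by S."""
    return math.log(det([[C[i][j] for j in S] for i in S]))

def mesp_brute_force(C, s):
    """Best index set of size s by enumeration (illustration only, no side constraints)."""
    return max(combinations(range(len(C)), s), key=lambda S: ldet_sub(C, S))

# Hypothetical 3x3 positive definite covariance matrix, chosen for illustration.
C = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 0.2],
     [0.5, 0.2, 2.0]]
s, gamma = 2, 2.5
best = mesp_brute_force(C, s)
scaled = [[gamma * c for c in row] for row in C]
# Scaling C by gamma shifts every objective value by the same constant s * log(gamma).
shift = ldet_sub(scaled, best) - ldet_sub(C, best)
```

Because the shift is the same for every index set, the optimal set does not change under scaling; only the continuous relaxations, and hence the upper bounds used in branch-and-bound, respond to the choice of the scaling parameter.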
{"title":"Generalized scaling for the constrained maximum-entropy sampling problem","authors":"Zhongzhu Chen, Marcia Fampa, Jon Lee","doi":"10.1007/s10107-024-02101-3","DOIUrl":"https://doi.org/10.1007/s10107-024-02101-3","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"25 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141503533","RegionNum":2,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2024-06-13; DOI: 10.1007/s10107-024-02106-y
Arindam Khan, Aditya Subramanian, Andreas Wiese
We study rectangle stabbing problems in which we are given n axis-aligned rectangles in the plane that we want to stab, that is, we want to select line segments such that for each given rectangle there is a line segment that intersects two opposite edges of it. In the horizontal rectangle stabbing problem (Stabbing), the goal is to find a set of horizontal line segments of minimum total length such that all rectangles are stabbed. In the horizontal–vertical stabbing problem (HV-Stabbing), the goal is to find a set of rectilinear (that is, either vertical or horizontal) line segments of minimum total length such that all rectangles are stabbed. Both variants are NP-hard. Chan et al. (ISAAC, 2018) initiated the study of these problems by providing constant approximation algorithms. Recently, Eisenbrand et al. (A QPTAS for stabbing rectangles, 2021) have presented a QPTAS and a polynomial-time 8-approximation algorithm for Stabbing, but it was open whether the problem admits a PTAS. In this paper, we obtain a PTAS for Stabbing, settling this question. For HV-Stabbing, we obtain a \((2+\varepsilon)\)-approximation. We also obtain PTASs for special cases of HV-Stabbing: (i) when all rectangles are squares, (ii) when each rectangle’s width is at most its height, and (iii) when all rectangles are \(\delta\)-large, that is, have at least one edge whose length is at least \(\delta\), while all edge lengths are at most 1. Our result also implies improved approximations for other problems such as generalized minimum Manhattan network.
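The feasibility condition and the objective of the horizontal problem can be sketched as follows. This is only a checker for a candidate solution, not the PTAS; the tuple layouts, the function names, and the choice to treat rectangles as closed (so touching the boundary counts) are assumptions made for illustration.

```python
def stabs(segment, rect):
    """True if a horizontal segment (y, x_left, x_right) crosses both the
    left and the right edge of rect = (x1, y1, x2, y2)."""
    y, xl, xr = segment
    x1, y1, x2, y2 = rect  # lower-left corner (x1, y1), upper-right corner (x2, y2)
    return y1 <= y <= y2 and xl <= x1 and x2 <= xr

def is_feasible(segments, rects):
    """Every rectangle must be stabbed by at least one segment."""
    return all(any(stabs(s, r) for s in segments) for r in rects)

def total_length(segments):
    """Objective of the Stabbing problem: sum of the segment lengths."""
    return sum(xr - xl for _, xl, xr in segments)

# Hypothetical instance: three rectangles and a candidate set of two segments.
rects = [(0, 0, 2, 2), (1, 1, 4, 3), (5, 0, 6, 1)]
solution = [(1.5, 0, 4), (0.5, 5, 6)]
```

Here the first segment stabs the two overlapping rectangles at once, which is where savings over one segment per rectangle come from.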
{"title":"A PTAS for the horizontal rectangle stabbing problem","authors":"Arindam Khan, Aditya Subramanian, Andreas Wiese","doi":"10.1007/s10107-024-02106-y","DOIUrl":"https://doi.org/10.1007/s10107-024-02106-y","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"46 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141503476","RegionNum":2,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2024-06-04; DOI: 10.1007/s10107-024-02088-x
Frank Fischer
We develop a fully asynchronous proximal bundle method for solving non-smooth, convex optimization problems. The algorithm can be used as a drop-in replacement for classic bundle methods, i.e., the function must be given by a first-order oracle for computing function values and subgradients. The algorithm allows for an arbitrary number of master problem processes computing new candidate points and oracle processes evaluating functions at those candidate points. These processes share information by communication with a single supervisor process that resembles the main loop of a classic bundle method. All processes run in parallel and no explicit synchronization step is required. Instead, the asynchronous and possibly outdated results of the oracle computations can be seen as an inexact function oracle. Hence, we show the convergence of our method under weak assumptions very similar to inexact and incremental bundle methods. In particular, we show how the algorithm learns important structural properties of the functions to control the inaccuracy induced by the asynchronicity automatically such that overall convergence can be guaranteed.
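To make the first-order oracle concrete, here is a minimal sketch of the cutting-plane model that classic bundle methods build from oracle answers. It does not implement the asynchronous algorithm or the proximal master problem; the test function and all names are hypothetical.

```python
def oracle(x):
    """First-order oracle for the hypothetical test function
    f(x) = |x - 1| + 0.5 * x**2: returns f(x) and one subgradient."""
    value = abs(x - 1.0) + 0.5 * x * x
    # At the kink x == 1 any value in [0, 2] is a subgradient; this picks -1 + x = 0.
    subgrad = (1.0 if x > 1.0 else -1.0) + x
    return value, subgrad

def cutting_plane_model(bundle, x):
    """Maximum of the linearizations f(x_i) + g_i * (x - x_i) stored in the bundle."""
    return max(fx + g * (x - xi) for xi, fx, g in bundle)

# Collect oracle answers at a few candidate points.
bundle = []
for xi in (-1.0, 0.5, 2.0):
    fx, g = oracle(xi)
    bundle.append((xi, fx, g))
```

For a convex function the model never exceeds the true function and agrees with it at the points already evaluated. In the asynchronous setting of the abstract, oracle answers may arrive late, which is why they are treated there as an inexact oracle.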
{"title":"An asynchronous proximal bundle method","authors":"Frank Fischer","doi":"10.1007/s10107-024-02088-x","DOIUrl":"https://doi.org/10.1007/s10107-024-02088-x","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"13 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141251916","RegionNum":2,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2024-06-04; DOI: 10.1007/s10107-024-02102-2
Jasper van Doornmalen, Christopher Hojny
Handling symmetries in optimization problems is essential for devising efficient solution methods. In this article, we present a general framework that captures many of the existing symmetry handling methods. While these methods are mostly discussed independently of each other, our framework allows different methods to be applied simultaneously, thereby outperforming their individual effects. Moreover, most existing symmetry handling methods only apply to binary variables. Our framework makes it easy to generalize these methods to general variable types. Numerical experiments confirm that our novel framework is superior to the state-of-the-art symmetry handling methods as implemented in the solver SCIP on a broad set of instances.
{"title":"A unified framework for symmetry handling","authors":"Jasper van Doornmalen, Christopher Hojny","doi":"10.1007/s10107-024-02102-2","DOIUrl":"https://doi.org/10.1007/s10107-024-02102-2","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"33 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141251921","RegionNum":2,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2024-06-04; DOI: 10.1007/s10107-024-02100-4
Naoki Marumo, Akiko Takeda
We propose a new first-order method for minimizing nonconvex functions with Lipschitz continuous gradients and Hölder continuous Hessians. The proposed algorithm is a heavy-ball method equipped with two particular restart mechanisms. It finds a solution where the gradient norm is less than \(\varepsilon\) in \(O(H_{\nu}^{\frac{1}{2 + 2\nu}} \varepsilon^{-\frac{4 + 3\nu}{2 + 2\nu}})\) function and gradient evaluations, where \(\nu \in [0, 1]\) and \(H_{\nu}\) are the Hölder exponent and constant, respectively. This complexity result covers the classical bound of \(O(\varepsilon^{-2})\) for \(\nu = 0\) and the state-of-the-art bound of \(O(\varepsilon^{-7/4})\) for \(\nu = 1\). Our algorithm is \(\nu\)-independent and thus universal; it automatically achieves the above complexity bound with the optimal \(\nu \in [0, 1]\) without knowledge of \(H_{\nu}\). In addition, the algorithm does not require other problem-dependent parameters as input, including the gradient’s Lipschitz constant or the target accuracy \(\varepsilon\). Numerical results illustrate that the proposed method is promising.
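The two special cases quoted in the abstract follow directly from the general exponent, which the short check below evaluates in exact arithmetic. The function names are hypothetical; the formulas are the exponents of the accuracy and of the Hölder constant as stated in the bound.

```python
from fractions import Fraction

def epsilon_exponent(nu):
    """Exponent p such that the bound scales as epsilon**(-p):
    p = (4 + 3*nu) / (2 + 2*nu)."""
    nu = Fraction(nu)
    return (4 + 3 * nu) / (2 + 2 * nu)

def holder_constant_exponent(nu):
    """Exponent of the Hölder constant H_nu in the bound: 1 / (2 + 2*nu)."""
    nu = Fraction(nu)
    return 1 / (2 + 2 * nu)

# The endpoints of the Hölder range recover the two bounds named in the abstract.
p_at_zero = epsilon_exponent(0)
p_at_one = epsilon_exponent(1)
```

The exponent decreases as the Hölder exponent grows, so a smoother Hessian gives a better dependence on the target accuracy, interpolating between the two endpoint bounds.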
{"title":"Universal heavy-ball method for nonconvex optimization under Hölder continuous Hessians","authors":"Naoki Marumo, Akiko Takeda","doi":"10.1007/s10107-024-02100-4","DOIUrl":"https://doi.org/10.1007/s10107-024-02100-4","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"51 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141251840","RegionNum":2,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2024-06-04; DOI: 10.1007/s10107-024-02091-2
A. S. Lewis, Tonghua Tian
A central tool for understanding first-order optimization algorithms is the Kurdyka–Łojasiewicz inequality. Standard approaches to such methods rely crucially on this inequality to leverage sufficient decrease conditions involving gradients or subgradients. However, the KL property fundamentally concerns not subgradients but rather “slope”, a purely metric notion. By highlighting this view, and avoiding any use of subgradients, we present a simple and concise complexity analysis for first-order optimization algorithms on metric spaces. This subgradient-free perspective also frames a short and focused proof of the KL property for nonsmooth semi-algebraic functions.
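For readers unfamiliar with the metric notion of slope, its standard definition and a slope form of the KL inequality are sketched below. The notation, including the desingularizing function \(\varphi\) and the constant \(\eta\), follows common usage and is an assumption here; the paper's exact statement may differ.

```latex
% Slope of f at a point x of a metric space (X, d), where f(x) is finite:
\[
  |\nabla f|(x) \;=\; \limsup_{y \to x,\; y \neq x}
  \frac{\max\{f(x) - f(y),\, 0\}}{d(x, y)} .
\]
% A slope form of the KL inequality near a point \bar{x}, with a desingularizing
% function \varphi (concave, increasing, \varphi(0) = 0) and some \eta > 0:
\[
  \varphi'\bigl(f(x) - f(\bar{x})\bigr)\, |\nabla f|(x) \;\ge\; 1
  \quad \text{for } x \text{ near } \bar{x} \text{ with }
  f(\bar{x}) < f(x) < f(\bar{x}) + \eta .
\]
```

No gradient or subgradient appears in either formula, which is what makes an analysis on general metric spaces possible.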
{"title":"The complexity of first-order optimization methods from a metric perspective","authors":"A. S. Lewis, Tonghua Tian","doi":"10.1007/s10107-024-02091-2","DOIUrl":"https://doi.org/10.1007/s10107-024-02091-2","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"71 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141252287","RegionNum":2,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2024-05-31; DOI: 10.1007/s10107-024-02095-y
Leon Bungert, Tim Roith, Philipp Wacker
In this paper we propose polarized consensus-based dynamics in order to make consensus-based optimization (CBO) and sampling (CBS) applicable for objective functions with several global minima or distributions with many modes, respectively. For this, we “polarize” the dynamics with a localizing kernel, and the resulting model can be viewed as a bounded confidence model for opinion formation in the presence of a common objective. Instead of being attracted to a common weighted mean as in the original consensus-based methods, which prevents the detection of more than one minimum or mode, in our method every particle is attracted to a weighted mean which gives more weight to nearby particles. We prove that in the mean-field regime the polarized CBS dynamics are unbiased for Gaussian targets. We also prove that in the zero temperature limit and for sufficiently well-behaved strongly convex objectives the solution of the Fokker–Planck equation converges in the Wasserstein-2 distance to a Dirac measure at the minimizer. Finally, we propose a computationally more efficient generalization which works with a predefined number of clusters and improves upon our polarized baseline method for high-dimensional optimization.
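The idea of a particle-dependent weighted mean can be sketched in one dimension as follows. The Gaussian form of the localizing kernel, the exponential weights on the objective, the parameter names `beta` and `kappa`, and the objective `V` are all assumptions made for illustration, not the paper's exact model.

```python
import math

def polarized_mean(i, xs, V, beta=1.0, kappa=1.0):
    """Weighted mean seen by particle i. Each particle j gets the weight
    exp(-(x_i - x_j)**2 / (2 * kappa**2)) * exp(-beta * V(x_j)),
    so nearby particles with low objective value count the most."""
    w = [math.exp(-(xs[i] - xj) ** 2 / (2 * kappa ** 2) - beta * V(xj)) for xj in xs]
    return sum(wj * xj for wj, xj in zip(w, xs)) / sum(w)

def V(x):
    """Hypothetical objective with two global minima, at x = -2 and x = 2."""
    return (x * x - 4.0) ** 2

# Two groups of particles, one near each minimum.
xs = [-2.5, -1.8, -2.1, 1.9, 2.2, 2.4]
means = [polarized_mean(i, xs, V, beta=1.0, kappa=0.5) for i in range(len(xs))]
# A very wide kernel removes the localization: all particles see one common mean.
common = [polarized_mean(i, xs, V, beta=1.0, kappa=1e6) for i in range(len(xs))]
```

With the narrow kernel, the particles near each minimum are pulled towards their own group's mean, so both minima can be detected; with the wide kernel every particle is pulled to the same point, which is the behavior of the original consensus-based methods.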
{"title":"Polarized consensus-based dynamics for optimization and sampling","authors":"Leon Bungert, Tim Roith, Philipp Wacker","doi":"10.1007/s10107-024-02095-y","DOIUrl":"https://doi.org/10.1007/s10107-024-02095-y","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"239 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141189696","RegionNum":2,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2024-05-27; DOI: 10.1007/s10107-024-02097-w
Gonzalo Muñoz, David Salas, Anton Svensson
We study linear bilevel programming problems whose lower-level objective is given by a random cost vector with known distribution. We consider the case where this distribution is nonatomic, which allows the leader's problem to be reformulated using the Bayesian approach in the sense of Salas and Svensson (SIAM J Optim 33(3):2311–2340, 2023), with a decision-dependent distribution that concentrates on the vertices of the feasible set of the follower’s problem. We call this a vertex-supported belief. We prove that this formulation is piecewise affine over the so-called chamber complex of the feasible set of the high-point relaxation. We propose two algorithmic approaches to solve general problems enjoying this last property. The first one is based on enumerating the vertices of the chamber complex. This approach is not scalable, but we present it as a computational baseline and for its theoretical interest. The second one is a Monte-Carlo approximation scheme based on the fact that randomly drawn points of the domain lie, with probability 1, in the interior of full-dimensional chambers, where the problem (restricted to this chamber) can be reduced to a linear program. Finally, we evaluate these methods through computational experiments showing both approaches’ advantages and challenges.
{"title":"Exploiting the polyhedral geometry of stochastic linear bilevel programming","authors":"Gonzalo Muñoz, David Salas, Anton Svensson","doi":"10.1007/s10107-024-02097-w","DOIUrl":"https://doi.org/10.1007/s10107-024-02097-w","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"70 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-05-27","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141171635","RegionNum":2,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2024-05-27; DOI: 10.1007/s10107-024-02099-8
Amitabh Basu, Hongyi Jiang, Phillip Kerger, Marco Molinaro
We investigate the information complexity of mixed-integer convex optimization under different types of oracles. We establish new lower bounds for the standard first-order oracle, improving upon the previous best known lower bound. This leaves only a lower order linear term (in the dimension) as the gap between the lower and upper bounds. This is derived as a corollary of a more fundamental “transfer” result that shows how lower bounds on information complexity of continuous convex optimization under different oracles can be transferred to the mixed-integer setting in a black-box manner. Further, we (to the best of our knowledge) initiate the study of, and obtain the first set of results on, information complexity under oracles that only reveal partial first-order information, e.g., where one can only make a binary query over the function value or subgradient at a given point. We give algorithms for (mixed-integer) convex optimization that work under these less informative oracles. We also give lower bounds showing that, for some of these oracles, every algorithm requires more iterations to achieve a target error compared to when complete first-order information is available. That is, these oracles are provably less informative than full first-order oracles for the purpose of optimization.
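As a rough illustration of why a binary oracle is less informative than a full one, the sketch below recovers a single function value by bisection on yes/no answers. The threshold form of the query, the assumption that bounds on the value are known in advance, and all names are hypothetical; the oracle models studied in the paper may be defined differently.

```python
def binary_value_oracle(f, x, t):
    """Answers one yes/no question about the function value: is f(x) <= t?"""
    return f(x) <= t

def estimate_value(f, x, lo, hi, tol=1e-6):
    """Bisection on the threshold t, assuming lo <= f(x) <= hi is known.
    Returns an estimate of f(x) and the number of binary queries used."""
    queries = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        queries += 1
        if binary_value_oracle(f, x, mid):
            hi = mid  # f(x) lies in [lo, mid]
        else:
            lo = mid  # f(x) lies in (mid, hi]
    return 0.5 * (lo + hi), queries

def f(x):
    """Hypothetical convex test function."""
    return (x - 1.0) ** 2 + 2.0

estimate, queries = estimate_value(f, 3.0, lo=0.0, hi=100.0)
```

A full first-order oracle returns the value in one call, whereas here the number of queries grows like the logarithm of the initial interval length divided by the tolerance, which gives an intuition for the extra iterations mentioned in the abstract.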
{"title":"Information complexity of mixed-integer convex optimization","authors":"Amitabh Basu, Hongyi Jiang, Phillip Kerger, Marco Molinaro","doi":"10.1007/s10107-024-02099-8","DOIUrl":"https://doi.org/10.1007/s10107-024-02099-8","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"21 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-05-27","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141171634","RegionNum":2,"RegionCategory":"Mathematics","Total":0}