Mathematical Programming: Latest Articles
A slope generalization of Attouch theorem
IF 2.7 · Zone 2 (Mathematics) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-20 · DOI: 10.1007/s10107-024-02108-w
Aris Daniilidis, David Salas, Sebastián Tapia-García

A classical result of variational analysis, known as the Attouch theorem, establishes an equivalence between epigraphical convergence of a sequence of proper convex lower semicontinuous functions and graphical convergence of the corresponding subdifferential maps, up to a normalization condition which fixes the integration constant. In this work, we show that in finite dimensions and under a mild boundedness assumption, we can replace subdifferentials (sets of vectors) by slopes (scalars, corresponding to the distance of the subdifferentials to zero) and still obtain the same characterization: namely, the epigraphical convergence of functions is equivalent to the epigraphical convergence of their slopes. This surprising result is in line with recent developments on slope determination (Boulmezaoud et al. in SIAM J Optim 28(3):2049–2066, 2018; Pérez-Aros et al. in Math Program 190(1–2):561–583, 2021) and slope sensitivity (Daniilidis and Drusvyatskiy in Proc Am Math Soc 151(11):4751–4756, 2023) for convex functions.
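
To make the notion of slope in the abstract concrete, the following block writes out the distance-to-zero description and works it through for the absolute value function. The symbol s_f is notation assumed here for illustration and is not taken from the paper.

```latex
% Slope of a proper convex lower semicontinuous function f at a point x
% (s_f is illustrative notation, not necessarily the authors').
s_f(x) \;=\; \operatorname{dist}\bigl(0, \partial f(x)\bigr)
       \;=\; \inf\bigl\{\, \|v\| : v \in \partial f(x) \,\bigr\},
\qquad s_f(x) = +\infty \ \text{ if } \ \partial f(x) = \emptyset .

% Worked example: f(x) = |x| on the real line.
\partial f(x) =
\begin{cases}
  \{-1\},  & x < 0, \\
  [-1, 1], & x = 0, \\
  \{1\},   & x > 0,
\end{cases}
\qquad\Longrightarrow\qquad
s_f(x) =
\begin{cases}
  0, & x = 0, \\
  1, & x \neq 0 .
\end{cases}
```

The example shows how much information is discarded when passing from the set-valued subdifferential to the scalar slope: the sign of the subgradient is lost away from the origin.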

Citations: 0
Generalized scaling for the constrained maximum-entropy sampling problem
IF 2.7 · Zone 2 (Mathematics) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-20 · DOI: 10.1007/s10107-024-02101-3
Zhongzhu Chen, Marcia Fampa, Jon Lee

The best practical techniques for exact solution of instances of the constrained maximum-entropy sampling problem, a discrete-optimization problem arising in the design of experiments, are via a branch-and-bound framework, working with a variety of concave continuous relaxations of the objective function. A standard and computationally-important bound-enhancement technique in this context is (ordinary) scaling, via a single positive parameter. Scaling adjusts the shape of continuous relaxations to reduce the gaps between the upper bounds and the optimal value. We extend this technique to generalized scaling, employing a positive vector of parameters, which allows much more flexibility and thus potentially reduces the gaps further. We give mathematical results aimed at supporting algorithmic methods for computing optimal generalized scalings, and we give computational results demonstrating the performance of generalized scaling on benchmark problem instances.
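
As a rough illustration of why a single positive scaling parameter yields a family of valid upper bounds, the block below records the elementary log-determinant identity behind ordinary scaling. The notation (C, S, s, U, z) is assumed here and not taken from the abstract, and the vector identity on the last line is only a generic linear-algebra fact; whether it matches the authors' exact construction of generalized scaling is not stated in the abstract.

```latex
% Assumed notation: C is an n x n symmetric positive definite covariance matrix,
% S is an index set with |S| = s, C[S,S] is the principal submatrix,
% ldet = log det, z(C) is the optimal value, U(.) is any valid upper bound.

% Ordinary scaling with a scalar \gamma > 0:
\operatorname{ldet}\bigl((\gamma C)[S,S]\bigr)
  \;=\; s \log \gamma + \operatorname{ldet}\bigl(C[S,S]\bigr).

% Consequently every \gamma > 0 gives a valid bound for the original instance,
% and one may search for the \gamma with the smallest gap:
z(C) \;\le\; U(\gamma C) - s \log \gamma .

% A vector analogue (illustrative only), with D = \operatorname{Diag}(d), d > 0:
\operatorname{ldet}\bigl((D C D)[S,S]\bigr)
  \;=\; 2 \sum_{i \in S} \log d_i + \operatorname{ldet}\bigl(C[S,S]\bigr).
```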

Citations: 0
A PTAS for the horizontal rectangle stabbing problem
IF 2.7 · Zone 2 (Mathematics) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-13 · DOI: 10.1007/s10107-024-02106-y
Arindam Khan, Aditya Subramanian, Andreas Wiese

We study rectangle stabbing problems in which we are given n axis-aligned rectangles in the plane that we want to stab, that is, we want to select line segments such that for each given rectangle there is a line segment that intersects two opposite edges of it. In the horizontal rectangle stabbing problem (Stabbing), the goal is to find a set of horizontal line segments of minimum total length such that all rectangles are stabbed. In the horizontal–vertical stabbing problem (HV-Stabbing), the goal is to find a set of rectilinear (that is, either vertical or horizontal) line segments of minimum total length such that all rectangles are stabbed. Both variants are NP-hard. Chan et al. (ISAAC, 2018) initiated the study of these problems by providing constant approximation algorithms. Recently, Eisenbrand et al. (A QPTAS for stabbing rectangles, 2021) have presented a QPTAS and a polynomial-time 8-approximation algorithm for Stabbing, but it was open whether the problem admits a PTAS. In this paper, we obtain a PTAS for Stabbing, settling this question. For HV-Stabbing, we obtain a \((2+\varepsilon)\)-approximation. We also obtain PTASs for special cases of HV-Stabbing: (i) when all rectangles are squares, (ii) when each rectangle’s width is at most its height, and (iii) when all rectangles are \(\delta\)-large, that is, have at least one edge whose length is at least \(\delta\), while all edge lengths are at most 1. Our result also implies improved approximations for other problems such as generalized minimum Manhattan network.
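
The definition of stabbing in the Stabbing variant can be sketched as a small feasibility checker. This is only a minimal illustration of the problem statement, not of the PTAS; the tuple layouts, the function names, and the choice to treat rectangle boundaries as closed are all assumptions made here.

```python
from typing import List, Tuple

# Rectangle as (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed layout).
Rect = Tuple[float, float, float, float]
# Horizontal segment as (y, a, b): all points (x, y) with a <= x <= b (assumed layout).
Segment = Tuple[float, float, float]


def stabs(seg: Segment, rect: Rect) -> bool:
    """True if the horizontal segment meets both the left and right edge of rect."""
    y, a, b = seg
    x1, y1, x2, y2 = rect
    # Closed boundaries are an assumption of this sketch.
    return y1 <= y <= y2 and a <= x1 and x2 <= b


def is_feasible(segments: List[Segment], rects: List[Rect]) -> bool:
    """Every rectangle must be stabbed by at least one segment."""
    return all(any(stabs(s, r) for s in segments) for r in rects)


def total_length(segments: List[Segment]) -> float:
    """Objective value of a Stabbing solution: sum of segment lengths."""
    return sum(b - a for _, a, b in segments)


rects = [(0, 0, 2, 1), (1, 0.5, 3, 2), (4, 0, 5, 1)]
solution = [(0.75, 0, 3), (0.5, 4, 5)]
print(is_feasible(solution, rects), total_length(solution))
```

In this toy instance one segment at height 0.75 stabs the first two overlapping rectangles at once, which is the kind of sharing that makes the minimum-length objective nontrivial.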

Citations: 0
An asynchronous proximal bundle method
IF 2.7 · Zone 2 (Mathematics) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-04 · DOI: 10.1007/s10107-024-02088-x
Frank Fischer

We develop a fully asynchronous proximal bundle method for solving non-smooth, convex optimization problems. The algorithm can be used as a drop-in replacement for classic bundle methods, i.e., the function must be given by a first-order oracle for computing function values and subgradients. The algorithm allows for an arbitrary number of master problem processes computing new candidate points and oracle processes evaluating functions at those candidate points. These processes share information by communication with a single supervisor process that resembles the main loop of a classic bundle method. All processes run in parallel and no explicit synchronization step is required. Instead, the asynchronous and possibly outdated results of the oracle computations can be seen as an inexact function oracle. Hence, we show the convergence of our method under weak assumptions very similar to inexact and incremental bundle methods. In particular, we show how the algorithm learns important structural properties of the functions to control the inaccuracy induced by the asynchronicity automatically such that overall convergence can be guaranteed.

Citations: 0
A unified framework for symmetry handling
IF 2.7 · Zone 2 (Mathematics) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-04 · DOI: 10.1007/s10107-024-02102-2
Jasper van Doornmalen, Christopher Hojny

Handling symmetries in optimization problems is essential for devising efficient solution methods. In this article, we present a general framework that captures many of the already existing symmetry handling methods. While these methods are mostly discussed independently of each other, our framework allows different methods to be applied simultaneously, thus outperforming their individual effects. Moreover, most existing symmetry handling methods only apply to binary variables. Our framework allows these methods to be generalized easily to general variable types. Numerical experiments confirm that our novel framework is superior to the state-of-the-art symmetry handling methods as implemented in the solver SCIP on a broad set of instances.

Citations: 0
Universal heavy-ball method for nonconvex optimization under Hölder continuous Hessians
IF 2.7 · Zone 2 (Mathematics) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-04 · DOI: 10.1007/s10107-024-02100-4
Naoki Marumo, Akiko Takeda

We propose a new first-order method for minimizing nonconvex functions with Lipschitz continuous gradients and Hölder continuous Hessians. The proposed algorithm is a heavy-ball method equipped with two particular restart mechanisms. It finds a solution where the gradient norm is less than \(\varepsilon\) in \(O(H_{\nu}^{\frac{1}{2 + 2\nu}} \varepsilon^{-\frac{4 + 3\nu}{2 + 2\nu}})\) function and gradient evaluations, where \(\nu \in [0, 1]\) and \(H_{\nu}\) are the Hölder exponent and constant, respectively. This complexity result covers the classical bound of \(O(\varepsilon^{-2})\) for \(\nu = 0\) and the state-of-the-art bound of \(O(\varepsilon^{-7/4})\) for \(\nu = 1\). Our algorithm is \(\nu\)-independent and thus universal; it automatically achieves the above complexity bound with the optimal \(\nu \in [0, 1]\) without knowledge of \(H_{\nu}\). In addition, the algorithm does not require other problem-dependent parameters as input, including the gradient’s Lipschitz constant or the target accuracy \(\varepsilon\). Numerical results illustrate that the proposed method is promising.
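
The sketch below illustrates two things from the abstract under stated assumptions. First, `complexity_exponent` evaluates the exponent of 1/ε in the stated bound, so the two special cases can be checked by hand. Second, `heavy_ball_with_restart` is a generic textbook heavy-ball iteration with a naive function-value restart; it is not the authors' algorithm, whose two restart mechanisms and parameter-free step rules are not described in the abstract. The fixed step size, the momentum value, and the convex quadratic test function are all hypothetical choices made for predictability.

```python
import math


def complexity_exponent(nu: float) -> float:
    """Exponent of 1/epsilon in the stated bound: (4 + 3*nu) / (2 + 2*nu)."""
    return (4.0 + 3.0 * nu) / (2.0 + 2.0 * nu)


def heavy_ball_with_restart(f, grad, x0, step=0.05, momentum=0.5,
                            tol=1e-8, max_iter=5000):
    """Generic heavy-ball iteration with a naive restart (illustrative only)."""
    x_prev = list(x0)
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        # Heavy-ball candidate: gradient step plus momentum term.
        cand = [xi - step * gi + momentum * (xi - pi)
                for xi, gi, pi in zip(x, g, x_prev)]
        if f(cand) > f(x):
            # Naive restart (an assumption of this sketch): discard the
            # momentum for this step and take a plain gradient step instead.
            cand = [xi - step * gi for xi, gi in zip(x, g)]
        x_prev, x = x, cand
    return x


# Hypothetical test problem: a simple convex quadratic.
def f(x):
    return x[0] ** 2 + 5.0 * x[1] ** 2


def grad(x):
    return [2.0 * x[0], 10.0 * x[1]]


x_final = heavy_ball_with_restart(f, grad, [3.0, -2.0])
print(complexity_exponent(0.0), complexity_exponent(1.0), x_final)
```

Evaluating the exponent at ν = 0 and ν = 1 gives 2 and 7/4, matching the two bounds quoted in the abstract.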

Citations: 0
The complexity of first-order optimization methods from a metric perspective
IF 2.7 · Zone 2 (Mathematics) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-04 · DOI: 10.1007/s10107-024-02091-2
A. S. Lewis, Tonghua Tian

A central tool for understanding first-order optimization algorithms is the Kurdyka–Łojasiewicz inequality. Standard approaches to such methods rely crucially on this inequality to leverage sufficient decrease conditions involving gradients or subgradients. However, the KL property fundamentally concerns not subgradients but rather “slope”, a purely metric notion. By highlighting this view, and avoiding any use of subgradients, we present a simple and concise complexity analysis for first-order optimization algorithms on metric spaces. This subgradient-free perspective also frames a short and focused proof of the KL property for nonsmooth semi-algebraic functions.

Citations: 0
Polarized consensus-based dynamics for optimization and sampling
IF 2.7 · Zone 2 (Mathematics) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-05-31 · DOI: 10.1007/s10107-024-02095-y
Leon Bungert, Tim Roith, Philipp Wacker

In this paper we propose polarized consensus-based dynamics in order to make consensus-based optimization (CBO) and sampling (CBS) applicable to objective functions with several global minima or distributions with many modes, respectively. For this, we “polarize” the dynamics with a localizing kernel and the resulting model can be viewed as a bounded confidence model for opinion formation in the presence of a common objective. Instead of being attracted to a common weighted mean as in the original consensus-based methods, which prevents the detection of more than one minimum or mode, in our method every particle is attracted to a weighted mean which gives more weight to nearby particles. We prove that in the mean-field regime the polarized CBS dynamics are unbiased for Gaussian targets. We also prove that in the zero temperature limit and for sufficiently well-behaved strongly convex objectives the solution of the Fokker–Planck equation converges in the Wasserstein-2 distance to a Dirac measure at the minimizer. Finally, we propose a computationally more efficient generalization which works with a predefined number of clusters and improves upon our polarized baseline method for high-dimensional optimization.
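
The difference between a common weighted mean and a particle-dependent one can be sketched in a few lines. This is a minimal one-dimensional illustration, assuming a Gaussian localizing kernel with width `kappa`, Gibbs-type weights with inverse temperature `beta`, and a deterministic drift step without noise; none of these specific choices or names come from the abstract, and the paper's actual dynamics may differ.

```python
import math


def objective(x: float) -> float:
    """Hypothetical test objective with two global minima, at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def common_mean(xs, beta=10.0):
    """Single weighted mean shared by all particles (standard consensus idea)."""
    weights = [math.exp(-beta * objective(xj)) for xj in xs]
    return sum(w * xj for w, xj in zip(weights, xs)) / sum(weights)


def polarized_means(xs, beta=10.0, kappa=0.3):
    """One weighted mean per particle; a Gaussian kernel (assumed) favors neighbors."""
    means = []
    for xi in xs:
        weights = [
            math.exp(-((xi - xj) ** 2) / (2.0 * kappa ** 2))
            * math.exp(-beta * objective(xj))
            for xj in xs
        ]
        means.append(sum(w * xj for w, xj in zip(weights, xs)) / sum(weights))
    return means


def drift_step(xs, dt=0.1, beta=10.0, kappa=0.3):
    """Noise-free drift toward each particle's own mean (simplification)."""
    means = polarized_means(xs, beta, kappa)
    return [x - dt * (x - m) for x, m in zip(xs, means)]


particles = [-1.2, -0.9, 0.8, 1.1]
means = polarized_means(particles)
print(common_mean(particles), means)
```

With the kernel in place, the two particles left of the origin are pulled toward a mean near -1 and the two on the right toward a mean near +1, whereas the common mean is a single point that all particles would share.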

Citations: 0
Exploiting the polyhedral geometry of stochastic linear bilevel programming
IF 2.7 · Zone 2 (Mathematics) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-05-27 · DOI: 10.1007/s10107-024-02097-w
Gonzalo Muñoz, David Salas, Anton Svensson

We study linear bilevel programming problems whose lower-level objective is given by a random cost vector with known distribution. We consider the case where this distribution is nonatomic, which allows the leader's problem to be reformulated using the Bayesian approach in the sense of Salas and Svensson (SIAM J Optim 33(3):2311–2340, 2023), with a decision-dependent distribution that concentrates on the vertices of the feasible set of the follower’s problem. We call this a vertex-supported belief. We prove that this formulation is piecewise affine over the so-called chamber complex of the feasible set of the high-point relaxation. We propose two algorithmic approaches to solve general problems enjoying this last property. The first one is based on enumerating the vertices of the chamber complex. This approach is not scalable, but we present it as a computational baseline and for its theoretical interest. The second one is a Monte-Carlo approximation scheme based on the fact that randomly drawn points of the domain lie, with probability 1, in the interior of full-dimensional chambers, where the problem (restricted to this chamber) can be reduced to a linear program. Finally, we evaluate these methods through computational experiments showing both approaches’ advantages and challenges.

Citations: 0
Information complexity of mixed-integer convex optimization
IF 2.7 Zone 2 (Mathematics) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-05-27 DOI: 10.1007/s10107-024-02099-8
Amitabh Basu, Hongyi Jiang, Phillip Kerger, Marco Molinaro

We investigate the information complexity of mixed-integer convex optimization under different types of oracles. We establish new lower bounds for the standard first-order oracle, improving upon the previous best known lower bound. This leaves only a lower order linear term (in the dimension) as the gap between the lower and upper bounds. This is derived as a corollary of a more fundamental “transfer” result that shows how lower bounds on information complexity of continuous convex optimization under different oracles can be transferred to the mixed-integer setting in a black-box manner. Further, we (to the best of our knowledge) initiate the study of, and obtain the first set of results on, information complexity under oracles that only reveal partial first-order information, e.g., where one can only make a binary query over the function value or subgradient at a given point. We give algorithms for (mixed-integer) convex optimization that work under these less informative oracles. We also give lower bounds showing that, for some of these oracles, every algorithm requires more iterations to achieve a target error compared to when complete first-order information is available. That is, these oracles are provably less informative than full first-order oracles for the purpose of optimization.

Citations: 0