Pub Date: 2024-01-20 DOI: 10.1007/s10107-023-02047-y
Soroosh Shafiee, Fatma Kılınç-Karzan
Optimization problems involving minimization of a rank-one convex function over constraints modeling restrictions on the support of the decision variables emerge in various machine learning applications. These problems are often modeled with indicator variables for identifying the support of the continuous variables. In this paper we investigate compact extended formulations for such problems through perspective reformulation techniques. In contrast to the majority of previous work that relies on support function arguments and disjunctive programming techniques to provide convex hull results, we propose a constructive approach that exploits a hidden conic structure induced by perspective functions. To this end, we first establish a convex hull result for a general conic mixed-binary set in which each conic constraint involves a linear function of independent continuous variables and a set of binary variables. We then demonstrate that extended representations of sets associated with epigraphs of rank-one convex functions over constraints modeling indicator relations naturally admit such a conic representation. This enables us to systematically give perspective formulations for the convex hull descriptions of these sets with nonlinear separable or non-separable objective functions, sign constraints on continuous variables, and combinatorial constraints on indicator variables. We illustrate the efficacy of our results on sparse nonnegative logistic regression problems.
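As a rough illustration of the perspective reformulation idea mentioned above, the following sketch evaluates the closed perspective of a scalar convex function. It assumes a superlinear function with f(0) = 0 (here a quadratic), and the names `perspective` and `square` are hypothetical; it does not reproduce the paper's conic construction for rank-one functions.

```python
def perspective(f, x, z):
    """Closed perspective z * f(x / z) of a convex f with f(0) = 0.

    Assumption: f grows superlinearly (e.g. a quadratic), so the value at
    z = 0 is 0 when x = 0 and +infinity otherwise.
    """
    if z > 0:
        return z * f(x / z)
    return 0.0 if x == 0 else float("inf")


def square(t):
    return t * t


# z = 1 (variable switched on): the perspective equals f(x).
on_value = perspective(square, 2.0, 1.0)
# z = 0 (variable switched off): only x = 0 has finite cost.
off_value = perspective(square, 0.0, 0.0)
# Fractional z, as seen in a continuous relaxation: x**2 / z.
relaxed_value = perspective(square, 1.0, 0.5)
```

With a binary indicator z, the perspective agrees with f when z = 1 and forces x = 0 when z = 0, which is why it is a natural building block for convex hull descriptions of such sets.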
Title: Constrained optimization of rank-one functions with indicator variables. Journal: Mathematical Programming.
Pub Date: 2024-01-20 DOI: 10.1007/s10107-023-02048-x
Marcin Briański, Martin Koutecký, Daniel Král’, Kristýna Pekárková, Felix Schröder
An intensive line of research on fixed parameter tractability of integer programming focuses on exploiting the relation between the sparsity of a constraint matrix A and the norm of the elements of its Graver basis. In particular, integer programming is fixed parameter tractable when parameterized by the primal tree-depth and the entry complexity of A, and when parameterized by the dual tree-depth and the entry complexity of A; both of these parameterizations imply that A is sparse: the number of its non-zero entries is linear in the number of columns or rows, respectively. We study preconditioners transforming a given matrix to a row-equivalent sparse matrix if one exists, and provide structural results characterizing the existence of a sparse row-equivalent matrix in terms of the structural properties of the associated column matroid. In particular, our results imply that the $\ell_1$-norm of the Graver basis is bounded by a function of the maximum $\ell_1$-norm of a circuit of A. We use our results to design a parameterized algorithm that constructs a matrix row-equivalent to an input matrix A that has small primal/dual tree-depth and entry complexity if such a row-equivalent matrix exists. Our results yield parameterized algorithms for integer programming when parameterized by the $\ell_1$-norm of the Graver basis of the constraint matrix, when parameterized by the $\ell_1$-norm of the circuits of the constraint matrix, when parameterized by the smallest primal tree-depth and entry complexity of a matrix row-equivalent to the constraint matrix, and when parameterized by the smallest dual tree-depth and entry complexity of a matrix row-equivalent to the constraint matrix.
Title: Characterization of matrices with bounded Graver bases and depth parameters and applications to integer programming. Journal: Mathematical Programming.
Pub Date: 2024-01-10 DOI: 10.1007/s10107-023-02045-0
Benjamin Moseley, Kirk Pruhs, Clifford Stein, Rudy Zhou
This paper considers the basic problem of scheduling jobs online with preemption to maximize the number of jobs completed by their deadline on m identical machines. The main result is an $O(1)$-competitive deterministic algorithm for any number of machines $m > 1$.
Title: A competitive algorithm for throughput maximization on identical machines. Journal: Mathematical Programming.
Pub Date: 2024-01-08 DOI: 10.1007/s10107-023-02042-3
Abstract
This note provides a counterexample to a theorem announced in the last part of the paper (Vicente and Custódio Math Program 133:299–325, 2012). The counterexample involves an objective function $f: \mathbb{R} \rightarrow \mathbb{R}$ which satisfies all the assumptions required by the theorem but contradicts some of its conclusions. A corollary of this theorem is also affected by this counterexample. The main flaw revealed by the counterexample is the possibility that a directional direct search method (dDSM) generates a sequence of trial points $(x_k)_{k \in \mathbb{N}}$ converging to a point $x_*$ where f is discontinuous and lower semicontinuous, and whose objective function value $f(x_*)$ is strictly less than $\lim_{k \rightarrow \infty} f(x_k)$. Moreover, the dDSM generates trial points in only one of the continuity sets of f near $x_*$. This note also investigates the proof of the theorem to highlight the inexact statements in the original paper. Finally, this work introduces a modification of the dDSM that, in usual cases, recovers the properties broken by the counterexample.
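For readers unfamiliar with this class of methods, a minimal one-dimensional directional direct search can be sketched as follows. This is a generic textbook-style poll loop under simplifying assumptions (poll directions +1 and -1, simple decrease, step halving); the function name and parameters are hypothetical, and the sketch reproduces neither the counterexample nor the modified poll step from the note.

```python
def directional_direct_search(f, x0, step=1.0, tol=1e-8, max_iter=10_000):
    """Minimal 1-D directional direct search with poll directions +1 and -1."""
    x, fx = x0, f(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for direction in (1.0, -1.0):
            trial = x + step * direction
            f_trial = f(trial)
            if f_trial < fx:
                # Successful poll: move to the first improving trial point.
                x, fx = trial, f_trial
                improved = True
                break
        if not improved:
            # Unsuccessful poll: keep the iterate and halve the step size.
            step *= 0.5
    return x, fx


def smooth_objective(x):
    return (x - 0.3) ** 2


x_best, f_best = directional_direct_search(smooth_objective, x0=2.0)
```

On a smooth objective the iterates settle at the minimizer; the note concerns what can go wrong when f is discontinuous at the limit point.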
Title: Counterexample and an additional revealing poll step for a result of “analysis of direct searches for discontinuous functions”. Journal: Mathematical Programming.
Pub Date: 2024-01-06 DOI: 10.1007/s10107-023-02031-6
Lai Tian, Anthony Man-Cho So
We consider the oracle complexity of computing an approximate stationary point of a Lipschitz function. When the function is smooth, it is well known that the simple deterministic gradient method has finite dimension-free oracle complexity. However, when the function can be nonsmooth, it is only recently that a randomized algorithm with finite dimension-free oracle complexity has been developed. In this paper, we show that no deterministic algorithm can do the same. Moreover, even without the dimension-free requirement, we show that any finite-time deterministic method cannot be general zero-respecting. In particular, this implies that a natural derandomization of the aforementioned randomized algorithm cannot have finite-time complexity. Our results reveal a fundamental hurdle in modern large-scale nonconvex nonsmooth optimization.
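The smooth case referred to above can be illustrated with a short sketch of the deterministic gradient method. It assumes the gradient is Lipschitz with a known constant (the hypothetical parameter `grad_lipschitz`) and stops once the gradient norm drops below `eps`; for such functions the standard analysis bounds the number of iterations by 2 * L * (f(x0) - inf f) / eps**2, with no dependence on the dimension. The test function is an arbitrary choice for illustration.

```python
import math


def gradient_method(grad, x0, grad_lipschitz, eps, max_iter=100_000):
    """Gradient descent with step 1/L, stopped at an eps-stationary point."""
    x = list(x0)
    for k in range(max_iter):
        g = grad(x)
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm <= eps:
            return x, k
        # Fixed step size 1/L, where L bounds the Lipschitz constant of grad.
        x = [xi - gi / grad_lipschitz for xi, gi in zip(x, g)]
    return x, max_iter


# Illustrative smooth function f(x) = 0.5 * (x_0**2 + 0.25 * x_1**2), with L = 1.
def quadratic_grad(x):
    return [1.0 * x[0], 0.25 * x[1]]


point, iterations = gradient_method(quadratic_grad, [1.0, 1.0], grad_lipschitz=1.0, eps=1e-6)
```

The paper's negative result concerns the nonsmooth Lipschitz setting, where no deterministic analogue of this dimension-free guarantee exists.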
Title: No dimension-free deterministic algorithm computes approximate stationarities of Lipschitzians. Journal: Mathematical Programming.
Pub Date: 2024-01-05 DOI: 10.1007/s10107-023-02040-5
Jelena Diakonikolas, Cristóbal Guzmán
Composite minimization is a powerful framework in large-scale convex optimization, based on decoupling of the objective function into terms with structurally different properties and allowing for more flexible algorithmic design. We introduce a new algorithmic framework for complementary composite minimization, where the objective function decouples into a (weakly) smooth and a uniformly convex term. This particular form of decoupling is pervasive in statistics and machine learning, due to its link to regularization. The main contributions of our work are summarized as follows. First, we introduce the problem of complementary composite minimization in general normed spaces; second, we provide a unified accelerated algorithmic framework to address broad classes of complementary composite minimization problems; and third, we prove that the algorithms resulting from our framework are near-optimal in most of the standard optimization settings. Additionally, we show that our algorithmic framework can be used to address the problem of making the gradients small in general normed spaces. As a concrete example, we obtain a nearly-optimal method for the standard $\ell_1$ setup (small gradients in the $\ell_\infty$ norm), essentially matching the bound of Nesterov (Optima Math Optim Soc Newsl 88:10–11, 2012) that was previously known only for the Euclidean setup. Finally, we show that our composite methods are broadly applicable to a number of regression and other classes of optimization problems, where regularization plays a key role. Our methods lead to complexity bounds that are either new or match the best existing ones.
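For orientation, the problem class can be written out as below. The symbols $L$, $\kappa$, $\lambda$, $q$ and $g_x$ are assumptions following one common convention for weak smoothness (Hölder-continuous gradients) and uniform convexity; the paper's exact normalization may differ.

```latex
\min_{x} \; f(x) + \psi(x),
\qquad
\|\nabla f(x) - \nabla f(y)\|_{*} \le L \, \|x - y\|^{\kappa - 1}, \quad \kappa \in (1, 2],
\qquad
\psi(y) \ge \psi(x) + \langle g_x, y - x \rangle + \frac{\lambda}{q} \, \|y - x\|^{q},
\quad g_x \in \partial \psi(x), \; q \ge 2.
```

Taking $\kappa = 2$ and $q = 2$ recovers the familiar pairing of a smooth loss with a strongly convex regularizer, as in ridge-type regression.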
Title: Complementary composite minimization, small gradients in general norms, and applications. Journal: Mathematical Programming.
Pub Date: 2024-01-04 DOI: 10.1007/s10107-023-02044-1
Vincent Cohen-Addad, Tobias Mömke, Victor Verdugo
In the non-uniform sparsest cut problem, we are given a supply graph G and a demand graph D, both with the same set of nodes V. The goal is to find a cut of V that minimizes the ratio of the total capacity on the edges of G crossing the cut over the total demand of the crossing edges of D. In this work, we study the non-uniform sparsest cut problem for supply graphs with bounded treewidth k. For this case, Gupta et al. (ACM STOC, 2013) obtained a 2-approximation with polynomial running time for fixed k, and the question remained open whether there exists a c-approximation algorithm, for a constant c independent of k, that runs in $\textsf{FPT}$ time. We answer this question in the affirmative. We design a 2-approximation algorithm for the non-uniform sparsest cut with bounded treewidth supply graphs that runs in $\textsf{FPT}$ time, when parameterized by the treewidth. Our algorithm is based on rounding the optimal solution of a linear programming relaxation inspired by the Sherali-Adams hierarchy. In contrast to the classic Sherali-Adams approach, we construct a relaxation driven by a tree decomposition of the supply graph by including a carefully chosen set of lifting variables and constraints to encode information of subsets of nodes with super-constant size, and at the same time we have a sufficiently small linear program that can be solved in $\textsf{FPT}$ time.
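The objective being approximated can be made concrete with a small sketch that evaluates the cut ratio and finds the optimum by exhaustive enumeration. This is only a brute-force reference for tiny instances, not the paper's LP-rounding algorithm; the function names and the dictionary-based graph encoding are assumptions made for illustration.

```python
from itertools import combinations


def cut_sparsity(side, supply, demand):
    """Supply capacity crossing the cut divided by demand crossing the cut."""
    side = set(side)

    def crosses(u, v):
        return (u in side) != (v in side)

    capacity = sum(w for (u, v), w in supply.items() if crosses(u, v))
    separated = sum(w for (u, v), w in demand.items() if crosses(u, v))
    return capacity / separated if separated > 0 else float("inf")


def brute_force_sparsest_cut(nodes, supply, demand):
    """Enumerate all nonempty proper subsets; exponential, for tiny graphs only."""
    nodes = list(nodes)
    best_ratio, best_side = float("inf"), None
    for size in range(1, len(nodes)):
        for side in combinations(nodes, size):
            ratio = cut_sparsity(side, supply, demand)
            if ratio < best_ratio:
                best_ratio, best_side = ratio, set(side)
    return best_ratio, best_side


# Supply graph: the path a-b-c-d with unit capacities (treewidth 1).
supply = {("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1}
# Demand graph on the same nodes.
demand = {("a", "d"): 1, ("b", "c"): 2}
ratio, side = brute_force_sparsest_cut("abcd", supply, demand)
```

Here the cut separating {a, b} from {c, d} crosses one unit of supply and three units of demand, which is the minimum ratio for this instance.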
Title: A 2-approximation for the bounded treewidth sparsest cut problem in $\textsf{FPT}$ Time. Journal: Mathematical Programming.
Pub Date: 2024-01-04 DOI: 10.1007/s10107-023-02046-z
Abstract
In this paper, a special case of the generalized 4-block n-fold IPs is investigated, where $B_i = B$ and B has rank at most 1. Such IPs, called almost combinatorial 4-block n-fold IPs, include the generalized n-fold IPs as a subcase. We are interested in fixed parameter tractable (FPT) algorithms by taking as parameters the dimensions of the blocks and the largest coefficient. For almost combinatorial 4-block n-fold IPs, we first show that there exists some $\lambda \le g(\gamma)$ such that for any nonzero kernel element $\mathbf{g}$, $\lambda \mathbf{g}$ can always be decomposed into kernel elements in the same orthant whose $\ell_\infty$-norm is bounded by $g(\gamma)$ (while $\mathbf{g}$ itself might not admit such a decomposition), where g is a computable function and $\gamma$ is an upper bound on the dimensions of the blocks and the largest coefficient. Based on this, we are able to bound the $\ell_\infty$-norm of Graver basis elements by $\mathcal{O}(g(\gamma) n)$ and develop an $\mathcal{O}(g(\gamma) n^{3+o(1)} \hat{L}^2)$-time algorithm (here $\hat{L}$ denotes the logarithm of the largest absolute value occurring in the input). Additionally, we show that the $\ell_\infty$-norm of Graver basis elements is $\Omega(n)$. As applications, almost combinatorial 4-block n-fold IPs can be used to model generalizations of classical problems, including scheduling with rejection, bi-criteria scheduling, and a generalized delivery problem. Therefore, our FPT algorithm establishes a general framework to settle these problems.
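To make the block structure concrete, the sketch below assembles a 4-block n-fold constraint matrix from four small blocks. It assumes the usual layout with C and n copies of D in the top row and B next to a block-diagonal of A below; for brevity it uses identical A and D blocks, whereas the generalized setting allows them to vary with the index. The function name and the nested-list representation are hypothetical.

```python
def four_block_n_fold(C, D, B, A, n):
    """Assemble [[C, D, ..., D], [B, A, 0, ...], ..., [B, 0, ..., A]] as nested lists."""
    zero = [[0] * len(A[0]) for _ in A]
    # Top block row: C followed by n copies of D.
    rows = [c_row + [x for _ in range(n) for x in d_row] for c_row, d_row in zip(C, D)]
    # One block row per brick: B, then A in position i and zeros elsewhere.
    for i in range(n):
        for r in range(len(A)):
            row = list(B[r])
            for j in range(n):
                row += A[r] if i == j else zero[r]
            rows.append(row)
    return rows


C = [[1]]
D = [[1, 1]]
B = [[1], [2]]          # a single column, so the rank is at most 1
A = [[1, 0], [0, 1]]
matrix = four_block_n_fold(C, D, B, A, n=2)
```

The linking block B couples every brick to the first group of variables, which is what distinguishes the 4-block case from plain n-fold IPs.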
Title: FPT algorithms for a special block-structured integer program with applications in scheduling. Journal: Mathematical Programming.
Pub Date: 2024-01-04 DOI: 10.1007/s10107-023-02041-4
Masoud Ahookhosh, Yurii Nesterov
We introduce a Bi-level OPTimization (BiOPT) framework for minimizing the sum of two convex functions, one of which is sufficiently smooth. The BiOPT framework offers three levels of freedom: (i) choosing the order p of the proximal term; (ii) designing an inexact pth-order proximal-point method at the upper level; (iii) solving the auxiliary problem with a non-Euclidean method at the lower level. We here regularize the objective by a $(p+1)$th-order proximal term (for arbitrary integer $p \ge 1$) and then develop the generic inexact high-order proximal-point scheme and its acceleration using the standard estimating sequence technique at the upper level. This is followed at the lower level by solving the corresponding pth-order proximal auxiliary problem inexactly, either by one iteration of the pth-order tensor method or by a lower-order non-Euclidean composite gradient scheme. Ultimately, it is shown that applying the accelerated inexact pth-order proximal-point method at the upper level and handling the auxiliary problem by the non-Euclidean composite gradient scheme lead to a 2q-order method with the convergence rate $\mathcal{O}(k^{-(p+1)})$ (for $q = \lfloor p/2 \rfloor$ and the iteration counter k), which can result in a superfast method for some specific classes of problems.
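A one-dimensional sketch may help fix the idea of a pth-order proximal step, in which the objective is regularized by a term of order p + 1. The version below assumes a scalar convex f and solves each auxiliary problem approximately by ternary search; that inner solver, the search `radius`, and all names are stand-ins chosen for illustration, not the tensor or non-Euclidean composite gradient schemes of the paper, and no acceleration is included.

```python
def prox_step(f, x, p=1, H=1.0, radius=10.0, iters=200):
    """Approximately minimize f(y) + H / (p + 1) * |y - x|**(p + 1) near x."""
    def model(y):
        return f(y) + H / (p + 1) * abs(y - x) ** (p + 1)

    # Ternary search is valid because the regularized model is strictly convex.
    lo, hi = x - radius, x + radius
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if model(m1) < model(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def proximal_point(f, x0, steps, p=1, H=1.0):
    """Upper-level loop: repeat the (inexact) proximal step."""
    x = x0
    for _ in range(steps):
        x = prox_step(f, x, p=p, H=H)
    return x


first_iterate = prox_step(abs, 3.0)           # p = 1 on f(y) = |y|
limit_point = proximal_point(abs, 3.0, steps=5)
```

For p = 1 and f(y) = |y| the exact step is soft-thresholding, so the first iterate from x = 3 lies near 2 and the sequence reaches the minimizer 0 after a few steps.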
{"title":"High-order methods beyond the classical complexity bounds: inexact high-order proximal-point methods","authors":"Masoud Ahookhosh, Yurii Nesterov","doi":"10.1007/s10107-023-02041-4","DOIUrl":"https://doi.org/10.1007/s10107-023-02041-4","url":null,"abstract":"<p>We introduce a <i>Bi-level OPTimization</i> (BiOPT) framework for minimizing the sum of two convex functions, one of which is sufficiently smooth. The BiOPT framework offers three levels of freedom: (i) choosing the order <i>p</i> of the proximal term; (ii) designing an inexact <i>p</i>th-order proximal-point method at the upper level; (iii) solving the auxiliary problem with a non-Euclidean method at the lower level. Here we regularize the objective by a <span>\\((p+1)\\)</span>th-order proximal term (for an arbitrary integer <span>\\(p\\ge 1\\)</span>) and then develop a generic inexact high-order proximal-point scheme and its acceleration using the standard estimating sequence technique at the upper level. At the lower level, the corresponding <i>p</i>th-order proximal auxiliary problem is then solved inexactly, either by one iteration of the <i>p</i>th-order tensor method or by a lower-order non-Euclidean composite gradient scheme. 
Ultimately, it is shown that applying the accelerated inexact <i>p</i>th-order proximal-point method at the upper level and handling the auxiliary problem by the non-Euclidean composite gradient scheme lead to a 2<i>q</i>-order method with the convergence rate <span>\\({\\mathcal {O}}(k^{-(p+1)})\\)</span> (for <span>\\(q=\\lfloor p/2\\rfloor \\)</span> and the iteration counter <i>k</i>), which can result in a superfast method for some specific classes of problems.</p>","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"65 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139105015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-01-03 DOI: 10.1007/s10107-023-02043-2
Quang Minh Bui, Margarida Carvalho, José Neto
The network pricing problem (NPP) is a bilevel problem, where the leader optimizes its revenue by deciding on the prices of certain arcs in a graph, while expecting the followers (also known as the commodities) to choose a shortest path based on those prices. In this paper, we investigate the complexity of the NPP with respect to two parameters: the number of tolled arcs and the number of commodities. We devise a simple algorithm showing that if the number of tolled arcs is fixed, then the problem can be solved in polynomial time with respect to the number of commodities. In contrast, even if there is only one commodity, the problem becomes NP-hard when the number of tolled arcs is not fixed. We characterize this asymmetry in the complexity with a novel property named strong bilevel feasibility. Finally, we describe an algorithm to generate valid inequalities for the NPP based on this property; numerical results illustrate its potential for effectively solving the NPP with a high number of commodities.
{"title":"Asymmetry in the complexity of the multi-commodity network pricing problem","authors":"Quang Minh Bui, Margarida Carvalho, José Neto","doi":"10.1007/s10107-023-02043-2","DOIUrl":"https://doi.org/10.1007/s10107-023-02043-2","url":null,"abstract":"<p>The network pricing problem (NPP) is a bilevel problem, where the leader optimizes its revenue by deciding on the prices of certain arcs in a graph, while expecting the followers (also known as the commodities) to choose a shortest path based on those prices. In this paper, we investigate the complexity of the NPP with respect to two parameters: the number of tolled arcs, and the number of commodities. We devise a simple algorithm showing that if the number of tolled arcs is fixed, then the problem can be solved in polynomial time with respect to the number of commodities. In contrast, even if there is only one commodity, once the number of tolled arcs is not fixed, the problem becomes NP-hard. We characterize this asymmetry in the complexity with a novel property named strong bilevel feasibility. Finally, we describe an algorithm to generate valid inequalities to the NPP based on this property, whose numerical results illustrate its potential for effectively solving the NPP with a high number of commodities.</p>","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"36 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139094988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}