Pub Date: 2024-04-30. DOI: 10.1007/s11590-024-02118-9
Yassine Nabou, François Glineur, Ion Necoara
We introduce the concept of an inexact first-order oracle of degree q for a possibly nonconvex and nonsmooth function, which naturally appears in the context of approximate gradients, weak levels of smoothness, and other situations. Our definition is less conservative than those found in the existing literature, and it can be viewed as an interpolation between the fully exact and the existing inexact first-order oracle definitions. We analyze the convergence behavior of a (fast) inexact proximal gradient method using such an oracle for solving (non)convex composite minimization problems. We derive complexity estimates and study the dependence between the accuracy of the oracle and the desired accuracy of the gradient or of the objective function. Our results show that better rates can be obtained both theoretically and in numerical simulations when q is large.
Title: Proximal gradient methods with inexact oracle of degree q for composite optimization. Journal: Optimization Letters.
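As a rough illustration of the proximal gradient setting described in the abstract above, the following sketch runs a proximal gradient iteration with a deliberately perturbed gradient on a small separable lasso-type problem. It is not the paper's oracle of degree q: the test problem, the fixed perturbation of norm `delta`, and the names `soft_threshold` and `inexact_prox_grad` are all assumptions made for illustration.

```python
import math


def soft_threshold(v, tau):
    """Proximal operator of tau * |.| applied to a scalar."""
    return math.copysign(max(abs(v) - tau, 0.0), v)


def inexact_prox_grad(a, b, lam, delta, iters=200):
    """Minimize 0.5 * sum((a[i]*x[i] - b[i])**2) + lam * sum(|x[i]|).

    The gradient of the smooth part is perturbed by a fixed error vector of
    Euclidean norm `delta` (an illustrative stand-in for an inexact oracle,
    not the degree-q definition from the paper).
    """
    n = len(a)
    lipschitz = max(ai * ai for ai in a)   # Lipschitz constant of the gradient
    err = delta / math.sqrt(n)             # same error in every coordinate
    x = [0.0] * n
    for _ in range(iters):
        # Inexact gradient of the smooth term at the current iterate.
        grad = [a[i] * (a[i] * x[i] - b[i]) + err for i in range(n)]
        # Gradient step followed by the proximal (soft-thresholding) step.
        x = [soft_threshold(x[i] - grad[i] / lipschitz, lam / lipschitz)
             for i in range(n)]
    return x
```

With `delta=0.0` the iteration reduces to the exact proximal gradient method; with `delta > 0` the iterates settle near, but not at, the exact minimizer, which is the accuracy trade-off the abstract refers to.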
Pub Date: 2024-04-26. DOI: 10.1007/s11590-024-02116-x
José Maurício Fernandes Medeiros, Anand Subramanian, Eduardo Queiroga
This work addresses a parallel batch machine scheduling problem subject to tardiness penalties, release dates, and incompatible job families. In this environment, jobs of the same family are partitioned into batches and each batch is assigned to a machine. The objective is to determine the sequence in which the batches will be processed on each machine with a view to minimizing the total weighted tardiness. To solve the problem, we propose a population-based iterated local search algorithm that makes use of multiple neighborhood structures and an efficient perturbation mechanism. The algorithm also incorporates the time window decomposition (TWD) heuristic to generate the initial population and employs population control strategies that promote individuals with higher fitness by combining the total weighted tardiness with the contribution to the diversity of the population. Extensive computational experiments were conducted on 4860 benchmark instances, and the results obtained compare very favorably with those found by the best existing algorithms.
Title: Population-based iterated local search for batch scheduling on parallel machines with incompatible job families, release dates, and tardiness penalties. Journal: Optimization Letters.
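The objective named in the abstract above, total weighted tardiness of a batch schedule, can be sketched as follows. The data layout and the function name `total_weighted_tardiness` are hypothetical, and the batching model is an assumption: a batch starts once its machine is free and all of its jobs are released, it takes as long as its longest job, and all of its jobs finish together.

```python
def total_weighted_tardiness(schedule, jobs):
    """Evaluate a schedule given as one batch sequence per machine.

    schedule: list (machines) of lists (batches) of job ids.
    jobs: dict mapping job id -> dict with keys
          "proc", "release", "due", "weight" (assumed layout).
    """
    total = 0.0
    for batches in schedule:                 # one batch sequence per machine
        machine_free = 0.0
        for batch in batches:
            # A batch cannot start before all of its jobs are released.
            release = max(jobs[j]["release"] for j in batch)
            start = max(machine_free, release)
            # Assumed: batch duration is the longest processing time in it.
            finish = start + max(jobs[j]["proc"] for j in batch)
            for j in batch:
                tardiness = max(0.0, finish - jobs[j]["due"])
                total += jobs[j]["weight"] * tardiness
            machine_free = finish
    return total
```

A local search such as the one described would call an evaluation of this kind repeatedly, so in practice it is usually made incremental; this version recomputes everything for clarity.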
Pub Date: 2024-04-25. DOI: 10.1007/s11590-024-02112-1
Francesco Marchetti, Sabrina Guastavino, Cristina Campi, Federico Benvenuto, Michele Piana
In many contexts, customized and weighted classification scores are designed in order to evaluate the goodness of the predictions carried out by neural networks. However, there exists a discrepancy between the maximization of such scores and the minimization of the loss function in the training phase. In this paper, we provide a complete theoretical setting that formalizes weighted classification metrics and then allows the construction of losses that drive the model to optimize these metrics of interest. After a detailed theoretical analysis, we show that our framework includes as particular instances well-established approaches such as classical cost-sensitive learning, weighted cross entropy loss functions and value-weighted skill scores.
Title: A comprehensive theoretical framework for the optimization of neural networks classification performance with respect to weighted metrics. Journal: Optimization Letters.
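One of the special cases mentioned in the abstract above is the weighted cross-entropy loss. A minimal sketch of the standard binary form is given below; the function name, the `w_pos`/`w_neg` parameters, and the clipping constant `eps` are illustrative choices and are not taken from the paper's framework.

```python
import math


def weighted_cross_entropy(y_true, p_pred, w_pos, w_neg, eps=1e-12):
    """Mean binary cross-entropy with separate weights for the two classes.

    y_true: iterable of labels in {0, 1}.
    p_pred: iterable of predicted probabilities for class 1.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)      # keep log() finite
        total -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p)
    return total / len(y_true)
```

Setting `w_pos > w_neg` penalizes missed positives more heavily than false alarms, which is the usual cost-sensitive reading of such weights.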
Pub Date: 2024-04-18. DOI: 10.1007/s11590-024-02113-0
Kirill V. Kaymakov, Dmitry S. Malyshev
For a given edge-capacitated connected graph and two of its vertices s and t, the bottleneck (or \(\max\min\)) path problem is to find the maximum value of path-minimum edge capacities among all paths connecting s and t. It can be generalized by finding the bottleneck values between s and all possible t. These problems arise as subproblems in the known maximum flow problem, having applications in many real-life tasks. For any graph with n vertices and m edges, they can be solved in O(m) and O(t(m, n)) time, respectively, where \(t(m,n)=\min(m+n\log(n),\, m\alpha(m,n))\) and \(\alpha(\cdot,\cdot)\) is the inverse Ackermann function. In this paper, we generalize the bottleneck path problems by considering their versions with k sources. For the first of them, where k pairs of sources and targets are given (offline or online), we present an \(O((m+k)\log(n))\)-time randomized algorithm and an \(O(m+(n+k)\log(n))\)-time deterministic algorithm for the offline and online versions, respectively. For the second one, where the bottleneck values are found between k sources and all targets, we present an \(O(t(m,n)+kn)\)-time offline/online algorithm.
Title: On efficient algorithms for bottleneck path problems with many sources. Journal: Optimization Letters.
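The single-source version of the bottleneck path problem described in the abstract above can be sketched with a Dijkstra-style search that always expands the vertex with the largest bottleneck value found so far. This is a textbook variant, not one of the paper's algorithms: with a binary heap it runs in O(m log n) time rather than the O(t(m, n)) bound quoted above. The function name and the edge-list format are assumptions, and capacities are assumed to be positive.

```python
import heapq


def bottleneck_from_source(n, edges, s):
    """Return best, where best[t] is the maximum over all s-t paths of the
    minimum edge capacity on the path (best[s] is infinity).

    n: number of vertices, labeled 0..n-1.
    edges: iterable of (u, v, capacity) for an undirected graph.
    """
    adj = [[] for _ in range(n)]
    for u, v, cap in edges:
        adj[u].append((v, cap))
        adj[v].append((u, cap))

    best = [0] * n
    best[s] = float("inf")
    heap = [(-best[s], s)]                   # max-heap via negated keys
    while heap:
        neg_b, u = heapq.heappop(heap)
        b = -neg_b
        if b < best[u]:
            continue                         # stale heap entry
        for v, cap in adj[u]:
            cand = min(b, cap)               # bottleneck of the path via u
            if cand > best[v]:
                best[v] = cand
                heapq.heappush(heap, (-cand, v))
    return best
```

Running this search once per source gives a simple baseline for the k-source setting, at a cost of k separate searches.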
Pub Date: 2024-04-13. DOI: 10.1007/s11590-024-02111-2
Robert D. Barish, Tetsuo Shibuya
We introduce affine optimal k-proper connected edge colorings as a variation on Fujita’s notion of optimal k-proper connected colorings (Fujita in Optim Lett 14(6):1371–1380, 2020. https://doi.org/10.1007/s11590-019-01442-9) with applications to the frequency assignment problem. Here, for a simple undirected graph G with edge set \(E_G\), such a coloring corresponds to a decomposition of \(E_G\) into color classes \(C_1, C_2, \ldots, C_n\), with associated weights \(w_1, w_2, \ldots, w_n\), minimizing a specified affine function \(\mathcal{A} := \sum_{i=1}^{n} \left( w_i \cdot |C_i| \right)\), while also ensuring the existence of k vertex disjoint proper paths (i.e., simple paths with no two adjacent edges in the same color class) between all pairs of vertices. In this context, we define \(\zeta_{\mathcal{A}}^k(G)\) as the minimum possible value of \(\mathcal{A}\) under a k-proper connectivity requirement. For any fixed number of color classes, we show that computing \(\zeta_{\mathcal{A}}^k(G)\) is treewidth fixed parameter tractable. However, we also show that determining \(\zeta_{\mathcal{A}^{\prime}}^k(G)\) with the affine function \(\mathcal{A}^{\prime} := 0 \cdot |C_1| + |C_2|\) is NP-hard for 2-connected planar graphs in the case where \(k = 1\), cubic 3-connected planar graphs for \(k = 2\), and k-connected graphs \(\forall k \ge 3\). We also show that no fully polynomial-time randomized approximation scheme can exist for approximating \(\zeta_{\mathcal{A}^{\prime}}^k(G)\) under any of the aforementioned constraints unless \(NP = RP\).
Title: Affine optimal k-proper connected edge colorings. Journal: Optimization Letters.
Pub Date: 2024-04-05. DOI: 10.1007/s11590-024-02110-3
Arnold Neumaier, Morteza Kimiaei
This paper introduces CLS, a new line search along an arbitrary smooth search path, that starts at the current iterate tangentially to a descent direction. Like the Goldstein line search and unlike the Wolfe line search, the new line search uses, beyond the gradient at the current iterate, only function values. Using this line search with search directions satisfying the bounded angle condition, global convergence to a stationary point is proved for continuously differentiable objective functions that are bounded below and have Lipschitz continuous gradients. The standard complexity bounds are proved under several natural assumptions.
Title: An improvement of the Goldstein line search. Journal: Optimization Letters.
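For context on the abstract above, the classical Goldstein line search (not the CLS method the paper introduces) accepts a step length \(\alpha\) when \(f(x) + (1-c)\,\alpha\, g^T d \le f(x + \alpha d) \le f(x) + c\,\alpha\, g^T d\) for some \(0 < c < 1/2\). The sketch below finds such a step by expansion and bisection, using only function values along the ray plus the directional derivative at the current iterate. The function name, the default `c`, and the bracketing strategy are illustrative assumptions.

```python
import math


def goldstein_step(phi, dphi0, c=0.25, alpha=1.0, max_iter=50):
    """Find a step satisfying the classical Goldstein conditions.

    phi(alpha) = f(x + alpha * d) along a descent direction d.
    dphi0 = g^T d < 0 is the directional derivative at alpha = 0.
    """
    phi0 = phi(0.0)
    lo, hi = 0.0, math.inf                   # current bracket for the step
    for _ in range(max_iter):
        val = phi(alpha)
        if val > phi0 + c * alpha * dphi0:
            hi = alpha                       # too long: not enough decrease
        elif val < phi0 + (1.0 - c) * alpha * dphi0:
            lo = alpha                       # too short: step can grow
        else:
            return alpha                     # both conditions hold
        # Expand while no upper bound is known, otherwise bisect.
        alpha = 2.0 * lo if hi == math.inf else 0.5 * (lo + hi)
    return alpha                             # best effort after max_iter
```

For example, with \(f(x) = x^2\), \(x = 1\), and \(d = -2\), one has `phi = lambda a: (1 - 2*a)**2` and `dphi0 = -4.0`, and any returned step must lie in the interval [0.25, 0.75] when `c=0.25`.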
Pub Date: 2024-04-04. DOI: 10.1007/s11590-024-02107-y
Li Zhang, Jing Yuan, Qiaoliang Li
We investigate the k-level squared metric facility location problem with outliers (k-SMFLPWO) for any constant k. In k-SMFLPWO, we are given k facility sets \(\mathcal{F}_{l}\), where \(l \in \{1, 2, \cdots, k\}\), a client set \(\mathcal{C}\) with cardinality n, and a non-negative integer \(q<n\). The sum of opening and connection costs can be substantially increased by distant clients. To minimize the total cost, some distant clients may be left unconnected; that is, at least \(n-q\) clients in \(\mathcal{C}\) must each be connected to a path \(p=(i_{1}\in \mathcal{F}_{1}, i_{2}\in \mathcal{F}_{2}, \cdots, i_{k}\in \mathcal{F}_{k})\) whose facilities are all opened. Based on a primal-dual approximation algorithm and the triangle-inequality property of squared metrics, we present a constant-factor approximation algorithm for k-SMFLPWO.
Title: An approximation algorithm for k-level squared metric facility location problem with outliers. Journal: Optimization Letters.
Pub Date: 2024-04-03. DOI: 10.1007/s11590-024-02108-x
Donatella Granata
A meticulous description of a real network with respect to its heterogeneous physical infrastructure and properties is necessary for network design assessment. Quantifying the costs of making these structures work together effectively, and taking into account any hidden charges they may incur, can improve the quality of service, reduce mandatory maintenance requirements, and mitigate the cost associated with finding a valid solution. For these reasons, we devote our attention to a novel approach that produces a more complete representation of the overall costs on the reload cost network. This approach considers both the cost of reloading due to linking structures and their internal charges, which we refer to as the penalized reload cost. We investigate the complexity and approximability of the optimal path, walk, tour, and maximum flow problems under penalized reload cost. All these problems turn out to be NP-complete. We prove that, unless P=NP, even if the reload cost matrix is symmetric and satisfies the triangle inequality, the problem of finding a path, tour, and a maximum flow with a minimum penalized reload cost cannot be approximated within any constant \(\alpha < 2\), and finding a walk is not approximable within any factor \(\beta \le 3\).
Title: On penalized reload cost path, walk, tour and maximum flow: hardness and approximation. Journal: Optimization Letters.
Pub Date: 2024-03-31. DOI: 10.1007/s11590-024-02109-w
C. J. Price, B. Robertson, M. Reale
Title: Extending oscars-ii to generally constrained global optimization. Journal: Optimization Letters.
Pub Date: 2024-03-26. DOI: 10.1007/s11590-024-02101-4
Xuxing Chen, Minhui Huang, Shiqian Ma
Bilevel optimization has been successfully applied to many important machine learning problems. Algorithms for solving bilevel optimization have been studied under various settings. In this paper, we study nonconvex-strongly-convex bilevel optimization under a decentralized setting. We design decentralized algorithms for both deterministic and stochastic bilevel optimization problems. Moreover, we analyze the convergence rates of the proposed algorithms in different scenarios, including the case where data heterogeneity is observed across agents. Numerical experiments on both synthetic and real data demonstrate that the proposed methods are efficient.
Title: Decentralized bilevel optimization. Journal: Optimization Letters.