Pub Date: 2022-01-01 | DOI: 10.1016/j.ejco.2022.100050
Title: A hybrid genetic algorithm for scheduling jobs sharing multiple resources under uncertainty
Authors: Hanyu Gu, Hue Chi Lam, Yakov Zinder
EURO Journal on Computational Optimization, Volume 10, Article 100050. Open access PDF: https://www.sciencedirect.com/science/article/pii/S2192440622000260/pdfft?md5=ee7f15de9e360359d6fb7832f6237849&pid=1-s2.0-S2192440622000260-main.pdf

Abstract: This study addresses a scheduling problem in which every job requires several types of resources. At every point in time the capacity of each resource is limited, but it can be increased when necessary at a cost. Each job has a due date, and the processing times of the jobs are random variables with a known probability distribution. The problem is to determine a schedule that minimises the total cost, which consists of the cost incurred by violating the resource limits and the total tardiness of the jobs. A genetic algorithm enhanced by local search is proposed, and the sample average approximation method is used to construct a confidence interval for the optimality gap of the obtained solutions. A computational study of the sample average approximation method and the genetic algorithm shows that the proposed approach can provide high-quality solutions to large instances in reasonable time.
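The sampling-based fitness idea behind such an approach can be sketched with a toy genetic algorithm. This is a generic illustration, not the authors' algorithm: the single-machine tardiness objective, exponential processing times, one-point order crossover, adjacent-swap local search, and all parameter values are assumptions.

```python
import random

def sampled_total_tardiness(order, due, mean_proc, n_samples=50, rng=None):
    """Average total tardiness of a job sequence when processing times are
    random (exponential with the given means -- an illustrative choice)."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n_samples):
        t = tard = 0.0
        for j in order:
            t += rng.expovariate(1.0 / mean_proc[j])
            tard += max(0.0, t - due[j])
        total += tard
    return total / n_samples

def genetic_algorithm(due, mean_proc, pop_size=20, generations=60, seed=1):
    """Toy GA with one-point order crossover and an adjacent-swap local search.
    Fitness uses a fixed random seed so all evaluations share common random
    numbers, which makes comparisons between schedules consistent."""
    rng = random.Random(seed)
    n = len(due)
    fit = lambda p: sampled_total_tardiness(p, due, mean_proc, rng=random.Random(42))
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fit)
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)             # one-point order crossover
            child = a[:cut] + [j for j in b if j not in a[:cut]]
            for i in range(n - 1):                # local search: adjacent swaps
                cand = child[:]
                cand[i], cand[i + 1] = cand[i + 1], cand[i]
                if fit(cand) < fit(child):
                    child = cand
            children.append(child)
        pop = survivors + children
    return min(pop, key=fit)
```

Evaluating fitness on a fixed sample is the sample-average idea in miniature; the paper's optimality-gap confidence intervals additionally require independent evaluation samples.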
Pub Date: 2022-01-01 | DOI: 10.1016/j.ejco.2022.100036
Title: Celebrating 20 years of EUROpt
Authors: Miguel F. Anjos, Tibor Illés, Tamás Terlaky
EURO Journal on Computational Optimization, Volume 10, Article 100036. Open access PDF: https://www.sciencedirect.com/science/article/pii/S2192440622000120/pdfft?md5=53e8cc7c713850c9400ffef8e4c21569&pid=1-s2.0-S2192440622000120-main.pdf
Pub Date: 2022-01-01 | DOI: 10.1016/j.ejco.2022.100026
Title: The Marguerite Frank Award for the best EJCO paper 2021
Authors: Immanuel Bomze (Editor-in-Chief)
EURO Journal on Computational Optimization, Volume 10, Article 100026. Open access PDF: https://www.sciencedirect.com/science/article/pii/S2192440622000028/pdfft?md5=3e9e8cb255c605a328f628abce4f05a2&pid=1-s2.0-S2192440622000028-main.pdf
Pub Date: 2022-01-01 | DOI: 10.1016/j.ejco.2022.100046
Title: Training very large scale nonlinear SVMs using Alternating Direction Method of Multipliers coupled with the Hierarchically Semi-Separable kernel approximations
Authors: S. Cipolla, J. Gondzio
EURO Journal on Computational Optimization, Volume 10, Article 100046. Open access PDF: https://www.sciencedirect.com/science/article/pii/S2192440622000223/pdfft?md5=f2bf24cfecce1c28928c1acc1630dd05&pid=1-s2.0-S2192440622000223-main.pdf

Abstract: Nonlinear Support Vector Machines (SVMs) typically produce significantly higher classification quality than linear ones, but their computational complexity is prohibitive for large-scale datasets: this drawback stems from the need to store and manipulate large, dense, unstructured kernel matrices. Although the core of SVM training is a simple convex optimization problem, the presence of kernel matrices causes a dramatic performance reduction, making SVMs unworkably slow on large problems. Aiming at an efficient solution of large-scale nonlinear SVM problems, we propose the use of the Alternating Direction Method of Multipliers coupled with Hierarchically Semi-Separable (HSS) kernel approximations. A detailed analysis of the interaction among these algorithmic components reveals a particularly efficient framework, and the experimental results demonstrate, for Radial Basis Function kernels, a significant speed-up over state-of-the-art nonlinear SVM libraries without significantly affecting classification accuracy.
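HSS matrices are hierarchical low-rank structures; as a loose stand-in, a Nyström low-rank factorization illustrates the storage and accuracy payoff of approximating a dense RBF kernel matrix. This sketch is not the paper's HSS scheme, and the landmark count, bandwidth, and data are arbitrary assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Dense RBF (Gaussian) kernel matrix between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_factors(X, m, gamma=1.0, seed=0):
    """Low-rank factors from m landmark rows: K is approximated by
    C @ pinv(W) @ C.T, which needs O(n*m) storage instead of O(n^2)."""
    idx = np.random.default_rng(seed).choice(len(X), size=m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)        # n x m cross-kernel
    W = rbf_kernel(X[idx], X[idx], gamma)   # m x m landmark kernel
    return C, np.linalg.pinv(W)

n, m = 300, 40
X = np.random.default_rng(1).normal(size=(n, 2))
K = rbf_kernel(X, X)                        # full dense kernel, n x n
C, W_pinv = nystrom_factors(X, m)
K_approx = C @ W_pinv @ C.T
rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
```

The factored form stores n*m + m*m numbers instead of n*n, which is the kind of compression (achieved differently, and hierarchically, by HSS) that makes kernel methods workable at scale.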
Pub Date: 2022-01-01 | DOI: 10.1016/j.ejco.2022.100025
Title: Robot Dance: A mathematical optimization platform for intervention against COVID-19 in a complex network
Authors: Luis Gustavo Nonato, Pedro Peixoto, Tiago Pereira, Claudia Sagastizábal, Paulo J.S. Silva
EURO Journal on Computational Optimization, Volume 10, Article 100025. Open access PDF: https://www.sciencedirect.com/science/article/pii/S2192440622000016/pdfft?md5=481392c4a63aa5d41081f96af794659f&pid=1-s2.0-S2192440622000016-main.pdf

Abstract: Robot Dance is a computational optimization platform developed in response to the COVID-19 outbreak to support decision-making on public policies at a regional level. The tool is suitable for understanding, and suggesting the levels of, intervention needed to contain the spread of infectious diseases when the mobility of inhabitants through a regional network is a concern. Such is the case for the SARS-CoV-2 virus, which is highly contagious and therefore makes it crucial to incorporate the circulation of people into epidemiological compartmental models. Robot Dance anticipates the spread of an epidemic in a complex regional network, helping to identify fragile links where differentiated measures of containment, testing, and vaccination are important. Based on stochastic optimization, the model determines efficient strategies from the commuting patterns of individuals and the situation of hospitals in each district. Uncertainty in the capacity of intensive care beds is handled by a chance-constraint approach. Some functionalities of Robot Dance are illustrated for the state of São Paulo in Brazil, using real data for a region with more than forty million inhabitants.
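How commuting enters a compartmental model can be sketched with a discrete-time SIR system and a row-stochastic mobility matrix. This is only an illustration of the modeling idea, not the Robot Dance model (which is an optimization model with chance constraints); all rates and populations below are made up.

```python
import numpy as np

def sir_with_mobility(S0, I0, R0, M, beta, gamma, days):
    """Discrete-time SIR over districts. M[i, j] is the fraction of district
    i's residents who spend the day in district j (rows of M sum to 1).
    Infections happen where people are during the day, then everyone returns."""
    S = np.asarray(S0, float).copy()
    I = np.asarray(I0, float).copy()
    R = np.asarray(R0, float).copy()
    M = np.asarray(M, float)
    for _ in range(days):
        N = S + I + R
        N_day = M.T @ N                    # population present in each district
        I_day = M.T @ I                    # infected present in each district
        lam = beta * (M @ (I_day / N_day)) # force of infection felt by residents
        new_inf = np.minimum(lam * S, S)   # cap so S never goes negative
        new_rec = gamma * I
        S -= new_inf
        I += new_inf - new_rec
        R += new_rec
    return S, I, R
```

Without the mobility terms the two districts would evolve independently; the M.T products are what let an outbreak seeded in one district reach its neighbours through commuting.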
Pub Date: 2022-01-01 | DOI: 10.1016/j.ejco.2021.100024
Title: Multi-Neighborhood simulated annealing for the minimum interference frequency assignment problem
Authors: Sara Ceschia, Luca Di Gaspero, Roberto Maria Rosati, Andrea Schaerf
EURO Journal on Computational Optimization, Volume 10, Article 100024. Open access PDF: https://www.sciencedirect.com/science/article/pii/S2192440621001519/pdfft?md5=bd15ad8aa76d9ff1c05a75649a88a79a&pid=1-s2.0-S2192440621001519-main.pdf

Abstract: We consider the Minimum Interference Frequency Assignment Problem and propose a novel Simulated Annealing approach that uses a portfolio of different neighborhoods specifically designed for this problem.

We tackle at once the two versions of the problem proposed by Correia (2001) and by Montemanni et al. (2001), respectively, together with the corresponding benchmark instances. To determine the best configuration of the solver for each version of the problem, we perform a comprehensive and statistically principled tuning procedure.

Even though a fully precise comparison is not possible, the experimental analysis shows that we outperform all previous results on most instances for the first version of the problem, and that our results are on par with the best known ones for the second version.

As a byproduct of this research, we designed a new robust file format for instances and solutions, and a data repository for validating and maintaining the available solutions.
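A neighborhood portfolio inside simulated annealing can be sketched on a toy conflict-minimization objective. The uniform choice between two moves, the geometric cooling schedule, and the simple conflict count are assumptions for illustration, not the paper's neighborhoods or tuning.

```python
import math
import random

def multi_neighborhood_sa(adj, n_freq, iters=20000, t0=5.0, cooling=0.9995, seed=0):
    """Simulated annealing with a two-move portfolio:
    (1) reassign one vertex's frequency, (2) swap the frequencies of two
    vertices. Cost = number of edges whose endpoints share a frequency."""
    rng = random.Random(seed)
    n = len(adj)

    def cost(f):
        return sum(1 for u in range(n) for v in adj[u] if u < v and f[u] == f[v])

    f = [rng.randrange(n_freq) for _ in range(n)]
    cur = cost(f)
    best, best_cost = f[:], cur
    t = t0
    for _ in range(iters):
        g = f[:]
        if rng.random() < 0.5:                  # neighborhood 1: reassignment
            g[rng.randrange(n)] = rng.randrange(n_freq)
        else:                                   # neighborhood 2: swap
            u, v = rng.sample(range(n), 2)
            g[u], g[v] = g[v], g[u]
        c = cost(g)
        # Metropolis acceptance: always accept improvements, sometimes worse
        if c <= cur or rng.random() < math.exp((cur - c) / t):
            f, cur = g, c
            if cur < best_cost:
                best, best_cost = f[:], cur
        t *= cooling                            # geometric cooling
    return best, best_cost
```

The point of the portfolio is that the two moves explore the search space differently: swaps preserve the frequency usage profile while reassignments change it.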
Pub Date: 2022-01-01 | DOI: 10.1016/j.ejco.2022.100035
Title: Newton-MR: Inexact Newton Method with minimum residual sub-problem solver
Authors: Fred Roosta, Yang Liu, Peng Xu, Michael W. Mahoney
EURO Journal on Computational Optimization, Volume 10, Article 100035. Open access PDF: https://www.sciencedirect.com/science/article/pii/S2192440622000119/pdfft?md5=d469cd05ef15c6b063a51fd431c7a8dd&pid=1-s2.0-S2192440622000119-main.pdf

Abstract: We consider a variant of the inexact Newton method [20], [40], called Newton-MR, in which the least-squares sub-problems are solved approximately using the Minimum Residual method [79]. By construction, Newton-MR can be readily applied to unconstrained optimization of a class of non-convex problems known as invex, which subsumes convexity as a sub-class. For invex optimization, instead of the classical Lipschitz continuity assumptions on the gradient and Hessian, Newton-MR's global convergence can be guaranteed under a weaker notion of joint regularity of the Hessian and gradient. We also obtain Newton-MR's problem-independent local convergence to the set of minima. We show that fast local/global convergence can be guaranteed under a novel inexactness condition that is, to our knowledge, much weaker than those in prior related work. Numerical results demonstrate the performance of Newton-MR compared with several other Newton-type alternatives on a few machine learning problems.
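The step computation can be illustrated by replacing MINRES with an exact least-squares solve of the sub-problem min_p ||Hp + g||. This is a simplified sketch: the actual method solves the sub-problem only approximately with MINRES, and its line search differs from the gradient-norm backtracking assumed here.

```python
import numpy as np

def newton_mr_sketch(grad, hess, x0, iters=20, tol=1e-8):
    """Newton-type iteration whose step solves the least-squares sub-problem
    min_p ||H p + g||. Here lstsq solves it exactly for illustration;
    Newton-MR uses MINRES and accepts an inexact solution."""
    x = np.asarray(x0, float)
    for _ in range(iters):
        g = grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm < tol:
            break
        H = hess(x)
        p, *_ = np.linalg.lstsq(H, -g, rcond=None)
        # backtrack until the gradient norm decreases sufficiently
        alpha = 1.0
        while (np.linalg.norm(grad(x + alpha * p)) > (1 - 1e-4 * alpha) * gnorm
               and alpha > 1e-8):
            alpha *= 0.5
        x = x + alpha * p
    return x
```

Measuring progress by the gradient norm rather than the function value is what lets this kind of scheme make sense beyond convexity, since for invex functions stationary points are global minima.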
Pub Date: 2022-01-01 | DOI: 10.1016/j.ejco.2022.100031
Title: Progress in mathematical programming solvers from 2001 to 2020
Authors: Thorsten Koch, Timo Berthold, Jaap Pedersen, Charlie Vanaret
EURO Journal on Computational Optimization, Volume 10, Article 100031. Open access PDF: https://www.sciencedirect.com/science/article/pii/S2192440622000077/pdfft?md5=79377e1d524040849993372a12f99ead&pid=1-s2.0-S2192440622000077-main.pdf

Abstract: This study investigates the progress made in LP and MILP solver performance over the last two decades by comparing solver software from the beginning of the millennium with the codes available today. On average, we found that for solving LP/MILP, computer hardware became about 20 times faster, and the algorithms improved by a factor of about nine for LP and around 50 for MILP, giving total speed-ups of about 180 and 1,000 times, respectively. However, these numbers have very high variance, and they considerably underestimate the progress made on the algorithmic side: many problem instances can nowadays be solved within seconds that the old codes could not solve within any reasonable time.
Pub Date: 2022-01-01 | DOI: 10.1016/j.ejco.2022.100049
Title: Upper and lower bounds based on linear programming for the b-coloring problem
Authors: Roberto Montemanni, Xiaochen Chou, Derek H. Smith
EURO Journal on Computational Optimization, Volume 10, Article 100049. Open access PDF: https://www.sciencedirect.com/science/article/pii/S2192440622000259/pdfft?md5=4554fc69f0108b024eff3393ae695fc6&pid=1-s2.0-S2192440622000259-main.pdf

Abstract: B-coloring is a problem in graph theory. It can model some real applications, and it can also be used to enhance solution methods for the classical graph coloring problem; improved solutions for the classical problem would in turn benefit a larger pool of practical applications in fields such as scheduling, timetabling and telecommunications. Given a graph G = (V, E), the b-coloring problem aims to maximize the number of colors used while assigning a color to every vertex in V, preventing adjacent vertices from receiving the same color, and requiring every color to be represented by a special vertex, called a b-vertex. A vertex can be a b-vertex only if the set of colors assigned to its adjacent vertices includes all the colors apart from the one assigned to the vertex itself.

This work employs methods based on Linear Programming to derive new upper and lower bounds for the problem. Starting from a recently presented Mixed Integer Linear Programming model, upper bounds are obtained through partial linear relaxations of this model, while lower bounds are derived from variations of the original model modified to target a specific number of colors provided as input. The experimental campaign documented in the paper improves several state-of-the-art results.
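The b-coloring conditions stated above translate directly into a small feasibility check (an adjacency-list graph representation and integer color labels are assumed):

```python
def is_b_coloring(adj, color):
    """Check the two b-coloring conditions: the coloring is proper, and every
    color class contains a b-vertex, i.e. a vertex whose neighbours
    collectively carry every color other than its own."""
    n, colors = len(adj), set(color)
    for u in range(n):                            # proper-coloring check
        if any(color[v] == color[u] for v in adj[u]):
            return False
    for c in colors:                              # b-vertex check per color
        if not any(color[u] == c and {color[v] for v in adj[u]} >= colors - {c}
                   for u in range(n)):
            return False
    return True
```

For example, coloring a triangle with three colors is a valid b-coloring (each vertex is a b-vertex), while a path on four vertices colored with three colors has no b-vertex for at least one color, so three colors are infeasible there.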
Pub Date: 2022-01-01 | DOI: 10.1016/j.ejco.2022.100044
Title: A nonlinear conjugate gradient method with complexity guarantees and its application to nonconvex regression
Authors: Rémi Chan–Renous-Legoubin, Clément W. Royer
EURO Journal on Computational Optimization, Volume 10, Article 100044. Open access PDF: https://www.sciencedirect.com/science/article/pii/S219244062200020X/pdfft?md5=32a8c7d35ac8b53e431514d2573efa79&pid=1-s2.0-S219244062200020X-main.pdf

Abstract: Nonlinear conjugate gradient methods are among the most popular techniques for solving continuous optimization problems. Although these schemes have long been studied from a global convergence standpoint, their worst-case complexity properties have yet to be fully understood, especially in the nonconvex setting. In particular, it is unclear whether nonlinear conjugate gradient methods possess better guarantees than first-order methods such as gradient descent. Meanwhile, recent experiments have shown impressive performance of standard nonlinear conjugate gradient techniques on certain nonconvex problems, even when compared with methods endowed with the best known complexity guarantees.

In this paper, we propose a nonlinear conjugate gradient scheme based on a simple line-search paradigm and a modified restart condition. These two ingredients allow for monitoring the properties of the search directions, which is instrumental in obtaining complexity guarantees. Our complexity results illustrate the possible discrepancy between nonlinear conjugate gradient methods and classical gradient descent. A numerical investigation on nonconvex robust regression problems, as well as a standard benchmark, illustrates that the restart condition can track the behavior of a standard implementation.
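The restart idea can be sketched with a classical Polak–Ribière+ scheme that falls back to steepest descent whenever the new direction is not a sufficient descent direction. The paper's line search and restart condition differ from this textbook variant; the Armijo parameters below are arbitrary.

```python
import numpy as np

def ncg_restart(f, grad, x0, iters=500, tol=1e-8):
    """Polak-Ribiere+ nonlinear conjugate gradient with a descent-based
    restart: if the conjugate direction is not sufficiently downhill,
    restart from the steepest-descent direction."""
    x = np.asarray(x0, float)
    g = grad(x)
    d = -g
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        # Armijo backtracking line search along d
        alpha, fx, slope = 1.0, f(x), g @ d
        while f(x + alpha * d) > fx + 1e-4 * alpha * slope and alpha > 1e-12:
            alpha *= 0.5
        x_new = x + alpha * d
        g_new = grad(x_new)
        beta = max(0.0, (g_new @ (g_new - g)) / (g @ g))   # PR+ update
        d = -g_new + beta * d
        # restart condition: require d to be a descent direction for g_new
        if g_new @ d > -1e-12 * np.linalg.norm(g_new) * np.linalg.norm(d):
            d = -g_new
        x, g = x_new, g_new
    return x
```

Monitoring the search direction this way is precisely what makes complexity analysis tractable: every accepted direction is guaranteed to be a descent direction, so gradient-descent-style arguments apply between restarts.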