Pub Date : 2022-01-01 DOI: 10.1016/j.ejco.2022.100048
Ekaterina Borodich , Vladislav Tominin , Yaroslav Tominin , Dmitry Kovalev , Alexander Gasnikov , Pavel Dvurechensky
We consider composite minimax optimization problems where the goal is to find a saddle point of a large sum of non-bilinear objective functions augmented by simple composite regularizers for the primal and dual variables. For such problems, under the average-smoothness assumption, we propose accelerated stochastic variance-reduced algorithms whose complexity bounds are optimal up to logarithmic factors. In particular, we consider strongly-convex-strongly-concave, convex-strongly-concave, and convex-concave objectives. To the best of our knowledge, these are the first nearly optimal algorithms for this setting.
{"title":"Accelerated variance-reduced methods for saddle-point problems","doi":"10.1016/j.ejco.2022.100048","journal":{"name":"EURO Journal on Computational Optimization","volume":"10","pages":"Article 100048"},"publicationDate":"2022-01-01","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2192440622000247/pdfft?md5=41248ad222d5ad361783568adf860824&pid=1-s2.0-S2192440622000247-main.pdf"}
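For orientation, the problem class described in this abstract can be written schematically as follows. The symbols $f_i$, $g$, $h$, and $n$ are assumed notation chosen for illustration, not taken from the paper: $f_i$ are the (possibly non-bilinear) coupling terms, and $g$, $h$ are the simple composite regularizers for the primal variable $x$ and the dual variable $y$.

```latex
\min_{x} \max_{y} \; g(x) + \frac{1}{n} \sum_{i=1}^{n} f_i(x, y) - h(y)
```

In the strongly-convex-strongly-concave case the objective is strongly convex in $x$ and strongly concave in $y$; the other two cases in the abstract drop strong convexity in $x$, or strong convexity and strong concavity in both variables.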
Pub Date : 2022-01-01 DOI: 10.1016/j.ejco.2022.100033
Tibor Illés , Tamás Terlaky
This brief note presents a personal recollection of the early history of EUROpt, the Continuous Optimization Working Group of EURO. It details the events that preceded the formation of the EUROpt Working Group and the first five years of its existence. During these early years, the EUROpt Working Group established a conference series, organized thematic EURO Mini Conferences, launched the EUROpt Fellow program, developed an effective rotating management structure, and grew into a large, mature, very active, and high-impact EURO Working Group.
{"title":"EUROpt, the Continuous Optimization Working Group of EURO: From idea to maturity","doi":"10.1016/j.ejco.2022.100033","journal":{"name":"EURO Journal on Computational Optimization","volume":"10","pages":"Article 100033"},"publicationDate":"2022-01-01","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2192440622000090/pdfft?md5=a62c5ab91e77a43689d735471635b334&pid=1-s2.0-S2192440622000090-main.pdf"}
Pub Date : 2022-01-01 DOI: 10.1016/j.ejco.2022.100034
Héctor G.-de-Alba , Samuel Nucamendi-Guillén , Oliver Avalos-Rosales
In this paper, the unrelated parallel machine scheduling problem with the objective of minimizing the total tardiness is addressed. For this problem, a mixed-integer linear programming (MILP) formulation that considers assignment and positional variables is presented. In addition, an iterated local search (ILS) algorithm that produces high-quality solutions in reasonable times is proposed for large instances. The robustness of the ILS was assessed by comparing its performance with the results provided by the MILP. The instances used in this paper were constructed under a new approach that results in tighter due dates than the previous generation method for this problem. The proposed MILP formulation was able to solve instances with up to 150 jobs and 20 machines. The ILS yielded high-quality solutions in a reasonable time on instances with up to 400 jobs and 20 machines. Experimental results confirm that both approaches are efficient and promising.
{"title":"A mixed integer formulation and an efficient metaheuristic for the unrelated parallel machine scheduling problem: Total tardiness minimization","doi":"10.1016/j.ejco.2022.100034","journal":{"name":"EURO Journal on Computational Optimization","volume":"10","pages":"Article 100034"},"publicationDate":"2022-01-01","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2192440622000107/pdfft?md5=fe6b0c8e039b76ee7c40763ee43095a1&pid=1-s2.0-S2192440622000107-main.pdf"}
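To make the objective in this abstract concrete, the following is a minimal sketch of how total tardiness can be evaluated for a given assignment and sequencing on unrelated machines. The function name `total_tardiness` and the data layout (`proc_time[m][j]` for the processing time of job `j` on machine `m`, `due[j]` for its due date) are assumptions made for illustration; they are not the paper's MILP or ILS.

```python
from typing import List


def total_tardiness(schedule: List[List[int]],
                    proc_time: List[List[int]],
                    due: List[int]) -> int:
    """Sum of max(0, completion time - due date) over all scheduled jobs.

    schedule[m] is the ordered list of jobs processed on machine m.
    proc_time[m][j] is the processing time of job j on machine m
    (machines are unrelated, so times depend on both m and j).
    """
    total = 0
    for machine, jobs in enumerate(schedule):
        clock = 0  # each machine starts at time zero
        for job in jobs:
            clock += proc_time[machine][job]   # completion time of this job
            total += max(0, clock - due[job])  # tardiness is never negative
    return total


# Hypothetical data: 2 machines, 3 jobs.
proc_time = [[2, 3, 4], [3, 1, 2]]
due = [2, 2, 5]
balanced = total_tardiness([[0, 2], [1]], proc_time, due)
single_machine = total_tardiness([[0, 1, 2], []], proc_time, due)
```

A local search such as ILS would call an evaluation of this kind repeatedly while moving or swapping jobs between machine sequences.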
Pub Date : 2022-01-01 DOI: 10.1016/j.ejco.2022.100045
Pavel Dvurechensky , Dmitry Kamzolov , Aleksandr Lukashevich , Soomin Lee , Erik Ordentlich , César A. Uribe , Alexander Gasnikov
Statistical preconditioning enables fast methods for distributed large-scale empirical risk minimization problems. In this approach, multiple worker nodes compute gradients in parallel, which are then used by the central node to update the parameter by solving an auxiliary (preconditioned) smaller-scale optimization problem. The recently proposed Statistically Preconditioned Accelerated Gradient (SPAG) method [1] has complexity bounds superior to other such algorithms but requires an exact solution of computationally intensive auxiliary optimization problems at every iteration. In this paper, we propose an Inexact SPAG (InSPAG) and explicitly characterize the accuracy to which the corresponding auxiliary subproblem needs to be solved to guarantee the same convergence rate as the exact method. We build our results by first developing an inexact adaptive accelerated Bregman proximal gradient method for general optimization problems under relative smoothness and strong convexity assumptions, which may be of independent interest. Moreover, we explore the properties of the auxiliary problem in the InSPAG algorithm assuming Lipschitz third-order derivatives and strong convexity. For this problem class, we develop a linearly convergent Hyperfast second-order method and estimate the total complexity of the InSPAG method with the hyperfast auxiliary problem solver. Finally, we illustrate the proposed method's practical efficiency by performing large-scale numerical experiments on logistic regression models. To the best of our knowledge, these are the first empirical results on implementing high-order methods on large-scale problems, as we work with data where the dimension is of the order of 3 million and the number of samples is 700 million.
{"title":"Hyperfast second-order local solvers for efficient statistically preconditioned distributed optimization","doi":"10.1016/j.ejco.2022.100045","journal":{"name":"EURO Journal on Computational Optimization","volume":"10","pages":"Article 100045"},"publicationDate":"2022-01-01","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2192440622000211/pdfft?md5=295cb611041330f3ffad8993cf73fef2&pid=1-s2.0-S2192440622000211-main.pdf"}
Pub Date : 2022-01-01 DOI: 10.1016/j.ejco.2022.100043
S. Bellavia , G. Gurioli , B. Morini , Ph.L. Toint
A trust-region algorithm is presented for finding approximate minimizers of smooth unconstrained functions whose values and derivatives are subject to random noise. It is shown that, under suitable probabilistic assumptions, the new method finds (in expectation) an ϵ-approximate minimizer of arbitrary order $q \ge 1$ in at most $O(\epsilon^{-(q+1)})$ inexact evaluations of the function and its derivatives, providing the first such result for general optimality orders. The impact of intrinsic noise limiting the validity of the assumptions is also discussed and it is shown that difficulties are unlikely to occur in the first-order version of the algorithm for sufficiently large gradients. Conversely, should these assumptions fail for specific realizations, then “degraded” optimality guarantees are shown to hold when failure occurs. These conclusions are then discussed and illustrated in the context of subsampling methods for finite-sum optimization.
{"title":"Trust-region algorithms: Probabilistic complexity and intrinsic noise with applications to subsampling techniques","doi":"10.1016/j.ejco.2022.100043","journal":{"name":"EURO Journal on Computational Optimization","volume":"10","pages":"Article 100043"},"publicationDate":"2022-01-01","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2192440622000193/pdfft?md5=746d8300ed25b919398d91159dcb575f&pid=1-s2.0-S2192440622000193-main.pdf"}
Pub Date : 2022-01-01 DOI: 10.1016/j.ejco.2022.100037
Bo Peng
Recently and simultaneously, two MILP-based approaches to copositivity testing were proposed. This note attempts a performance comparison, using a group of test sets containing a large number of designed instances. According to the numerical results, one copositivity detection approach performs better when the value of the function h defined for a matrix is large, while the other performs better when the problem dimension grows moderately. A problem set that is hard for both approaches is also presented, which may be used as a test bed for future competing approaches. An improved variant of one of the approaches is also proposed to handle those hard instances more efficiently.
{"title":"Performance comparison of two recently proposed copositivity tests","doi":"10.1016/j.ejco.2022.100037","journal":{"name":"EURO Journal on Computational Optimization","volume":"10","pages":"Article 100037"},"publicationDate":"2022-01-01","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2192440622000132/pdfft?md5=abbd19fbc87e563c0963318349831747&pid=1-s2.0-S2192440622000132-main.pdf"}
Pub Date : 2022-01-01 DOI: 10.1016/j.ejco.2022.100032
Matteo Avolio, Antonio Fuduli
We tackle a new single-machine scheduling problem, whose objective is to balance the average weighted completion times of two classes of jobs. Because both job sets contribute to the same objective function, this problem can be interpreted as a cooperative two-agent scheduling problem, in contrast to the standard multiagent problems, which are of the competitive type since each class of jobs is involved only in optimizing its agent's criterion. Balancing the completion times of different sets of tasks finds application in many fields, such as in logistics for balancing the delivery times, in manufacturing for balancing the assembly lines, and in services for balancing the waiting times of groups of people.
To solve the problem, which we show to be NP-hard, a Lagrangian heuristic algorithm is proposed. In particular, starting from a nonsmooth variant of the quadratic assignment problem, our approach is based on the Lagrangian relaxation of a linearized model and reduces to solving a finite sequence of successive linear assignment problems.
Numerical results are presented on a set of randomly generated test problems, showing the efficiency of the proposed technique.
{"title":"A Lagrangian heuristics for balancing the average weighted completion times of two classes of jobs in a single-machine scheduling problem","doi":"10.1016/j.ejco.2022.100032","journal":{"name":"EURO Journal on Computational Optimization","volume":"10","pages":"Article 100032"},"publicationDate":"2022-01-01","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2192440622000089/pdfft?md5=94a7acd23a11f1e16b1bbcf7a942c573&pid=1-s2.0-S2192440622000089-main.pdf"}
Pub Date : 2022-01-01 DOI: 10.1016/j.ejco.2022.100038
Lindon Roberts , Edward Smyth
In distributed learning, a central server trains a model according to updates provided by nodes holding local data samples. In the presence of one or more malicious servers sending incorrect information (a Byzantine adversary), standard algorithms for model training such as stochastic gradient descent (SGD) fail to converge. In this paper, we present a simplified convergence theory for the generic Byzantine Resilient SGD method originally proposed by Blanchard et al. (2017) [3]. Compared to the existing analysis, we show convergence to a stationary point in expectation under standard assumptions on the (possibly nonconvex) objective function and flexible assumptions on the stochastic gradients.
{"title":"A simplified convergence theory for Byzantine resilient stochastic gradient descent","doi":"10.1016/j.ejco.2022.100038","journal":{"name":"EURO Journal on Computational Optimization","volume":"10","pages":"Article 100038"},"publicationDate":"2022-01-01","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2192440622000144/pdfft?md5=bbd4aa4ea37b8349470f121ce86051dd&pid=1-s2.0-S2192440622000144-main.pdf"}
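As an illustration of what a Byzantine-resilient aggregation step can look like, the sketch below implements the Krum selection rule from Blanchard et al. (2017): each reported gradient is scored by the sum of squared distances to its closest neighbours, and the lowest-scoring gradient is used for the update. The function names and the toy data are assumptions for illustration, and the abstract above concerns the generic method, so this should be read as one possible instance rather than the setting analysed in the paper.

```python
from typing import List, Sequence


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two gradient vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def krum(gradients: List[List[float]], num_byzantine: int) -> List[float]:
    """Return the gradient with the smallest Krum score.

    The score of gradient i is the sum of squared distances to its
    n - f - 2 nearest other gradients, where n is the number of workers
    and f = num_byzantine is the assumed number of faulty workers.
    (The original analysis additionally assumes 2f + 2 < n.)
    """
    n = len(gradients)
    neighbours = n - num_byzantine - 2
    if neighbours < 1:
        raise ValueError("need at least num_byzantine + 3 gradients")
    best_index, best_score = 0, float("inf")
    for i, g in enumerate(gradients):
        distances = sorted(
            squared_distance(g, h) for j, h in enumerate(gradients) if j != i
        )
        score = sum(distances[:neighbours])
        if score < best_score:
            best_index, best_score = i, score
    return gradients[best_index]


# Hypothetical round: four honest workers and one adversarial report.
reports = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.05], [100.0, -100.0]]
selected = krum(reports, num_byzantine=1)
```

Plain averaging of `reports` would be dominated by the outlier, whereas the selected vector stays among the honest reports.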
Pub Date : 2022-01-01 DOI: 10.1016/j.ejco.2022.100028
Håkon Bentsen, Arild Hoff, Lars Magnus Hvattum
Tabu search is a well-established metaheuristic framework for solving hard combinatorial optimization problems. At its core, the method uses different forms of memory to guide a local search through the solution space so as to identify high-quality local optima while avoiding getting stuck in the vicinity of any particular local optimum. This paper examines characteristics of moves that can be exploited to make good decisions about steps that lead away from recently visited local optima and towards a new local optimum. Our approach uses a new type of adaptive memory based on a construction called exponential extrapolation. The memory operates by means of threshold inequalities that ensure selected moves will not lead to a specified number of most recently encountered local optima. Computational experiments on a set of one hundred different benchmark instances for the binary integer programming problem suggest that exponential extrapolation is a useful type of memory to incorporate into a tabu search.
{"title":"Exponential extrapolation memory for tabu search","doi":"10.1016/j.ejco.2022.100028","journal":{"name":"EURO Journal on Computational Optimization","volume":"10","pages":"Article 100028"},"publicationDate":"2022-01-01","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2192440622000041/pdfft?md5=d79a522b1d114e009dc737ac4d866cee&pid=1-s2.0-S2192440622000041-main.pdf"}
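For readers unfamiliar with the framework, the following is a minimal sketch of a basic tabu search over binary vectors with one-flip moves and a simple recency-based memory. It does not implement the exponential extrapolation memory described in the abstract; the function name, the `tenure` parameter, and the penalized knapsack objective are all assumptions chosen for illustration.

```python
from typing import Callable, List, Tuple


def tabu_search(objective: Callable[[List[int]], float], n: int,
                iterations: int = 20, tenure: int = 3) -> Tuple[List[int], float]:
    """Maximize `objective` over {0,1}^n using one-flip moves.

    A flipped variable stays tabu for `tenure` iterations, unless flipping it
    would beat the best value found so far (aspiration criterion).
    """
    x = [0] * n
    best_x, best_val = x[:], objective(x)
    tabu_until = [0] * n  # last iteration at which flipping variable j is forbidden
    for it in range(1, iterations + 1):
        move, move_val = None, None
        for j in range(n):
            x[j] ^= 1                  # tentatively flip variable j
            val = objective(x)
            x[j] ^= 1                  # undo the flip
            if tabu_until[j] >= it and val <= best_val:
                continue               # tabu and no aspiration
            if move is None or val > move_val:
                move, move_val = j, val
        if move is None:
            continue                   # every move is tabu in this iteration
        x[move] ^= 1                   # accept the best admissible move, even if worse
        tabu_until[move] = it + tenure
        if move_val > best_val:
            best_x, best_val = x[:], move_val
    return best_x, best_val


# Hypothetical 0-1 knapsack with a penalty for exceeding the capacity.
values, weights, capacity = [10, 7, 4, 3], [5, 4, 3, 2], 7


def knapsack(x: List[int]) -> float:
    value = sum(v * xi for v, xi in zip(values, x))
    weight = sum(w * xi for w, xi in zip(weights, x))
    return value - 100 * max(0, weight - capacity)


solution, solution_value = tabu_search(knapsack, n=4)
```

Accepting the best admissible move even when it worsens the objective is what lets the search leave a local optimum; the memory then prevents it from immediately flipping back.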
Pub Date : 2022-01-01 DOI: 10.1016/j.ejco.2022.100027
Anselmo R. Pitombeira-Neto , Arthur H.F. Murta
We propose a formulation of the stochastic cutting stock problem as a discounted infinite-horizon Markov decision process. At each decision epoch, given the current inventory of items, an agent chooses the patterns in which to cut objects in stock in anticipation of the unknown demand. An optimal solution corresponds to a policy that associates each state with a decision and minimizes the expected total cost. Since exact algorithms scale exponentially with the state-space dimension, we develop a heuristic solution approach based on reinforcement learning. We propose an approximate policy iteration algorithm in which we apply a linear model to approximate the action-value function of a policy. Policy evaluation is performed by solving the projected Bellman equation from a sample of state transitions, decisions and costs obtained by simulation. Due to the large decision space, policy improvement is performed via the cross-entropy method. Computational experiments are carried out with realistic data to illustrate the application of the algorithm. Heuristic policies obtained with polynomial and Fourier basis functions are compared with myopic and random policies. Results indicate the possibility of obtaining policies capable of adequately controlling inventories with an average cost up to 80% lower than the cost obtained by a myopic policy.
{"title":"A reinforcement learning approach to the stochastic cutting stock problem","doi":"10.1016/j.ejco.2022.100027","journal":{"name":"EURO Journal on Computational Optimization","volume":"10","pages":"Article 100027"},"publicationDate":"2022-01-01","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S219244062200003X/pdfft?md5=135d32e50b9857c32c1577a7a14985fc&pid=1-s2.0-S219244062200003X-main.pdf"}
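The cross-entropy method mentioned in this abstract can be sketched generically as follows: sample candidate decisions from a parametric distribution, keep the lowest-cost "elite" samples, and refit the distribution to them. The version below uses independent Gaussians over a continuous vector and a toy quadratic cost; these choices, along with the function name and all parameter values, are assumptions for illustration and do not reproduce the paper's decision space or cost model.

```python
import random
import statistics
from typing import Callable, List


def cross_entropy_minimize(cost: Callable[[List[float]], float], dim: int,
                           iterations: int = 50, samples: int = 100,
                           elites: int = 10, seed: int = 0) -> List[float]:
    """Minimize `cost` with a Gaussian cross-entropy method.

    Each iteration draws `samples` candidates, keeps the `elites` cheapest,
    and refits the per-coordinate mean and standard deviation to them.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [5.0] * dim  # broad initial search distribution (assumed)
    for _ in range(iterations):
        population = [
            [rng.gauss(mean[d], std[d]) for d in range(dim)]
            for _ in range(samples)
        ]
        population.sort(key=cost)              # cheapest candidates first
        elite = population[:elites]
        for d in range(dim):
            column = [candidate[d] for candidate in elite]
            mean[d] = statistics.mean(column)
            std[d] = statistics.pstdev(column)  # shrinks as the elites agree
    return mean


# Toy cost with its minimum at (3, -1).
def quadratic(z: List[float]) -> float:
    return (z[0] - 3.0) ** 2 + (z[1] + 1.0) ** 2


estimate = cross_entropy_minimize(quadratic, dim=2)
```

In an approximate policy iteration loop, `cost` would be replaced by the approximate action-value of a candidate decision in the current state, so that the elite refit plays the role of the policy improvement step.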