
Latest Publications in Computational Mathematics and Mathematical Physics

On the Redundancy of Hessian Nonsingularity for Linear Convergence Rate of the Newton Method Applied to the Minimization of Convex Functions
IF 0.7 · CAS Zone 4 (Mathematics) · Q3 MATHEMATICS, APPLIED · Pub Date: 2024-06-07 · DOI: 10.1134/s0965542524700040
Yu. G. Evtushenko, A. A. Tret’yakov

Abstract

A new property of convex functions that makes it possible to achieve the linear rate of convergence of the Newton method during the minimization process is established. Namely, it is proved that, even in the case of singularity of the Hessian at the solution, the Newtonian system is solvable in the vicinity of the minimizer; i.e., the gradient of the objective function belongs to the image of the matrix of second derivatives and, therefore, analogs of the Newton method may be used.
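As a rough illustration of why a singular Hessian at the minimizer need not be fatal, the minimal sketch below (a hypothetical example, not the authors' specific modification of Newton's method) minimizes a convex function whose Hessian is singular at the solution; because the gradient lies in the range of the Hessian, the possibly singular Newton system can be solved in the least-squares sense, and the iterates contract at a linear rate.

```python
import numpy as np

def f(z):
    x, y = z
    return x**4 + y**2          # convex, Hessian singular at the minimizer (0, 0)

def grad(z):
    x, y = z
    return np.array([4 * x**3, 2 * y])

def hess(z):
    x, y = z
    return np.array([[12 * x**2, 0.0], [0.0, 2.0]])

z = np.array([1.0, 1.0])
for k in range(30):
    g, H = grad(z), hess(z)
    # The gradient belongs to the range of the (possibly singular) Hessian,
    # so the Newton system H d = -g is solvable; use a least-squares solve.
    d, *_ = np.linalg.lstsq(H, -g, rcond=None)
    z = z + d
    print(k, np.linalg.norm(z))   # the iterate norm shrinks by a factor 2/3 per step
```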

Citations: 0
Another Approach to Build Lyapunov Functions for the First Order Methods in the Quadratic Case
IF 0.7 · CAS Zone 4 (Mathematics) · Q3 MATHEMATICS, APPLIED · Pub Date: 2024-06-07 · DOI: 10.1134/s0965542524700131
D. M. Merkulov, I. V. Oseledets

Abstract

Lyapunov functions play a fundamental role in analyzing the stability and convergence properties of optimization methods. In this paper, we propose a novel and straightforward approach for constructing Lyapunov functions for first-order methods applied to quadratic functions. Our approach involves bringing the iteration matrix to an upper triangular form using Schur decomposition and then examining the value of the last coordinate of the state vector. This value is multiplied by a factor of magnitude smaller than one at each iteration. Consequently, it should decrease at each iteration, provided that the method converges. We rigorously prove the suitability of this Lyapunov function for all first-order methods and derive the necessary conditions for the proposed function to decrease monotonically. Experiments conducted with general convex functions are also presented, alongside a study of the limitations of the proposed approach. Remarkably, the newly discovered Lyapunov function is straightforward and does not explicitly depend on the exact method formulation or on function characteristics such as strong convexity or smoothness constants. In essence, a single expression serves as a Lyapunov function for several methods, including Heavy Ball, Nesterov Accelerated Gradient, and Triple Momentum, among others. To the best of our knowledge, this approach has not been previously reported in the literature.
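A minimal sketch of the construction described above, taking the Heavy Ball method on a quadratic as the example (matrix sizes and parameters are illustrative assumptions): the state-space iteration matrix is brought to upper triangular form with a complex Schur decomposition, and the last coordinate of the transformed state is multiplied by a factor of modulus less than one at every step.

```python
import numpy as np
from scipy.linalg import schur

# Quadratic f(x) = 0.5 x^T A x with a fixed SPD matrix A
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
A = M @ M.T + np.eye(3)

# Heavy Ball: x_{k+1} = x_k - a*A@x_k + b*(x_k - x_{k-1}); state s_k = (x_k, x_{k-1})
a, b = 0.05, 0.5
I = np.eye(3)
T = np.block([[(1 + b) * I - a * A, -b * I],
              [I,                   np.zeros((3, 3))]])

# Complex Schur form T = Q U Q^H with U upper triangular
U, Q = schur(T, output="complex")

s = rng.standard_normal(6).astype(complex)
for k in range(20):
    y = Q.conj().T @ s
    # Last coordinate of the transformed state: |y[-1]| is multiplied by
    # |U[-1, -1]| < 1 at every step, so it behaves as a Lyapunov-like quantity.
    print(k, abs(y[-1]))
    s = T @ s
```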

Citations: 0
Determination of the Thermal Conductivity and Volumetric Heat Capacity of Substance from Heat Flux
IF 0.7 · CAS Zone 4 (Mathematics) · Q3 MATHEMATICS, APPLIED · Pub Date: 2024-06-07 · DOI: 10.1134/s0965542524700039
A. Yu. Gorchakov, V. I. Zubov

Abstract

The study of nonlinear problems related to heat transfer in a substance is of great practical importance. Earlier, this paper’s authors proposed an effective algorithm for determining the volumetric heat capacity and thermal conductivity of a substance based on experimental observations of the dynamics of the temperature field in the object. In this paper, the problem of simultaneously identifying the temperature-dependent volumetric heat capacity and thermal conductivity of the substance under study from the heat flux at the boundary of the domain is investigated. The consideration is based on the first (Dirichlet) boundary value problem for a one-dimensional unsteady heat equation. The coefficient inverse problem under consideration is reduced to a variational problem, which is solved by gradient methods based on the application of fast automatic differentiation. The uniqueness of the solution of the inverse problem is investigated.
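A grossly simplified sketch of this kind of coefficient inverse problem, under strong assumptions: constant rather than temperature-dependent coefficients, an explicit finite-difference forward solver for the Dirichlet problem, and a generic bounded optimizer with finite-difference gradients in place of the fast-automatic-differentiation machinery used in the paper. All grid sizes and parameter values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

# Space-time grid
L, T = 1.0, 0.1
nx, nt = 51, 2001
x = np.linspace(0, L, nx); dx = x[1] - x[0]
dt = T / (nt - 1)

def solve_forward(k, C):
    """Explicit FD solve of C*u_t = k*u_xx with zero Dirichlet BCs;
    returns the boundary heat flux -k*u_x at x = 0 over time."""
    u = np.sin(np.pi * x)                       # initial temperature
    flux = np.empty(nt)
    for n in range(nt):
        flux[n] = -k * (u[1] - u[0]) / dx
        unew = u.copy()
        unew[1:-1] = u[1:-1] + dt * k / C * (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
        unew[0] = 0.0; unew[-1] = 0.0           # Dirichlet boundary values
        u = unew
    return flux

# "Observed" flux generated with the true coefficients
k_true, C_true = 0.5, 1.2
flux_obs = solve_forward(k_true, C_true)

def misfit(p):
    k, C = p
    return np.sum((solve_forward(k, C) - flux_obs) ** 2) * dt

res = minimize(misfit, x0=[0.3, 1.0], method="L-BFGS-B",
               bounds=[(0.1, 2.0), (0.5, 3.0)])
print("recovered (k, C):", res.x)               # close to (0.5, 1.2)
```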

Citations: 0
Alternating Projection Method for Intersection of Convex Sets, Multi-Agent Consensus Algorithms, and Averaging Inequalities
IF 0.7 · CAS Zone 4 (Mathematics) · Q3 MATHEMATICS, APPLIED · Pub Date: 2024-06-07 · DOI: 10.1134/s0965542524700155
A. V. Proskurnikov, I. S. Zabarianska

Abstract

The history of the alternating projection method for finding a common point of several convex sets in Euclidean space goes back to the well-known Kaczmarz algorithm for solving systems of linear equations, which was devised in the 1930s and later found wide applications in image processing and computed tomography. An important role in the study of this method was played by I.I. Eremin’s, L.M. Bregman’s, and B.T. Polyak’s works, which appeared nearly simultaneously and contained general results concerning the convergence of alternating projections to a point in the intersection of sets, assuming that this intersection is nonempty. In this paper, we consider a modification of the convex set intersection problem that is related to the theory of multi-agent systems and is called the constrained consensus problem. Each convex set in this problem is associated with a certain agent and, generally speaking, is inaccessible to the other agents. A group of agents is interested in finding a common point of these sets, that is, a point satisfying all the constraints. Distributed analogues of the alternating projection method proposed for solving this problem lead to a rather complicated nonlinear system of equations, the convergence of which is usually proved using special Lyapunov functions. A brief survey of these methods is given, and their relation to the theorem ensuring consensus in a system of averaging inequalities recently proved by the second author is shown (this theorem develops convergence results for the usual method of iterative averaging as applied to the consensus problem).
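For orientation, here is a minimal sketch of the classical (non-distributed) alternating projection method on two convex sets in the plane; the constrained-consensus algorithms surveyed in the paper are distributed generalizations of this idea. The sets and starting point are illustrative assumptions.

```python
import numpy as np

# Two convex sets in R^2 with nonempty intersection:
# a disk of radius 2 centred at (3, 0) and the halfspace {x : x_1 <= 2}.
def proj_disk(x, c=np.array([3.0, 0.0]), r=2.0):
    d = x - c
    n = np.linalg.norm(d)
    return x if n <= r else c + r * d / n

def proj_halfspace(x, a=np.array([1.0, 0.0]), b=2.0):
    viol = a @ x - b
    return x if viol <= 0 else x - viol * a / (a @ a)

x = np.array([-5.0, 4.0])
for k in range(50):
    x = proj_halfspace(proj_disk(x))   # alternate projections onto the two sets
print("common point:", x)              # lies in both sets
```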

Citations: 0
Stochastic Gradient Descent with Preconditioned Polyak Step-Size
IF 0.7 · CAS Zone 4 (Mathematics) · Q3 MATHEMATICS, APPLIED · Pub Date: 2024-06-07 · DOI: 10.1134/s0965542524700052
F. Abdukhakimov, C. Xiang, D. Kamzolov, M. Takáč

Abstract

Stochastic Gradient Descent (SGD) is one of many iterative optimization methods widely used for solving machine learning problems. These methods display valuable properties and attract researchers and industrial machine learning engineers with their simplicity. However, one weakness of this type of method is the need to tune the learning rate (step-size) for every combination of loss function and dataset in order to solve an optimization problem efficiently within a given time budget. Stochastic Gradient Descent with Polyak Step-size (SPS) offers an update rule that alleviates the need to fine-tune the learning rate of an optimizer. In this paper, we propose an extension of SPS that employs preconditioning techniques, such as Hutchinson’s method, Adam, and AdaGrad, to improve its performance on badly scaled and/or ill-conditioned datasets.
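A minimal sketch of one natural form of a preconditioned Polyak step-size on an interpolating least-squares problem with unevenly scaled features. The AdaGrad-style diagonal preconditioner and all parameter choices are illustrative assumptions, not the exact update rules analyzed in the paper, and convergence behavior depends on the preconditioner and data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 20
scales = 10.0 ** (-(np.arange(d) % 2))         # alternating feature scales 1 and 0.1
A = rng.standard_normal((n, d)) * scales
x_true = rng.standard_normal(d)
b = A @ x_true                                  # interpolation regime, so each f_i^* = 0

def loss_grad(x, i):
    r = A[i] @ x - b[i]
    return 0.5 * r**2, r * A[i]

x = np.zeros(d)
v = np.zeros(d)                                 # AdaGrad accumulator (diagonal preconditioner)
eps = 1e-8
for k in range(5000):
    i = rng.integers(n)
    fi, gi = loss_grad(x, i)
    v += gi**2
    Dinv = 1.0 / (np.sqrt(v) + eps)             # D_k^{-1}, diagonal
    gamma = fi / (gi @ (Dinv * gi) + 1e-12)     # Polyak step-size in the preconditioned norm (f_i^* = 0)
    x -= gamma * Dinv * gi
print("distance to the interpolating solution:", np.linalg.norm(x - x_true))
```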

Citations: 0
Diffusion Approximations and Control Variates for MCMC
IF 0.7 · CAS Zone 4 (Mathematics) · Q3 MATHEMATICS, APPLIED · Pub Date: 2024-06-07 · DOI: 10.1134/s0965542524700167
N. Brosse, A. Durmus, S. Meyn, E. Moulines, S. Samsonov

Abstract

A new method is introduced for the construction of control variates to reduce the variance of additive functionals of Markov Chain Monte Carlo (MCMC) samplers. These control variates are obtained by minimizing the asymptotic variance associated with the Langevin diffusion over a family of functions. To motivate our approach, we then show that the asymptotic variance of some well-known MCMC algorithms, including the Random Walk Metropolis and the (Metropolis) Unadjusted/Adjusted Langevin Algorithm, are well approximated by that of the Langevin diffusion. We finally theoretically justify the use of a class of linear control variates we introduce. In particular, we show that the variance of the resulting estimators is smaller, for a given computational complexity, than the standard Monte Carlo estimator. Several examples of Bayesian inference problems support our findings showing, in some cases, very significant reduction of the variance.
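To illustrate the underlying idea in one dimension, the sketch below (an assumption-laden toy example, not the paper's construction over a family of functions) uses a control variate of the form L·phi, where L is the generator of the Langevin diffusion for a standard normal target, to reduce the variance of a Random Walk Metropolis estimate of E[X²].

```python
import numpy as np

rng = np.random.default_rng(2)

# Random Walk Metropolis targeting the standard normal; estimate E[X^2] = 1.
def rwm(n, step=1.0):
    x, out = 0.0, np.empty(n)
    logp = lambda z: -0.5 * z * z
    for k in range(n):
        prop = x + step * rng.standard_normal()
        if np.log(rng.uniform()) < logp(prop) - logp(x):
            x = prop
        out[k] = x
    return out

xs = rwm(50_000)
f = xs**2

# Langevin-generator control variate: for phi(x) = x^2 / 2,
# (L phi)(x) = grad log pi(x) * phi'(x) + phi''(x) = 1 - x^2, which has zero mean under pi.
h = 1.0 - xs**2
theta = np.cov(f, h)[0, 1] / np.var(h)          # variance-minimizing coefficient
print("plain MCMC estimate:  ", f.mean())
print("with control variate: ", (f - theta * h).mean())
print("variance ratio:", np.var(f - theta * h) / np.var(f))   # close to zero here
```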

Citations: 0
Higher-Order Iterative Learning Control Algorithms for Linear Systems
IF 0.7 · CAS Zone 4 (Mathematics) · Q3 MATHEMATICS, APPLIED · Pub Date: 2024-06-07 · DOI: 10.1134/s0965542524700064
P. V. Pakshin, J. P. Emelianova, M. A. Emelianov

Abstract

Iterative learning control (ILC) algorithms appeared in connection with the problem of increasing the accuracy of repetitive operations performed by robots. They use information from previous repetitions to adjust the control signal on the current repetition. Most often, only information from the immediately preceding repetition is used. ILC algorithms that use information from several previous iterations are called higher-order algorithms. Recently, interest in these algorithms has grown in the literature in connection with robotic additive manufacturing. However, these algorithms have been little studied, and the literature contains conflicting estimates of their properties. This paper proposes new higher-order ILC algorithms for linear discrete and differential systems. The idea behind these algorithms is based on an analogy with multi-step methods in optimization theory, in particular the heavy ball method. An example is given that confirms that such algorithms can accelerate convergence of the learning error.
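A minimal sketch of a second-order (momentum-type) ILC update on a lifted scalar plant, in analogy with the heavy ball method; the plant, gains, and reference trajectory are illustrative assumptions, not the systems or algorithms analyzed in the paper. The tracking error shrinks as the trials proceed.

```python
import numpy as np

# Plant y[t+1] = 0.3*y[t] + u[t], y[0] = 0, written in lifted form y = G u
N = 20
pole = 0.3
G = np.array([[pole ** (i - j) if j <= i else 0.0 for j in range(N)] for i in range(N)])
r = np.sin(np.linspace(0, 2 * np.pi, N))        # reference trajectory to track

gamma, beta = 0.5, 0.2                          # learning gain and "heavy-ball" momentum
u_prev = np.zeros(N)
u = np.zeros(N)
for trial in range(30):
    e = r - G @ u                               # tracking error on the current trial
    # Second-order ILC update: reuse the control increment from the previous
    # trial, in analogy with the heavy ball method in optimization.
    u_next = u + beta * (u - u_prev) + gamma * e
    u_prev, u = u, u_next
    print(trial, np.linalg.norm(e))             # tracking error norm per trial
```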

Citations: 0
Highly Smooth Zeroth-Order Methods for Solving Optimization Problems under the PL Condition
IF 0.7 · CAS Zone 4 (Mathematics) · Q3 MATHEMATICS, APPLIED · Pub Date: 2024-06-07 · DOI: 10.1134/s0965542524700118
A. V. Gasnikov, A. V. Lobanov, F. S. Stonyakin

Abstract

In this paper, we study the black box optimization problem under the Polyak–Lojasiewicz (PL) condition, assuming that the objective function is not just smooth, but has higher smoothness. By using “kernel-based” approximations instead of the exact gradient in the Stochastic Gradient Descent method, we improve the best-known results of convergence in the class of gradient-free algorithms solving problems under the PL condition. We generalize our results to the case where a zeroth-order oracle returns a function value at a point with some adversarial noise. We verify our theoretical results on the example of solving a system of nonlinear equations.
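For illustration, here is a sketch of zeroth-order optimization with the basic two-point spherical gradient estimator on a strongly convex (hence PL) quadratic; the paper's kernel-based higher-order smoothing and adversarial-noise analysis are not reproduced. Step-size, smoothing radius, and problem dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 10
M = rng.standard_normal((d, d))
A = M @ M.T + np.eye(d)                 # strongly convex quadratic => PL condition holds
xstar = rng.standard_normal(d)

def f(x):                               # only function values are available to the method
    return 0.5 * (x - xstar) @ A @ (x - xstar)

x = np.zeros(d)
tau, lr = 1e-4, 0.002
for k in range(20000):
    e = rng.standard_normal(d); e /= np.linalg.norm(e)            # random unit direction
    g = d * (f(x + tau * e) - f(x - tau * e)) / (2 * tau) * e     # two-point gradient estimate
    x -= lr * g
print("final objective value:", f(x))   # approaches the minimum value 0
```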

Citations: 0
Numerical Range and a Generalization of Duffin’s Overdamping Criterion
IF 0.7 · CAS Zone 4 (Mathematics) · Q3 MATHEMATICS, APPLIED · Pub Date: 2024-06-07 · DOI: 10.1134/s0965542524700015
R. Hildebrand

Abstract

The joint numerical range of tuples of matrices is a powerful tool for proving results which are useful in optimization, such as the $\mathcal{S}$-lemma. Here we provide a similar proof for another result, namely the equivalence of a certain positivity criterion to Duffin’s overdamping condition involving quadratic matrix-valued polynomials. We show how the proof generalizes to matrix-valued polynomials of higher degree.
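A small numerical sketch of the classical Duffin overdamping condition for a quadratic matrix polynomial λ²A + λB + C (the matrices below are random and illustrative): the per-direction inequality (xᵀBx)² > 4(xᵀAx)(xᵀCx) is checked on sampled directions, and, consistently with overdamping, the linearized pencil has only real eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
def spd(scale=1.0):
    M = rng.standard_normal((n, n))
    return scale * (M @ M.T) + scale * n * np.eye(n)

A, C = spd(), spd()
B = spd(20.0)   # strong damping, intended to satisfy the overdamping condition

# Duffin's condition: (x'Bx)^2 - 4 (x'Ax)(x'Cx) > 0 for all x != 0 (checked on random directions)
ok = True
for _ in range(10000):
    x = rng.standard_normal(n)
    a, b, c = x @ A @ x, x @ B @ x, x @ C @ x
    ok &= (b * b - 4 * a * c > 0)
print("overdamping condition holds on sampled directions:", ok)

# Consistency check: an overdamped pencil has only real eigenvalues.
# Linearize lam^2 A + lam B + C via the companion form and inspect the spectrum.
Z = np.zeros((n, n)); I = np.eye(n)
L0 = np.block([[-B, -C], [I, Z]])
L1 = np.block([[A, Z], [Z, I]])
eigs = np.linalg.eigvals(np.linalg.solve(L1, L0))
print("max |imaginary part| of the eigenvalues:", np.max(np.abs(eigs.imag)))  # ~0 up to round-off
```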

Citations: 0
The Gradient Projection Method for a Supporting Function on the Unit Sphere and Its Applications
IF 0.7 · CAS Zone 4 (Mathematics) · Q3 MATHEMATICS, APPLIED · Pub Date: 2024-06-07 · DOI: 10.1134/s096554252470009x
M. V. Balashov, A. A. Tremba

Abstract

We consider minimization of the supporting function of a convex compact set on the unit sphere. In essence, this is the problem of projecting zero onto a compact convex set. We give sufficient conditions for solving this problem at a linear rate with a first-order algorithm—the gradient projection method with a fixed step-size and with Armijo’s step-size. We also consider applications to problems with set-valued mappings. The mappings in this work are mainly given through the set-valued integral of a set-valued mapping with convex compact images or as the Minkowski sum of a finite number of convex compact sets, e.g., ellipsoids. Unlike other solution approaches, e.g., those based on approximating the mapping in a suitable sense, the considered algorithm depends much more weakly on the dimension of the space and the other parameters of the problem. It also allows efficient error estimation. Numerical experiments confirm the effectiveness of the approach.
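A minimal sketch of the fixed-step gradient projection method on the unit sphere for the supporting function of a Minkowski sum of ellipsoids. Here the ellipsoids are chosen to be balls, an assumption made only so that the answer (the distance from zero to the set) has a closed form for checking; the step-size and data are illustrative, not the paper's test problems.

```python
import numpy as np

# C is the Minkowski sum of ellipsoids E_i = {c_i + A_i u : ||u|| <= 1}.
# Support function: s_C(x) = sum_i (<c_i, x> + ||A_i^T x||); its gradient at x
# (the support point) is p(x) = sum_i (c_i + A_i A_i^T x / ||A_i^T x||).
ellipsoids = [
    (np.array([4.0, 1.0, 0.0]), 1.5 * np.eye(3)),   # a ball of radius 1.5 centred at (4, 1, 0)
    (np.array([2.0, 2.0, 1.0]), 0.5 * np.eye(3)),   # a ball of radius 0.5 centred at (2, 2, 1)
]

def support_value_and_point(x):
    s, p = 0.0, np.zeros_like(x)
    for c, A in ellipsoids:
        w = A.T @ x
        s += c @ x + np.linalg.norm(w)
        p += c + A @ w / np.linalg.norm(w)
    return s, p

# Gradient projection with a fixed step-size, projecting back onto the unit sphere.
x = np.array([1.0, 0.0, 0.0])
alpha = 0.05
for k in range(200):
    s, p = support_value_and_point(x)
    y = x - alpha * p
    x = y / np.linalg.norm(y)

print("min of the support function over the sphere:", support_value_and_point(x)[0])
# For this configuration 0 lies outside C, and the negative of the minimum equals dist(0, C):
c_sum = sum(c for c, _ in ellipsoids); r_sum = 1.5 + 0.5
print("check: -(||c1 + c2|| - (r1 + r2)) =", -(np.linalg.norm(c_sum) - r_sum))
```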

Citations: 0