Conjugate gradient methods are effective for large‐scale unconstrained optimization problems owing to their simple computations and low memory requirements. The Perry conjugate gradient method is considered one of the most efficient methods for unconstrained minimization. However, a global convergence result for general functions has not yet been established for it. In this paper, an improved three‐term Perry‐type algorithm is proposed which automatically satisfies the sufficient descent property, independently of the accuracy of the line search strategy. Under the standard Wolfe line search and a modified secant condition, the proposed algorithm is globally convergent for general nonlinear functions without any convexity assumption. Numerical results, comparing the proposed algorithm with the Perry method for stability and with two modified Perry‐type conjugate gradient methods and two effective three‐term conjugate gradient methods on large‐scale problems of up to 300,000 dimensions, indicate that it is more efficient and reliable than the other methods on the test problems. Additionally, we apply it to some image restoration problems.
"An improved descent Perry‐type algorithm for large‐scale unconstrained nonconvex problems and applications to image restoration problems" by Xiaoliang Wang, Jian Lv, Na Xu. Numerical Linear Algebra with Applications, published 2024-07-13. DOI: 10.1002/nla.2577
Roel Van Beeumen, Lana Periša, Daniel Kressner, Chao Yang
We examine a method for solving an infinite‐dimensional tensor eigenvalue problem in which the infinite‐dimensional symmetric matrix exhibits a translational invariant structure. We formulate this type of problem from a numerical linear algebra point of view and describe how a power method applied to this matrix is used to obtain an approximation to the desired eigenvector. The infinite‐dimensional eigenvector is represented compactly by a translational invariant infinite Tensor Ring (iTR). Low‐rank approximation is used to keep the cost of subsequent power iterations bounded while preserving the iTR structure of the approximate eigenvector. We show how the averaged Rayleigh quotient of an iTR eigenvector approximation can be computed efficiently, and we introduce a projected residual to monitor its convergence. In the numerical examples, we illustrate that the norm of this projected iTR residual can also be used to automatically adjust the time step, ensuring accurate and rapid convergence of the power method.
"Solving a class of infinite‐dimensional tensor eigenvalue problems by translational invariant tensor ring approximations" by Roel Van Beeumen, Lana Periša, Daniel Kressner, Chao Yang. Numerical Linear Algebra with Applications, published 2024-07-10. DOI: 10.1002/nla.2573
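A finite-dimensional stand-in can illustrate the basic loop: a power method on I - tau*A targets the lowest eigenpair of a symmetric positive definite matrix A, with the Rayleigh quotient estimating the eigenvalue and a residual norm serving as the convergence monitor. This is only a sketch of the iteration structure; the paper's operator is infinite-dimensional, the iterate is an iTR, and the time-step adaptation is more sophisticated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
B = rng.standard_normal((n, n))
A = (B + B.T) / 2 + n * np.eye(n)        # symmetric positive definite test matrix
tau = 1.0 / np.linalg.norm(A, 2)         # "time step": eigenvalues of I - tau*A lie in [0, 1)

v = rng.standard_normal(n)
v /= np.linalg.norm(v)
for _ in range(20000):
    Av = A @ v
    rho = v @ Av                         # Rayleigh quotient estimate of the eigenvalue
    res = np.linalg.norm(Av - rho * v)   # residual used as the convergence monitor
    if res < 1e-9:
        break
    w = v - tau * Av                     # one power step on I - tau*A
    v = w / np.linalg.norm(w)

lam_min = np.linalg.eigvalsh(A)[0]       # reference: exact smallest eigenvalue
```

The dominant eigenvalue of I - tau*A corresponds to the smallest eigenvalue of A, so the normalized iterate converges to the lowest eigenvector while the residual norm tells us when to stop.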
This work considers the low‐rank approximation of a matrix that depends on a parameter varying in a compact set. Application areas that give rise to such problems include computational statistics and dynamical systems. Randomized algorithms are an increasingly popular approach for performing low‐rank approximation; they usually proceed by multiplying the matrix with random dimension reduction matrices (DRMs). Applying such algorithms directly to a parameter‐dependent matrix would involve different, independent DRMs for every parameter value, which is not only expensive but also leads to inherently non‐smooth approximations. In this work, we propose to use constant DRMs, that is, the matrix is multiplied with the same DRM for every parameter value. The resulting parameter‐dependent extensions of two popular randomized algorithms, the randomized singular value decomposition and the generalized Nyström method, are computationally attractive, especially when the matrix admits an affine linear decomposition with respect to the parameter. We perform a probabilistic analysis for both algorithms, deriving bounds on the expected value of the approximation error as well as failure probabilities when Gaussian random DRMs are used. Both the theoretical results and the numerical experiments show that the use of constant DRMs does not impair their effectiveness; our methods reliably return quasi‐best low‐rank approximations.
"Randomized low‐rank approximation of parameter‐dependent matrices" by Daniel Kressner, Hei Yin Lam. Numerical Linear Algebra with Applications, published 2024-07-09. DOI: 10.1002/nla.2576
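The constant-DRM idea is easy to demonstrate for the randomized SVD. In this sketch (names A0, A1, Omega are illustrative) the matrix depends affinely on a scalar parameter, A(t) = A0 + t*A1, and one Gaussian DRM is reused for every t; since rank(A(t)) never exceeds the sketch size here, the range sketch captures A(t) essentially exactly for all t.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, k, p = 100, 80, 10, 10             # sizes, rank of each term, oversampling
A0 = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))
A1 = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))
Omega = rng.standard_normal((n, k + p))  # ONE Gaussian DRM, reused for every t

def approx(Amat):
    Q, _ = np.linalg.qr(Amat @ Omega)    # range sketch with the constant DRM
    return Q @ (Q.T @ Amat)              # rank <= k+p approximation (HMT-style)

errs = [np.linalg.norm((A0 + t * A1) - approx(A0 + t * A1))
        for t in np.linspace(0.0, 1.0, 5)]
# rank(A(t)) <= 2k = k + p, so the fixed sketch spans the range of A(t)
# for every parameter value, up to rounding error.
```

Because the same Omega is used throughout, the map t -> approx(A(t)) inherits the smoothness of A(t), which is exactly what independent per-parameter DRMs would destroy.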
Directional interpolation is a fast and efficient compression technique for high‐frequency Helmholtz boundary integral equations, but requires a very large amount of storage in its original form. Algebraic recompression can significantly reduce the storage requirements and speed up the solution process accordingly. During the recompression process, weight matrices are required to correctly measure the influence of different basis vectors on the final result, and for highly accurate approximations, these weight matrices require more storage than the final compressed matrix. We present a compression method for the weight matrices and demonstrate that it introduces only a controllable error to the overall approximation. Numerical experiments show that the new method leads to a significant reduction in storage requirements.
"Memory‐efficient compression of 𝒟ℋ2‐matrices for high‐frequency Helmholtz problems" by Steffen Börm, Janne Henningsen. Numerical Linear Algebra with Applications, published 2024-07-05. DOI: 10.1002/nla.2575
The main aim of this paper is to develop a new algorithm for computing a nonnegative low multi‐rank approximation of a nonnegative tensor. Several nonnegative tensor factorizations and decompositions exist in the literature, and their approach is to enforce the nonnegativity constraints on the factors of the factorization or decomposition. In this paper, we instead impose the nonnegativity constraints on the tensor entries directly and compute a low‐rank approximation of the transformed tensor, where the transformation is given by a discrete Fourier transformation matrix, a discrete cosine transformation matrix, or a unitary transformation matrix. This strategy is particularly useful in imaging science, since nonnegative pixel values appear in the tensor entries and a low‐rank structure can be obtained in the transform domain. We propose an alternating projections algorithm for computing such a nonnegative low multi‐rank tensor approximation and establish the convergence of the proposed projection method. Numerical examples on multidimensional images demonstrate that the proposed method performs better than nonnegative low‐Tucker‐rank tensor approximation and the other nonnegative tensor factorizations and decompositions.
"Nonnegative low multi‐rank third‐order tensor approximation via transformation" by Guang‐Jing Song, Yexun Hu, Cobi Xu, Michael K. Ng. Numerical Linear Algebra with Applications, published 2024-07-05. DOI: 10.1002/nla.2574
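A matrix analogue conveys the alternating-projections idea (illustrative only; the paper works with transformed third-order tensors): alternate between the closest rank-r matrix, obtained by a truncated SVD, and the nonnegative orthant, obtained by entrywise clipping.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 40, 30, 4
X_true = rng.random((m, r)) @ rng.random((r, n))   # nonnegative and exactly rank r
X = X_true + 0.01 * rng.random((m, n))             # noisy observation, still nonnegative

def proj_rank(Y, r):
    """Projection onto matrices of rank at most r (truncated SVD)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

Y = X.copy()
for _ in range(100):
    # rank projection, then projection onto the nonnegative orthant
    Y = np.maximum(proj_rank(Y, r), 0.0)

rel_err = np.linalg.norm(Y - X_true) / np.linalg.norm(X_true)
```

Because the last operation is the nonnegativity projection, the final iterate is entrywise nonnegative, and for data close to the intersection of the two sets it stays within the noise level of the ground truth.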
We obtain an expression for the error in the approximation of matrix functions with rational Krylov methods, in the setting where the matrix is symmetric and the function admits an integral representation. The error expression is obtained by linking the matrix‐function error to the error in the approximate solution of shifted linear systems using the same rational Krylov subspace, and it can be exploited to derive both a priori and a posteriori error bounds. The error bounds generalize the ones given by Chen et al. for the Lanczos method for matrix functions. A technique that we employ in the rational Krylov context can also be applied to refine the bounds for the Lanczos case.
"Error bounds for the approximation of matrix functions with rational Krylov methods" by Igor Simunec. Numerical Linear Algebra with Applications, published 2024-07-02. DOI: 10.1002/nla.2571
Ulrich Langer, Richard Löscher, Olaf Steinbach, Huidong Yang
The purpose of this article is to investigate the effects of mass‐lumping in the finite element discretization of the reduced first‐order optimality system arising from a standard tracking‐type, distributed elliptic optimal control problem with regularization, involving a regularization (cost) parameter on which the solution depends. We show that mass‐lumping does not affect the error between the desired state and the computed finite element state, but leads to a Schur‐complement system that allows for a fast matrix‐by‐vector multiplication. We show that using the Schur‐complement preconditioned conjugate gradient method in a nested iteration setting leads to a solver of asymptotically optimal complexity. While the proposed approach is applicable independently of the regularity of the given target, our particular interest is in discontinuous desired states that do not belong to the state space. In that case the corresponding control has limited regularity, and the cost grows as the regularization parameter decreases. This motivates choosing the regularization parameter so as to balance the error against the maximal cost we are willing to accept.
This can be embedded into a nested iteration process on a sequence of refined finite element meshes in order to control the error and the cost simultaneously.
"Mass‐lumping discretization and solvers for distributed elliptic optimal control problems" by Ulrich Langer, Richard Löscher, Olaf Steinbach, Huidong Yang. Numerical Linear Algebra with Applications, published 2024-06-27. DOI: 10.1002/nla.2564
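Row-sum mass lumping itself is simple to illustrate. The sketch below uses the 1D P1 finite-element mass matrix on a uniform mesh (the paper's setting is more general): lumping replaces the tridiagonal mass matrix by a diagonal one, so applying its inverse is a single entrywise division while the total mass is preserved.

```python
import numpy as np

# Consistent 1D P1 mass matrix on a uniform mesh: tridiag(h/6, 4h/6, h/6).
n, h = 8, 0.125
M = np.zeros((n, n))
for i in range(n):
    M[i, i] = 4.0 * h / 6.0
    if i > 0:
        M[i, i - 1] = h / 6.0
    if i + 1 < n:
        M[i, i + 1] = h / 6.0

M_lumped = np.diag(M.sum(axis=1))        # row-sum lumping -> diagonal matrix

u = np.ones(n)
total_mass_consistent = u @ M @ u
total_mass_lumped = u @ M_lumped @ u     # equal: lumping preserves the total mass
x = u / np.diag(M_lumped)                # "solving" with the lumped matrix is O(n)
```

The diagonal structure is what makes the fast matrix-by-vector products in the Schur-complement system possible; the price is a consistency error that, as the abstract notes, does not degrade the error to the desired state in this setting.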
We provide rigorous theoretical bounds for Anderson acceleration (AA) that allow for approximate calculations when applied to solve linear problems. We show that, when the approximate calculations satisfy the provided error bounds, the convergence of AA is maintained while the computational time can be reduced. We also provide computable heuristic quantities, guided by the theoretical error bounds, which can be used to automate the tuning of accuracy while performing approximate calculations. For linear problems, using these heuristics to monitor the error introduced by approximate calculations, combined with a check on the monotonicity of the residual, ensures the convergence of the numerical scheme within a prescribed residual tolerance. Motivated by the theoretical studies, we propose a reduced variant of AA, which consists of projecting the least‐squares problem used to compute the Anderson mixing onto a subspace of reduced dimension. The dimensionality of this subspace adapts dynamically at each iteration, as prescribed by the computable heuristic quantities. We numerically assess the performance of AA with approximate calculations on (i) linear deterministic fixed‐point iterations arising from Richardson's scheme for solving linear systems with open‐source benchmark matrices and various preconditioners, and (ii) nonlinear deterministic fixed‐point iterations arising from nonlinear time‐dependent Boltzmann equations.
"Anderson acceleration with approximate calculations: Applications to scientific computing" by Massimiliano Lupo Pasini, M. Paul Laiu. Numerical Linear Algebra with Applications, published 2024-05-10. DOI: 10.1002/nla.2562
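A minimal AA(m) implementation for a linear fixed-point iteration x <- Gx + c shows the least-squares mixing that the reduced variant then projects onto a smaller subspace. This is a toy stand-in: the paper additionally allows approximate evaluations, error-monitoring heuristics, and an adaptively sized subspace, none of which are modeled here.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m_hist = 50, 5
B = rng.standard_normal((n, n))
G = 0.5 * B / np.linalg.norm(B, 2)       # contraction with ||G||_2 = 0.5
c = rng.standard_normal(n)
g = lambda x: G @ x + c                  # fixed point x* solves (I - G) x = c

x = np.zeros(n)
X_hist, F_hist = [], []                  # past iterates and residuals f = g(x) - x
for k in range(200):
    gx = g(x)
    f = gx - x
    if np.linalg.norm(f) < 1e-12:
        break
    X_hist.append(x)
    F_hist.append(f)
    if len(F_hist) > m_hist + 1:         # truncate the memory to m_hist steps
        X_hist.pop(0)
        F_hist.pop(0)
    if len(F_hist) == 1:
        x = gx                           # plain fixed-point (Picard) step
    else:
        dF = np.column_stack([F_hist[i + 1] - F_hist[i] for i in range(len(F_hist) - 1)])
        dX = np.column_stack([X_hist[i + 1] - X_hist[i] for i in range(len(X_hist) - 1)])
        # least-squares problem defining the Anderson mixing coefficients
        theta, *_ = np.linalg.lstsq(dF, f, rcond=None)
        x = gx - (dX + dF) @ theta       # Anderson step with damping beta = 1

residual = np.linalg.norm(g(x) - x)
```

The `lstsq` call is exactly the least-squares problem the paper's reduced variant shrinks: restricting `dF` and `dX` to a lower-dimensional subspace cuts the per-iteration cost while the heuristics keep the residual monotone.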
Recently, a regularized Kaczmarz method has been proposed to solve tensor recovery problems. In this article, we propose a sampling greedy average regularized Kaczmarz method. This method can be viewed as a block or mini‐batch version of the regularized Kaczmarz method, which is based on averaging several regularized Kaczmarz steps with a constant or adaptive extrapolated step size. Also, it is equipped with a sampling greedy strategy to select the working tensor slices from the sensing tensor. We prove that our new method converges linearly in expectation and show that the sampling greedy strategy can exhibit an accelerated convergence rate compared to the random sampling strategy. Numerical experiments are carried out to show the feasibility and efficiency of our new method on various signal/image recovery problems, including sparse signal recovery, image inpainting, and image deconvolution.
"A sampling greedy average regularized Kaczmarz method for tensor recovery" by Xiaoqing Zhang, Xiaofeng Guo, Jianyu Pan. Numerical Linear Algebra with Applications, published 2024-05-07. DOI: 10.1002/nla.2560
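A matrix analogue sketches the sampling-greedy averaged Kaczmarz idea (the paper works with tensor slices and a regularizer, which are not modeled here): each step greedily samples the rows with the largest current residuals, averages their Kaczmarz projections, and applies an adaptive extrapolated step size.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, batch = 200, 50, 10
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true                            # consistent linear system

x = np.zeros(n)
for _ in range(1000):
    r = A @ x - b
    if np.linalg.norm(r) < 1e-10 * np.linalg.norm(b):
        break
    idx = np.argsort(-np.abs(r))[:batch]  # greedy sampling: worst residual rows
    w = 1.0 / batch                       # uniform averaging weights
    num, d = 0.0, np.zeros(n)
    for i in idx:
        ai2 = A[i] @ A[i]
        num += w * r[i] ** 2 / ai2
        d += w * (r[i] / ai2) * A[i]      # averaged Kaczmarz direction
    alpha = num / (d @ d)                 # adaptive extrapolated step size
    x -= alpha * d

rel_res = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
```

When the sampled rows are nearly orthogonal the extrapolated step size approaches the batch size, so the averaged step behaves like applying all the individual projections at once rather than a damped fraction of them.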
The trust‐region subproblem (TRS) plays a vital role in numerical optimization, numerical linear algebra, and many other applications. It is known that the TRS may have multiple optimal solutions in the hard case. In [Carmon and Duchi, SIAM Rev., 62 (2020), pp. 395–436], a block Lanczos method was proposed to solve the TRS in the hard case, and the convergence of the optimal objective value was established. However, the convergence of the KKT error, as well as that of the approximate solution, was still unknown for this method. In this paper, we give a more detailed convergence analysis of the block Lanczos method for the TRS in the hard case. First, we improve the convergence speed of the approximate objective value. Second, we derive the speed at which the KKT error tends to zero. Third, we establish the convergence of the approximate solution and show theoretically that the projected TRS obtained from the block Lanczos method comes ever closer to the hard case as the block Lanczos process proceeds. Numerical experiments illustrate the effectiveness of our theoretical results.
"Convergence of the block Lanczos method for the trust‐region subproblem in the hard case" by Bo Feng, Gang Wu. Numerical Linear Algebra with Applications, published 2024-05-07. DOI: 10.1002/nla.2561