Parallel cyclic reduction of padded bordered almost block diagonal matrices
Enrico Bertolazzi, Davide Stocco
Journal of Computational and Applied Mathematics
Publication date: 2024-10-24
DOI: 10.1016/j.cam.2024.116331
URL: https://www.sciencedirect.com/science/article/pii/S037704272400579X
Citation count: 0
Abstract
Solving linear systems is a fundamental task in numerical analysis. Among linear system solvers, the cyclic reduction algorithm stands out because it parallelizes naturally. So far, cyclic reduction has been applied primarily to linear systems with almost block diagonal matrices. Some variants extend it to almost block diagonal matrices whose last block of rows introduces non-zero elements in the first group of columns. In this work, we extend cyclic reduction to matrices with additional non-zero elements below and to the right, without any restrictions. These matrices, called padded bordered almost block diagonal, arise from the discretization of optimal control problems with arbitrary boundary conditions and free parameters. They also appear in two-point boundary value problems with free parameters. The proposed algorithm is based on LU factorization and is designed to run in parallel on multi-threaded architectures. Its performance is assessed through numerical experiments with different matrix sizes and thread counts. The computation times and speedups obtained with the parallel implementation indicate that the algorithm is a robust option for solving padded bordered almost block diagonal linear systems. Furthermore, its structure accommodates other matrix factorization techniques, such as QR or SVD. This flexibility lets the algorithm be tailored to the requirements of a specific application.
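The abstract does not spell out the reduction steps. As a rough illustration of the underlying idea only, the sketch below applies classic cyclic reduction to a scalar tridiagonal system, a much simpler setting than the padded bordered almost block diagonal case treated in the paper. The function name `cyclic_reduction` and the four-list storage layout are assumptions made for this example. The authors' algorithm operates on blocks through LU factorizations and handles the border and padding, none of which is shown here.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive cyclic reduction.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].
    a[0] and c[-1] are expected to be zero. No pivoting is performed,
    so this sketch assumes a well-conditioned (e.g. diagonally
    dominant) matrix.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: fold each odd-indexed row into its even-indexed
    # neighbours. The iterations do not depend on each other, which is
    # the property that makes the method attractive for parallel runs.
    ra, rb, rc, rd = [], [], [], []
    for i in range(0, n, 2):
        has_prev = i > 0
        has_next = i + 1 < n
        alpha = a[i] / b[i - 1] if has_prev else 0.0
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1] if has_prev else 0.0)
        rc.append(-gamma * c[i + 1] if has_next else 0.0)
        diag = b[i]
        rhs = d[i]
        if has_prev:
            diag -= alpha * c[i - 1]
            rhs -= alpha * d[i - 1]
        if has_next:
            diag -= gamma * a[i + 1]
            rhs -= gamma * d[i + 1]
        rb.append(diag)
        rd.append(rhs)

    # The reduced system couples only the even-indexed unknowns and is
    # again tridiagonal, so the same procedure applies recursively.
    x_even = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitution: recover each odd-indexed unknown from its own row.
    x = [0.0] * n
    x[0::2] = x_even
    for i in range(1, n, 2):
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - a[i] * x[i - 1] - right) / b[i]
    return x


if __name__ == "__main__":
    # 5x5 system with 4 on the diagonal and 1 on both off-diagonals;
    # the right-hand side is built so that x is close to [1, 2, 3, 4, 5].
    x = cyclic_reduction(
        [0.0, 1.0, 1.0, 1.0, 1.0],
        [4.0, 4.0, 4.0, 4.0, 4.0],
        [1.0, 1.0, 1.0, 1.0, 0.0],
        [6.0, 12.0, 18.0, 24.0, 24.0],
    )
    print(x)
```

Each level halves the number of unknowns, so the recursion depth grows logarithmically with the system size. In a threaded version, the reduction loop and the back-substitution loop are the natural places to distribute work, since their iterations are mutually independent.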