
ACM Transactions on Mathematical Software: Latest Articles

Algorithm XXX: Concurrent Alternating Least Squares for multiple simultaneous Canonical Polyadic Decompositions
IF 2.7, Zone 1 (Mathematics), Q1 Mathematics, Pub Date: 2022-04-29, DOI: 10.1145/3519383
C. Psarras, L. Karlsson, R. Bro, P. Bientinesi
Tensor decompositions, such as CANDECOMP/PARAFAC (CP), are widely used in a variety of applications, such as chemometrics, signal processing, and machine learning. A broadly used method for computing such decompositions relies on the Alternating Least Squares (ALS) algorithm. When the number of components is small, regardless of its implementation, ALS exhibits low arithmetic intensity, which severely hinders its performance and makes GPU offloading ineffective. We observe that, in practice, experts often have to compute multiple decompositions of the same tensor, each with a small number of components (typically fewer than 20), to ultimately find the best ones to use for the application at hand. In this paper, we illustrate how multiple decompositions of the same tensor can be fused together at the algorithmic level to increase the arithmetic intensity. Therefore, it becomes possible to make efficient use of GPUs for further speedups; at the same time the technique is compatible with many enhancements typically used in ALS, such as line search, extrapolation, and non-negativity constraints. We introduce the Concurrent ALS algorithm and library, which offers an interface to MATLAB, and a mechanism to effectively deal with the issue that decompositions complete at different times. Experimental results on artificial and real datasets demonstrate a shorter time to completion due to increased arithmetic intensity.
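The fusion idea can be illustrated on the main kernel of CP-ALS, the matricized-tensor times Khatri-Rao product (MTTKRP). The sketch below is not the authors' library or interface; it is a minimal pure-Python illustration, under the assumption that two models of the same tensor are fused simply by concatenating their factor matrices column-wise. The helper names `mttkrp_mode1` and `hconcat` and the toy data are hypothetical.

```python
def mttkrp_mode1(X, B, C):
    """Mode-1 MTTKRP: M[i][r] = sum over j,k of X[i][j][k] * B[j][r] * C[k][r]."""
    I, J, K = len(X), len(X[0]), len(X[0][0])
    R = len(B[0])
    M = [[0] * R for _ in range(I)]
    for i in range(I):
        for j in range(J):
            for k in range(K):
                x = X[i][j][k]          # each tensor entry is read once ...
                for r in range(R):      # ... and reused for every component
                    M[i][r] += x * B[j][r] * C[k][r]
    return M


def hconcat(*mats):
    """Concatenate matrices (lists of rows) column-wise."""
    return [sum((list(m[i]) for m in mats), []) for i in range(len(mats[0]))]


# Toy 2 x 2 x 2 tensor and two models with 1 and 2 components (hypothetical data).
X = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
B1, C1 = [[1], [2]], [[3], [1]]
B2, C2 = [[1, 0], [2, 1]], [[0, 1], [1, 1]]

# Separate: the tensor is traversed once per model.
separate = hconcat(mttkrp_mode1(X, B1, C1), mttkrp_mode1(X, B2, C2))

# Fused: the tensor is traversed once for all models together.
fused = mttkrp_mode1(X, hconcat(B1, B2), hconcat(C1, C2))
```

Because every column of the result depends only on the matching columns of the factor matrices, the fused call returns the same numbers while performing more arithmetic per tensor entry loaded, which is the sense in which fusion raises arithmetic intensity.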
Citations: 3
Algorithm xxx: Restoration of function by integrals with cubic integral smoothing spline in R
IF 2.7, Zone 1 (Mathematics), Q1 Mathematics, Pub Date: 2022-03-30, DOI: 10.1145/3519384
Yu. D. Korablev
In this paper, a cubic integral smoothing spline with roughness penalty for restoring a function by integrals is described. A mathematical method for building such a spline is described in detail. The method is based on cubic integral spline with a penalty function, which minimizes the sum of squares of the difference between the observed integrals of the unknown function and the integrals of the spline being constructed, plus an additional penalty for the nonlinearity (roughness) of the spline. This method has a matrix form, and this paper shows in detail how to fill in each matrix. The parameter α governs the desired smoothness of the restored function. Spline knots can be chosen independently of observations, and a weight can be defined for each observation for more control over the resulting spline shape. An implementation in the R language as function int_spline is given. The function int_spline is easy to use, with all arguments completely described and corresponding examples given. An example of the application of the method in rare event analysis and forecasting is given.
Citations: 1
Algorithm xxx: Efficient algorithms for computing a rank-revealing UTV factorization on parallel computing architectures
IF 2.7, Zone 1 (Mathematics), Q1 Mathematics, Pub Date: 2022-03-25, DOI: 10.1145/3507466
N. Heavner, F. D. Igual, G. Quintana-Ortí, P.G. Martinsson
The randomized singular value decomposition (RSVD) is by now a well-established technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin the RSVD, the recently proposed algorithm “randUTV” computes a full factorization of a given matrix that provides low-rank approximations with near-optimal error. Because the bulk of randUTV is cast in terms of communication-efficient operations like matrix-matrix multiplication and unpivoted QR factorizations, it is faster than competing rank-revealing factorization methods like column-pivoted QR in most high-performance computational settings. In this article, optimized randUTV implementations are presented for both shared-memory and distributed-memory computing environments. For shared memory, randUTV is redesigned in terms of an algorithm-by-blocks that, together with a runtime task scheduler, eliminates bottlenecks from data synchronization points to achieve acceleration over the standard blocked algorithm, based on a purely fork-join approach. The distributed-memory implementation is based on the ScaLAPACK library. The performance of our new codes compares favorably with competing factorizations available on both shared-memory and distributed-memory architectures.
Citations: 2
Algorithm xxx: Spherical Triangle Algorithm: A Fast Oracle for Convex Hull Membership Queries
IF 2.7, Zone 1 (Mathematics), Q1 Mathematics, Pub Date: 2022-03-07, DOI: 10.1145/3516520
B. Kalantari, Yikai Zhang
The Convex Hull Membership (CHM) problem tests whether \(p \in \mathrm{conv}(S)\), where \(p\) and the \(n\) points of \(S\) lie in \(\mathbb{R}^m\). CHM finds applications in Linear Programming, Computational Geometry, and Machine Learning. The previously developed Triangle Algorithm (TA) computes, in \(O(1/\varepsilon^2)\) iterations, a point \(p' \in \mathrm{conv}(S)\) that is either an \(\varepsilon\)-approximate solution or a witness certifying \(p \notin \mathrm{conv}(S)\). We first prove the equivalence of exact and approximate versions of CHM and Spherical-CHM, where \(p = 0\) and \(\Vert v \Vert = 1\) for each \(v\) in \(S\). If, for some \(M \ge 1\), every non-witness \(p'\) with \(\Vert p' \Vert > \varepsilon\) admits \(v \in S\) satisfying \(\Vert p' - v \Vert \ge \sqrt{1 + \varepsilon/M}\), we prove that the number of iterations improves to \(O(M/\varepsilon)\), and \(M \le 1/\varepsilon\) always holds. The equivalence of Spherical-CHM and the Minimum Enclosing Ball (MEB) problem suggests that MEB algorithms could be applied to Spherical-CHM. However, we prove that a \((1+\varepsilon)\)-approximation in MEB is only an \(\Omega(\sqrt{\varepsilon})\)-approximation in Spherical-CHM. Thus, even \(O(1/\varepsilon)\)-iteration MEB algorithms are not superior to Spherical-TA. A similar weakness is proved for MEB core sets. Spherical-TA also yields a variant of the All Vertex Triangle Algorithm (AVTA) for computing all vertices of \(\mathrm{conv}(S)\). Extensive computations on a range of problems show that TA and Spherical-TA are generally more efficient than algorithms such as Frank-Wolfe, MEB, and LP-Solver.
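A minimal sketch of the basic (non-spherical) Triangle Algorithm may help make the iteration concrete. It is not the authors' implementation: the function name `triangle_algorithm`, the stopping rule on the absolute distance, the "first pivot found" selection rule, and the test square are assumptions made for illustration. At each step the current iterate moves to the point of the segment towards a pivot that is nearest to the query point; if no pivot exists, the iterate is a witness that the query point lies outside the hull.

```python
import math


def triangle_algorithm(S, p, eps=1e-2, max_iter=100_000):
    """Decide approximately whether p lies in conv(S).

    Returns ("inside", p') when ||p' - p|| <= eps, ("outside", p') when p'
    is a witness, and ("undecided", p') if the iteration budget runs out.
    """
    cur = list(min(S, key=lambda v: math.dist(v, p)))  # start at the nearest point of S
    for _ in range(max_iter):
        if math.dist(cur, p) <= eps:
            return "inside", cur
        # A pivot is a point v of S that is at least as far from cur as from p.
        pivot = next((v for v in S if math.dist(cur, v) >= math.dist(p, v)), None)
        if pivot is None:
            return "outside", cur  # cur is a witness: every v is closer to cur than to p
        # Move to the point on the segment [cur, pivot] that is nearest to p.
        d = [vi - ci for vi, ci in zip(pivot, cur)]
        t = sum((pi - ci) * di for pi, ci, di in zip(p, cur, d)) / sum(di * di for di in d)
        t = min(1.0, max(0.0, t))
        cur = [ci + t * di for ci, di in zip(cur, d)]
    return "undecided", cur


square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
status_in, _ = triangle_algorithm(square, (0.5, 0.5), eps=0.05)
status_out, _ = triangle_algorithm(square, (2.0, 2.0))
```

The spherical variant described in the abstract additionally assumes the query point is the origin and every point of the set has unit norm; that normalization is not performed here.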
Citations: 5
pylspack: Parallel Algorithms and Data Structures for Sketching, Column Subset Selection, Regression, and Leverage Scores
IF 2.7, Zone 1 (Mathematics), Q1 Mathematics, Pub Date: 2022-03-05, DOI: 10.1145/3555370
Aleksandros Sobczyk, Efstratios Gallopoulos
We present parallel algorithms and data structures for three fundamental operations in Numerical Linear Algebra: (i) Gaussian and CountSketch random projections and their combination, (ii) computation of the Gram matrix, and (iii) computation of the squared row norms of the product of two matrices, with a special focus on “tall-and-skinny” matrices, which arise in many applications. We provide a detailed analysis of the ubiquitous CountSketch transform and its combination with Gaussian random projections, accounting for memory requirements, computational complexity and workload balancing. We also demonstrate how these results can be applied to column subset selection, least squares regression and leverage scores computation. These tools have been implemented in pylspack, a publicly available Python package whose core is written in C++ and parallelized with OpenMP and that is compatible with standard matrix data structures of SciPy and NumPy. Extensive numerical experiments indicate that the proposed algorithms scale well and significantly outperform existing libraries for tall-and-skinny matrices.
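For readers unfamiliar with CountSketch, the following serial sketch shows what the transform does to a tall-and-skinny matrix, together with a plain Gram-matrix computation. It does not use pylspack and does not reflect its API or its parallel data structures; the function names `countsketch` and `gram`, the seed handling, and the test matrix are hypothetical choices for illustration.

```python
import random


def countsketch(A, m, seed=0):
    """Apply a CountSketch to the rows of A (n x d), returning an m x d matrix.

    Each input row is multiplied by a random sign and added to one randomly
    chosen output row, so the whole sketch costs a single pass over A.
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    buckets = [rng.randrange(m) for _ in range(n)]   # target row for each input row
    signs = [rng.choice((-1, 1)) for _ in range(n)]  # random sign for each input row
    SA = [[0] * d for _ in range(m)]
    for row, b, s in zip(A, buckets, signs):
        for j, a in enumerate(row):
            SA[b][j] += s * a
    return SA, buckets, signs


def gram(A):
    """Gram matrix A^T A of a matrix stored as a list of rows."""
    d = len(A[0])
    return [[sum(row[i] * row[j] for row in A) for j in range(d)] for i in range(d)]


# A tall-and-skinny integer matrix (1000 x 3) and its 50 x 3 sketch.
A = [[i % 7, (2 * i) % 5, 1] for i in range(1000)]
SA, buckets, signs = countsketch(A, 50)
G = gram(A)
```

The sketch has far fewer rows than the input, so downstream steps such as regression or leverage-score estimation can operate on `SA` instead of `A`; how accurate that is depends on the sketch size, which this example does not analyze.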
Citations: 3
Algorithm 1035: A Gradient-based Implementation of the Polyhedral Active Set Algorithm
IF 2.7, Zone 1 (Mathematics), Q1 Mathematics, Pub Date: 2022-02-10, DOI: 10.1145/3583559
W. Hager, Hongchao Zhang
The Polyhedral Active Set Algorithm (PASA) is designed to optimize a general nonlinear function over a polyhedron. Phase one of the algorithm is a nonmonotone gradient projection algorithm, while phase two is an active set algorithm that explores faces of the constraint polyhedron. A gradient-based implementation is presented, where a projected version of the conjugate gradient algorithm is employed in phase two. Asymptotically, only phase two is performed. Comparisons are given with IPOPT using polyhedral-constrained problems from CUTEst and the Maros/Meszaros quadratic programming test set.
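As a rough illustration of what a gradient projection step over a polyhedron looks like, the sketch below minimizes a quadratic over a box, the simplest polyhedron, where projection reduces to clipping. This is a deliberately simplified stand-in: it uses a fixed step size rather than the nonmonotone scheme of PASA's phase one, it has no active set phase, and the names `project_box` and `projected_gradient` and the test problem are hypothetical.

```python
def project_box(x, lo, hi):
    """Euclidean projection onto the box {x : lo <= x <= hi}, done by clipping."""
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]


def projected_gradient(grad, x0, lo, hi, step=0.25, iters=100):
    """Fixed-step gradient projection: x <- P(x - step * grad(x))."""
    x = project_box(x0, lo, hi)
    for _ in range(iters):
        g = grad(x)
        x = project_box([xi - step * gi for xi, gi in zip(x, g)], lo, hi)
    return x


def grad(x):
    """Gradient of f(x, y) = (x - 2)^2 + (y + 1)^2."""
    return [2.0 * (x[0] - 2.0), 2.0 * (x[1] + 1.0)]


# The unconstrained minimizer (2, -1) lies outside the unit box,
# so the constrained minimizer is the corner (1, 0).
x_star = projected_gradient(grad, [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
```

At the solution both bound constraints are active, which is the kind of face information an active set phase would then exploit.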
Citations: 2
hIPPYlib-MUQ: A Bayesian Inference Software Framework for Integration of Data with Complex Predictive Models under Uncertainty
IF 2.7, Zone 1 (Mathematics), Q1 Mathematics, Pub Date: 2021-12-01, DOI: 10.1145/3580278
Ki-tae Kim, U. Villa, M. Parno, Y. Marzouk, O. Ghattas, N. Petra
Bayesian inference provides a systematic framework for integration of data with mathematical models to quantify the uncertainty in the solution of the inverse problem. However, the solution of Bayesian inverse problems governed by complex forward models described by partial differential equations (PDEs) remains prohibitive with black-box Markov chain Monte Carlo (MCMC) methods. We present hIPPYlib-MUQ, an extensible and scalable software framework that contains implementations of state-of-the-art algorithms aimed to overcome the challenges of high-dimensional, PDE-constrained Bayesian inverse problems. These algorithms accelerate MCMC sampling by exploiting the geometry and intrinsic low-dimensionality of parameter space via derivative information and low rank approximation. The software integrates two complementary open-source software packages, hIPPYlib and MUQ. hIPPYlib solves PDE-constrained inverse problems using automatically-generated adjoint-based derivatives, but it lacks full Bayesian capabilities. MUQ provides a spectrum of powerful Bayesian inversion models and algorithms, but expects forward models to come equipped with gradients and Hessians to permit large-scale solution. By combining these two complementary libraries, we created a robust, scalable, and efficient software framework that realizes the benefits of each and allows us to tackle complex large-scale Bayesian inverse problems across a broad spectrum of scientific and engineering disciplines. To illustrate the capabilities of hIPPYlib-MUQ, we present a comparison of a number of MCMC methods available in the integrated software on several high-dimensional Bayesian inverse problems. These include problems characterized by both linear and nonlinear PDEs, various noise models, and different parameter dimensions. The results demonstrate that large (∼50×) speedups over conventional black-box and gradient-based MCMC algorithms can be obtained by exploiting Hessian information (from the log-posterior), underscoring the power of the integrated hIPPYlib-MUQ framework.
Citations: 13
Computing with B-series
IF 2.7, Zone 1 (Mathematics), Q1 Mathematics, Pub Date: 2021-11-23, DOI: 10.1145/3573384
D. Ketcheson, Hendrik Ranocha
We present BSeries.jl, a Julia package for the computation and manipulation of B-series, which are a versatile theoretical tool for understanding and designing discretizations of differential equations. We give a short introduction to the theory of B-series and associated concepts and provide examples of their use, including method composition and backward error analysis. The associated software is highly performant and makes it possible to work with B-series of high order.
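One standard use of B-series is deriving order conditions for Runge-Kutta methods: a method has order p when its elementary weight matches 1/γ(t) for every rooted tree t with at most p vertices, where γ(t) is the density of the tree. The sketch below checks this by hand for the four trees of order at most three and the classical fourth-order Runge-Kutta method. It is written in Python with exact fractions and does not use BSeries.jl or mirror its interface; the tree labels and variable names are hypothetical.

```python
from fractions import Fraction as F

# Butcher tableau of the classical fourth-order Runge-Kutta method.
A = [
    [F(0), F(0), F(0), F(0)],
    [F(1, 2), F(0), F(0), F(0)],
    [F(0), F(1, 2), F(0), F(0)],
    [F(0), F(0), F(1), F(0)],
]
b = [F(1, 6), F(1, 3), F(1, 3), F(1, 6)]
c = [sum(row) for row in A]  # abscissae c_i = sum_j a_ij


def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(x * y for x, y in zip(u, v))


Ac = [dot(row, c) for row in A]

# Elementary weights for the rooted trees with at most three vertices.
elementary_weights = {
    "t": dot(b, [F(1)] * len(b)),           # single vertex
    "[t]": dot(b, c),                       # root with one child
    "[t,t]": dot(b, [ci * ci for ci in c]), # root with two children
    "[[t]]": dot(b, Ac),                    # chain of three vertices
}

# Densities gamma(t) of the same trees.
density = {"t": 1, "[t]": 2, "[t,t]": 3, "[[t]]": 6}

satisfied = {t: elementary_weights[t] == F(1, density[t]) for t in density}
```

Exact rational arithmetic is used so that each condition is checked as an identity rather than up to rounding error. A full fourth-order check would also need the four trees with four vertices, which are omitted here.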
Citations: 5
Computational Graphs for Matrix Functions
IF 2.7, Zone 1 (Mathematics), Q1 Mathematics, Pub Date: 2021-07-26, DOI: 10.1145/3568991
E. Jarlebring, M. Fasi, Emil Ringh
Many numerical methods for evaluating matrix functions can be naturally viewed as computational graphs. Rephrasing these methods as directed acyclic graphs (DAGs) is a particularly effective approach to study existing techniques, improve them, and eventually derive new ones. The accuracy of these matrix techniques can be characterized by the accuracy of their scalar counterparts, thus designing algorithms for matrix functions can be regarded as a scalar-valued optimization problem. The derivatives needed during the optimization can be calculated automatically by exploiting the structure of the DAG in a fashion analogous to backpropagation. This article describes GraphMatFun.jl, a Julia package that offers the means to generate and manipulate computational graphs, optimize their coefficients, and generate Julia, MATLAB, and C code to evaluate them efficiently at a matrix argument. The software also provides tools to estimate the accuracy of a graph-based algorithm and thus obtain numerically reliable methods. For the exponential, for example, using a particular form (degree-optimal) of polynomials produces implementations that in many cases are cheaper, in terms of computational cost, than the Padé-based techniques typically used in mathematical software. The optimized graphs and the corresponding generated code are available online.
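To make the computational-graph view concrete, the sketch below encodes a degree-2 Taylor approximation of the exponential as a small DAG built from multiplication and linear-combination nodes and evaluates it at a scalar argument, in the spirit of the scalar counterpart mentioned in the abstract. It is a Python illustration only, not GraphMatFun.jl; the node names, the dictionary encoding, and the `evaluate` helper are assumptions.

```python
import math


def evaluate(graph, output, x):
    """Evaluate a DAG at the scalar x.

    `graph` maps node names to (operation, arguments); nodes must be listed
    in topological order. "I" and "A" are the input nodes (identity, argument).
    """
    values = {"I": 1.0, "A": x}
    for name, (op, args) in graph.items():
        if op == "mult":        # product of two existing nodes
            u, v = args
            values[name] = values[u] * values[v]
        elif op == "lincomb":   # a*u + b*v for two existing nodes
            (a, u), (b, v) = args
            values[name] = a * values[u] + b * values[v]
        else:
            raise ValueError(f"unknown operation: {op}")
    return values[output]


# exp(x) is approximated by 1 + x + x^2 / 2, written as three nodes.
graph = {
    "A2": ("mult", ("A", "A")),
    "T1": ("lincomb", ((1.0, "I"), (1.0, "A"))),
    "P": ("lincomb", ((1.0, "T1"), (0.5, "A2"))),
}

approx = evaluate(graph, "P", 0.1)
error = abs(approx - math.exp(0.1))
```

In this encoding the coefficients inside the `lincomb` nodes are the quantities an optimizer would tune, and the number of `mult` nodes is a proxy for cost when the argument is a matrix.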
Citations: 1
A Geometric Multigrid Method for Space-Time Finite Element Discretizations of the Navier–Stokes Equations and its Application to 3D Flow Simulation
IF 2.7 Zone 1 (Mathematics) Q1 Mathematics Pub Date: 2021-07-22 DOI: 10.1145/3582492
Mathias Anselmann, M. Bause
We present a parallelized geometric multigrid (GMG) method, based on the cell-based Vanka smoother, for higher order space-time finite element methods (STFEM) to the incompressible Navier–Stokes equations. The STFEM is implemented as a time marching scheme. The GMG solver is applied as a preconditioner for generalized minimal residual iterations. Its performance properties are demonstrated for 2D and 3D benchmarks of flow around a cylinder. The key ingredients of the GMG approach are the construction of the local Vanka smoother over all degrees of freedom in time of the respective subinterval and its efficient application. For this, data structures that store pre-computed cell inverses of the Jacobian for all hierarchical levels and require only a reasonable amount of memory overhead are generated. The GMG method is built for the deal.II finite element library. The concepts are flexible and can be transferred to similar software platforms.
Citations: 9