SIAM Journal on Scientific Computing: Latest Articles

A New Provably Stable Weighted State Redistribution Algorithm
IF 3.1; Zone 2 (Mathematics); Q1 MATHEMATICS, APPLIED; Pub Date: 2024-09-04; DOI: 10.1137/23m1597484
Marsha Berger, Andrew Giuliani
SIAM Journal on Scientific Computing, Volume 46, Issue 5, Page A2848-A2873, October 2024.
Abstract. We propose a practical finite volume method on cut cells using state redistribution. Our algorithm is provably monotone, total variation diminishing, and GKS (Gustafsson, Kreiss, Sundström) stable in many situations, and shuts off continuously as the cut cell size approaches a target value. Our analysis reveals why original state redistribution works so well: it results in a monotone scheme for most configurations, though at times subject to a slightly smaller CFL condition. Our analysis also explains why a premerging step is beneficial. We show computational experiments in two and three dimensions.
Citations: 0
A Domain Decomposition–Based CNN-DNN Architecture for Model Parallel Training Applied to Image Recognition Problems
IF 3.1; Zone 2 (Mathematics); Q1 MATHEMATICS, APPLIED; Pub Date: 2024-09-04; DOI: 10.1137/23m1562202
Axel Klawonn, Martin Lanser, Janine Weber
SIAM Journal on Scientific Computing, Volume 46, Issue 5, Page C557-C582, October 2024.
Abstract. Deep neural networks (DNNs) and, in particular, convolutional neural networks (CNNs) have brought significant advances in a wide range of modern computer application problems. However, the increasing availability of large numbers of datasets and the increasing available computational power of modern computers have led to steady growth in the complexity and size of DNN and CNN models, respectively, and thus, to longer training times. Hence, various methods and attempts have been developed to accelerate and parallelize the training of complex network architectures. In this work, a novel CNN-DNN architecture is proposed that naturally supports a model parallel training strategy and that is loosely inspired by two-level domain decomposition methods (DDMs). First, local CNN models, that is, subnetworks, are defined that operate on overlapping or nonoverlapping parts of the input data, for example, subimages. The subnetworks can be trained completely in parallel and independently of each other. Each subnetwork then outputs a local decision for the given machine learning problem which is exclusively based on the respective local input data. Subsequently, in a second step, an additional DNN model is trained which evaluates the local decisions of the local subnetworks and generates a final, global decision. With respect to the analogy to DDMs, the DNN models can be loosely interpreted as a coarse problem and hence, the new approach can be interpreted as a two-level domain decomposition. In this paper, we apply the proposed architecture to image classification problems using CNNs. Experimental results are provided for different two-dimensional image classification problems, as well as for a face recognition problem and a classification problem for three-dimensional computed tomography (CT) scans. For these experiments, classical Residual Network (ResNet) and VGG architectures are considered. More modern architectures, such as MobileNet2, are left for future work. The results show that the proposed approach can significantly reduce the required training time compared to the global model and, additionally, can also help to improve the accuracy of the underlying classification problem.
Citations: 0
Generalizing Lloyd’s Algorithm for Graph Clustering
IF 3.1; Zone 2 (Mathematics); Q1 MATHEMATICS, APPLIED; Pub Date: 2024-09-03; DOI: 10.1137/23m1556800
Tareq Zaman, Nicolas Nytko, Ali Taghibakhshi, Scott MacLachlan, Luke Olson, Matthew West
SIAM Journal on Scientific Computing, Volume 46, Issue 5, Page A2819-A2847, October 2024.
Abstract. Clustering is a commonplace problem in many areas of data science, with applications in biology and bioinformatics, understanding chemical structure, image segmentation, building recommender systems, and many more fields. While there are many different clustering variants (based on given distance or graph structure, probability distributions, or data density), we consider here the problem of clustering nodes in a graph, motivated by the problem of aggregating discrete degrees of freedom in multigrid and domain decomposition methods for solving sparse linear systems. Specifically, we consider the challenge of forming balanced clusters in the graph of a sparse matrix for use in algebraic multigrid, although the algorithm has general applicability. Based on an extension of the Bellman–Ford algorithm, we generalize Lloyd’s algorithm for partitioning subsets of [math] to balance the number of nodes in each cluster; this is accompanied by a rebalancing algorithm that reduces the overall energy in the system. The algorithm provides control over the number of clusters and leads to “well centered” partitions of the graph. Theoretical results are provided to establish linear complexity, and numerical results in the context of algebraic multigrid highlight the benefits of improved clustering. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and data available” as a recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://github.com/lukeolson/paper-lloyd-data and in the supplementary materials (paper-lloyd-data.zip [88.1MB]).
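To make the idea of a Lloyd-type iteration on a graph concrete, here is a minimal Python sketch. It is not the authors' balanced algorithm: it assumes an undirected graph with positive edge weights, assigns each node to its nearest center with a multi-source Bellman–Ford sweep, and then moves each center to the cluster member with the smallest sum of squared shortest-path distances. The function names (`nearest_center`, `lloyd_graph`) and the recentering rule are illustrative assumptions, and no balancing or rebalancing step is included.

```python
from collections import defaultdict


def nearest_center(n, edges, centers):
    """Multi-source Bellman-Ford: shortest distance to, and index of, the nearest center."""
    dist = [float("inf")] * n
    cluster = [-1] * n
    for cid, c in enumerate(centers):
        dist[c] = 0.0
        cluster[c] = cid
    for _ in range(max(n - 1, 1)):  # at most n - 1 relaxation sweeps are needed
        changed = False
        for u, v, w in edges:
            for a, b in ((u, v), (v, u)):  # treat every edge as undirected
                if dist[a] + w < dist[b]:
                    dist[b] = dist[a] + w
                    cluster[b] = cluster[a]
                    changed = True
        if not changed:
            break
    return dist, cluster


def lloyd_graph(n, edges, centers, max_iter=20):
    """Alternate nearest-center assignment and recentering (assumed rule: graph medoid)."""
    centers = list(centers)
    for _ in range(max_iter):
        _, cluster = nearest_center(n, edges, centers)
        members = defaultdict(list)
        for node, cid in enumerate(cluster):
            members[cid].append(node)

        def energy(candidate, nodes):
            # Sum of squared shortest-path distances from the candidate to the cluster.
            d, _ = nearest_center(n, edges, [candidate])
            return sum(d[m] ** 2 for m in nodes)

        new_centers = [
            min(members[cid], key=lambda c, cid=cid: energy(c, members[cid]))
            for cid in range(len(centers))
        ]
        if new_centers == centers:
            break
        centers = new_centers
    _, cluster = nearest_center(n, edges, centers)  # labels consistent with final centers
    return centers, cluster


# Usage: a path graph 0-1-2-3-4-5 with unit weights and two initial centers.
path_edges = [(i, i + 1, 1.0) for i in range(5)]
final_centers, labels = lloyd_graph(6, path_edges, [0, 1])
```

This plain iteration controls the number of clusters but not their sizes, which is the gap the balancing described in the abstract is meant to address. The recentering here costs one shortest-path solve per candidate node, so it is only suitable for small illustrative graphs.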
Citations: 0
A Neural Network Approach for Stochastic Optimal Control
IF 3.1; Zone 2 (Mathematics); Q1 MATHEMATICS, APPLIED; Pub Date: 2024-09-03; DOI: 10.1137/23m155832x
Xingjian Li, Deepanshu Verma, Lars Ruthotto
SIAM Journal on Scientific Computing, Volume 46, Issue 5, Page C535-C556, October 2024.
Abstract. We present a neural network approach for approximating the value function of high-dimensional stochastic control problems. Our training process simultaneously updates our value function estimate and identifies the part of the state space likely to be visited by optimal trajectories. Our approach leverages insights from optimal control theory and the fundamental relation between semilinear parabolic partial differential equations and forward-backward stochastic differential equations. To focus the sampling on relevant states during neural network training, we use the stochastic Pontryagin maximum principle (PMP) to obtain the optimal controls for the current value function estimate. By design, our approach coincides with the method of characteristics for the nonviscous Hamilton–Jacobi–Bellman equation arising in deterministic control problems. Our training loss consists of a weighted sum of the objective functional of the control problem and penalty terms that enforce the HJB equations along the sampled trajectories. Importantly, training is unsupervised in that it does not require solutions of the control problem. Our numerical experiments highlight our scheme’s ability to identify the relevant parts of the state space and produce meaningful value estimates. Using a two-dimensional model problem, we demonstrate the importance of the stochastic PMP to inform the sampling and compare it to a finite element approach. With a nonlinear control affine quadcopter example, we illustrate that our approach can handle complicated dynamics. For a 100-dimensional benchmark problem, we demonstrate that our approach improves accuracy and time-to-solution, and, via a modification, we show the wider applicability of our scheme. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and data available” as recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://github.com/EmoryMLIP/NeuralSOC and in the supplementary material (NeuralSOC-main.zip [29.9MB]).
Citations: 0
Efficient GMRES+AMG on GPUs: Composite Smoothers and Mixed [math]-Cycles
IF 3.1; Zone 2 (Mathematics); Q1 MATHEMATICS, APPLIED; Pub Date: 2024-09-03; DOI: 10.1137/23m1578632
Stephen Thomas, Allison H. Baker
SIAM Journal on Scientific Computing, Ahead of Print.
Abstract. In this study, we introduce algorithms optimized for GPU architectures, aimed at efficiently solving large sparse linear systems, a central challenge in Navier–Stokes pressure projection problems. Our approach includes an adaptation of the GMRES algorithm, drawing inspiration from the merged vector operations first proposed by Bielich et al. [Parallel Comput., 112 (2022), 102940]. This adaptation increases computational intensity on GPU platforms through optimized vector update strategies. The algorithm incorporates modified and classical Gram–Schmidt methods with an algebraic multigrid (AMG) preconditioner, each tailored for GPU performance. A key innovation in our work is the development of a Gram–Schmidt projector [math] employing a rank-1 perturbation of the identity matrix. Designed to maximize the high memory bandwidth utilization of the AMD MI-250X GPU, this approach includes a strategy for treating the unit diagonal that minimizes memory reads, leading to a 25% increase in computational efficiency. The application of perturbation theory further ensures that orthogonality loss is limited to [math], where [math] is the number of iterations. Additionally, we introduce a mixed AMG [math]-cycle strategy combining ILU(0) and [math]-Jacobi smoothers, which achieves a 30–50% reduction in GPU compute times compared to conventional methods, while maintaining low backward error. This strategy, alongside our novel treatment of the diagonal in triangular matrices, marks a substantial increase in AMG efficiency for GPU systems. We believe that these contributions represent a significant advance in optimizing GMRES+AMG algorithms for GPU computations. The empirical results demonstrate notable speed increments and maintain rigorous backward error bounds, underscoring the potential of our methods to substantially increase computational efficiency in large-scale scientific applications.
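For readers unfamiliar with the two orthogonalization variants named in the abstract, the following Python sketch contrasts classical and modified Gram–Schmidt for a single GMRES-style step: orthogonalizing a new vector `w` against an existing orthonormal basis. It is a plain-Python illustration under simple assumptions (dense lists, an already orthonormal basis), not the authors' GPU implementation or their rank-1 projector; the function names are hypothetical.

```python
def dot(x, y):
    """Inner product of two equal-length vectors stored as lists."""
    return sum(a * b for a, b in zip(x, y))


def classical_gs(basis, w):
    """Classical Gram-Schmidt: every coefficient is computed from the original w.

    All inner products are independent of each other, so they can be fused
    into a single batched reduction.
    """
    coeffs = [dot(q, w) for q in basis]
    out = list(w)
    for h, q in zip(coeffs, basis):
        out = [o - h * qi for o, qi in zip(out, q)]
    return out, coeffs


def modified_gs(basis, w):
    """Modified Gram-Schmidt: each coefficient uses the partially updated vector.

    The inner products are sequentially dependent, giving one reduction per
    basis vector, but the result is usually closer to orthogonal in floating point.
    """
    out = list(w)
    coeffs = []
    for q in basis:
        h = dot(q, out)
        coeffs.append(h)
        out = [o - h * qi for o, qi in zip(out, q)]
    return out, coeffs


# Usage: remove the components of w along two orthonormal directions.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
w = [3.0, 4.0, 5.0]
w_cgs, h_cgs = classical_gs(basis, w)
w_mgs, h_mgs = modified_gs(basis, w)
```

In exact arithmetic both variants return the same vector. The trade-off between fewer synchronizing reductions (classical) and better retained orthogonality (modified) is the kind of tension that GPU-oriented GMRES variants try to balance.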
Citations: 0
Bounds on Nonlinear Errors for Variance Computation with Stochastic Rounding
IF 3.1; Zone 2 (Mathematics); Q1 MATHEMATICS, APPLIED; Pub Date: 2024-09-03; DOI: 10.1137/23m1563001
E-M. El Arar, D. Sohier, P. de Oliveira Castro, E. Petit
SIAM Journal on Scientific Computing, Volume 46, Issue 5, Page B579-B599, October 2024.
Abstract. The main objective of this work is to investigate nonlinear errors and pairwise summation using stochastic rounding (SR) in variance computation algorithms. We estimate the forward error of computations under SR through two methods: the first is based on a bound of the variance and the Bienaymé–Chebyshev inequality, while the second is based on martingales and the Azuma–Hoeffding inequality. The study shows that for pairwise summation, using SR results in a probabilistic bound of the forward error proportional to [math] rather than the deterministic bound in [math] when using the default rounding mode. We examine two algorithms that compute the variance, one called “textbook” and the other “two-pass,” which both exhibit nonlinear errors. Using the two methods mentioned above, we show that the forward errors of these algorithms have probabilistic bounds under SR in [math] instead of [math] for the deterministic bounds. We show that this advantage holds using pairwise summation for both textbook and two-pass, with probabilistic bounds of the forward error proportional to [math]. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and data available” as recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow the reader to reproduce the results in this paper are available at https://github.com/verificarlo/sr-non-linear-bounds and in the supplementary material (sr-non-linear-bounds-main.zip [8.62KB]).
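The "textbook" and "two-pass" variance algorithms and pairwise summation mentioned above can be sketched in a few lines of Python. This is an illustrative sketch only: it runs in ordinary double precision with the default round-to-nearest mode rather than stochastic rounding, and it assumes the population variance (division by n), which may differ from the normalization used in the paper. The function names are hypothetical.

```python
def pairwise_sum(xs):
    """Sum by recursively splitting the list in half and adding the two partial sums."""
    n = len(xs)
    if n == 0:
        return 0.0
    if n == 1:
        return xs[0]
    mid = n // 2
    return pairwise_sum(xs[:mid]) + pairwise_sum(xs[mid:])


def variance_textbook(xs):
    """Textbook formula: (sum of squares - (sum)^2 / n) / n.

    The subtraction of two large, nearly equal numbers makes this variant
    prone to cancellation when the mean is large relative to the spread.
    """
    n = len(xs)
    s = pairwise_sum(xs)
    s2 = pairwise_sum([x * x for x in xs])
    return (s2 - s * s / n) / n


def variance_two_pass(xs):
    """Two-pass formula: compute the mean first, then average the squared deviations."""
    n = len(xs)
    mean = pairwise_sum(xs) / n
    return pairwise_sum([(x - mean) ** 2 for x in xs]) / n


# Usage: data with a large mean and a small spread stresses the textbook formula.
data = [1e8 + k for k in (1.0, 2.0, 3.0, 4.0)]
v_textbook = variance_textbook(data)
v_two_pass = variance_two_pass(data)
```

Both formulas involve products and differences of rounded quantities, which is the source of the nonlinear rounding errors the abstract refers to. The example data are chosen so that the squared values exceed the range in which doubles represent integers exactly.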
Citations: 0
Inverse Source Problem of the Biharmonic Equation from Multifrequency Phaseless Data
IF 3.1; Zone 2 (Mathematics); Q1 MATHEMATICS, APPLIED; Pub Date: 2024-09-03; DOI: 10.1137/24m162889x
Yan Chang, Yukun Guo, Yue Zhao
SIAM Journal on Scientific Computing, Volume 46, Issue 5, Page A2799-A2818, October 2024.
Abstract. This work deals with an inverse source problem for the biharmonic wave equation. A two-stage numerical method is proposed to identify the unknown source from the multifrequency phaseless data. In the first stage, we introduce some artificial auxiliary point sources to the inverse source system and establish a phase retrieval formula. Theoretically, we point out that the phase can be uniquely determined and estimate the stability of this phase retrieval approach. Once the phase information is retrieved, the Fourier method is adopted to reconstruct the source function from the phased multifrequency data. The proposed method is easy to implement and there is no forward solver involved in the reconstruction. Numerical experiments are conducted to verify the performance of the proposed method.
Citations: 0
A Pseudoreversible Normalizing Flow for Stochastic Dynamical Systems with Various Initial Distributions
IF 3.1 Zone 2 Mathematics Q1 MATHEMATICS, APPLIED Pub Date: 2024-08-21 DOI: 10.1137/23m1585635
Minglei Yang, Pengjun Wang, Diego del-Castillo-Negrete, Yanzhao Cao, Guannan Zhang
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page C508-C533, August 2024.
Abstract. We present a pseudoreversible normalizing flow method for efficiently generating samples of the state of a stochastic differential equation (SDE) with various initial distributions. The primary objective is to construct an accurate and efficient sampler that can be used as a surrogate model for computationally expensive numerical integration of SDEs, such as those employed in particle simulation. After training, the normalizing flow model can directly generate samples of the SDE’s final state without simulating trajectories. The existing normalizing flow model for SDEs depends on the initial distribution, meaning the model needs to be retrained when the initial distribution changes. The main novelty of our normalizing flow model is that it can learn the conditional distribution of the state, i.e., the distribution of the final state conditional on any initial state, such that the model only needs to be trained once and the trained model can be used to handle various initial distributions. This feature can provide a significant computational saving in studies of how the final state varies with the initial distribution. Additionally, we propose to use a pseudoreversible network architecture to define the normalizing flow model, which has sufficient expressive power and training efficiency for a variety of SDEs in science and engineering, e.g., in particle physics. We provide a rigorous convergence analysis of the pseudoreversible normalizing flow model to the target probability density function in the Kullback–Leibler divergence metric. Numerical experiments are provided to demonstrate the effectiveness of the proposed normalizing flow model. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and data available” as a recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. 
Code and data that allow readers to reproduce the results in this paper are available at https://github.com/mlmathphy/PRNF and in the supplementary materials (PRNF-main.zip [27.4MB]).
Citations: 0
A Sketch-and-Select Arnoldi Process
IF 3.1 Zone 2 Mathematics Q1 MATHEMATICS, APPLIED Pub Date: 2024-08-21 DOI: 10.1137/23m1588007
Stefan Güttel, Igor Simunec
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2774-A2797, August 2024.
Abstract. A sketch-and-select Arnoldi process to generate a well-conditioned basis of a Krylov space at low cost is proposed. At each iteration the procedure utilizes randomized sketching to select a limited number of previously computed basis vectors to project out of the current basis vector. The computational cost grows linearly with the dimension of the Krylov space. The subset selection problem for the projection step is approximately solved with a number of heuristic algorithms and greedy methods used in statistical learning and compressive sensing. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and data available” as a recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://github.com/simunec/sketch-select-arnoldi and in the supplementary materials (sketch-select-arnoldi-main.zip [2.21MB]).
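The select-then-project idea in this abstract can be illustrated with a small sketch. This is not the authors' algorithm (their code is at the repository linked above); it is a simplified illustration under stated assumptions: a dense Gaussian sketch, a selection rule that keeps the k earlier vectors with the largest sketched inner product, and projection with ordinary inner products. The function name and all parameters (m, k, s, seed) are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def normalize(v):
    # No breakdown handling: assumes the vector is nonzero.
    nrm = math.sqrt(dot(v, v))
    return [a / nrm for a in v]


def sketch_select_arnoldi(A, b, m, k, s, seed=0):
    """Return m+1 unit-norm Krylov basis vectors for (A, b).

    Each new vector is projected against at most k earlier vectors,
    chosen by comparing sketches of length s (assumed selection rule).
    """
    rng = random.Random(seed)
    n = len(b)
    # Assumed sketch: dense s-by-n Gaussian matrix scaled by 1/sqrt(s).
    S = [[rng.gauss(0.0, 1.0) / math.sqrt(s) for _ in range(n)]
         for _ in range(s)]
    V = [normalize(b)]
    SV = [matvec(S, V[0])]  # sketches of the basis vectors, stored alongside V
    for j in range(m):
        w = matvec(A, V[j])
        Sw = matvec(S, w)
        # Score earlier vectors using only the short sketched vectors.
        scores = [abs(dot(SV[i], Sw)) for i in range(j + 1)]
        chosen = sorted(range(j + 1), key=lambda i: scores[i],
                        reverse=True)[:k]
        # Project out only the selected vectors, one after another.
        for i in chosen:
            h = dot(V[i], w)
            w = [wi - h * vi for wi, vi in zip(w, V[i])]
        V.append(normalize(w))
        SV.append(matvec(S, V[-1]))
    return V
```

Because only k vectors are projected out per step, the full-length work per iteration does not grow with the basis size, which matches the linear cost growth the abstract mentions. The resulting basis is generally not orthogonal.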
Citations: 0
A Fast Iterative PDE-Based Algorithm for Feedback Controls of Nonsmooth Mean-Field Control Problems
IF 3.1 Zone 2 Mathematics Q1 MATHEMATICS, APPLIED Pub Date: 2024-08-20 DOI: 10.1137/21m1441158
Christoph Reisinger, Wolfgang Stockinger, Yufei Zhang
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2737-A2773, August 2024.
Abstract. We propose a PDE-based accelerated gradient algorithm for optimal feedback controls of McKean–Vlasov dynamics that involve mean-field interactions both in the state and action. The method exploits a forward-backward splitting approach and iteratively refines the approximate controls based on the gradients of smooth costs, the proximal maps of nonsmooth costs, and dynamically updated momentum parameters. At each step, the state dynamics is approximated via a particle system, and the required gradient is evaluated through a coupled system of nonlocal linear PDEs. The latter is solved by finite difference approximation or neural network-based residual approximation, depending on the state dimension. We present exhaustive numerical experiments for low- and high-dimensional mean-field control problems, including sparse stabilization of stochastic Cucker–Smale models, which reveal that our algorithm captures important structures of the optimal feedback control and achieves a robust performance with respect to parameter perturbation.
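The iteration pattern named in this abstract (gradient step on the smooth cost, proximal step on the nonsmooth cost, dynamically updated momentum) can be sketched in finite dimensions. The block below is a generic accelerated proximal gradient loop of the FISTA type, not the paper's PDE-based method: there is no particle system or adjoint PDE here, and the function names, the step size, and the l1 toy problem are all assumptions chosen for illustration.

```python
import math


def soft_threshold(z, tau):
    """Proximal map of tau * |z| for a scalar z."""
    return math.copysign(max(abs(z) - tau, 0.0), z)


def accelerated_prox_gradient(grad_f, prox_g, x0, step, iters=200):
    """Minimize f(x) + g(x) with f smooth and g having a cheap proximal map.

    grad_f(y)       -> gradient of the smooth cost at y
    prox_g(z, step) -> proximal map of step * g at z
    step            -> constant step size, assumed <= 1 / Lipschitz(grad_f)
    """
    x = list(x0)
    y = list(x0)
    t = 1.0
    for _ in range(iters):
        g = grad_f(y)
        # Forward (gradient) step, then backward (proximal) step.
        x_new = prox_g([yi - step * gi for yi, gi in zip(y, g)], step)
        # Momentum parameter update (standard FISTA rule).
        t_new = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
        beta = (t - 1.0) / t_new
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x


# Toy problem (assumed): f(x) = 0.5 * ||x - b||^2, g(x) = lam * ||x||_1.
b = [3.0, -0.5, 1.2]
lam = 1.0


def grad_f(y):
    return [yi - bi for yi, bi in zip(y, b)]


def prox_g(z, step):
    return [soft_threshold(zi, step * lam) for zi in z]


x_hat = accelerated_prox_gradient(grad_f, prox_g, [0.0, 0.0, 0.0], step=0.5)
```

For this separable toy problem the exact minimizer is the soft-thresholding of b at level lam, namely (2, 0, 0.2), so x_hat can be compared against it. The middle entry is driven to zero, which is the kind of sparsity a nonsmooth cost promotes.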
Citations: 0