Shifan Zhao, Tianshi Xu, Hua Huang, Edmond Chow, Yuanzhe Xi
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2351-A2376, August 2024. Abstract. The spectrum of a kernel matrix significantly depends on the parameter values of the kernel function used to define the kernel matrix. This makes it challenging to design a preconditioner for a regularized kernel matrix that is robust across different parameter values. This paper proposes the adaptive factorized Nyström (AFN) preconditioner. The preconditioner is designed for the case where the rank [math] of the Nyström approximation is large, i.e., for kernel function parameters that lead to kernel matrices with eigenvalues that decay slowly. AFN deliberately chooses a well-conditioned submatrix to solve with and corrects a Nyström approximation with a factorized sparse approximate matrix inverse. This makes AFN efficient for kernel matrices with large numerical ranks. AFN also adaptively chooses the size of this submatrix to balance accuracy and cost. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and data available” as a recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://github.com/scalable-matrix/H2Pack/tree/AFN_precond and in the supplementary materials (H2Pack.zip [3.99MB]).
{"title":"An Adaptive Factorized Nyström Preconditioner for Regularized Kernel Matrices","authors":"Shifan Zhao, Tianshi Xu, Hua Huang, Edmond Chow, Yuanzhe Xi","doi":"10.1137/23m1565139","DOIUrl":"https://doi.org/10.1137/23m1565139","url":null,"abstract":"SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2351-A2376, August 2024. <br/> Abstract. The spectrum of a kernel matrix significantly depends on the parameter values of the kernel function used to define the kernel matrix. This makes it challenging to design a preconditioner for a regularized kernel matrix that is robust across different parameter values. This paper proposes the adaptive factorized Nyström (AFN) preconditioner. The preconditioner is designed for the case where the rank [math] of the Nyström approximation is large, i.e., for kernel function parameters that lead to kernel matrices with eigenvalues that decay slowly. AFN deliberately chooses a well-conditioned submatrix to solve with and corrects a Nyström approximation with a factorized sparse approximate matrix inverse. This makes AFN efficient for kernel matrices with large numerical ranks. AFN also adaptively chooses the size of this submatrix to balance accuracy and cost. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and data available” as a recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://github.com/scalable-matrix/H2Pack/tree/AFN_precond and in the supplementary materials (H2Pack.zip [3.99MB]).","PeriodicalId":49526,"journal":{"name":"SIAM Journal on Scientific Computing","volume":"42 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141719236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2324-A2350, August 2024. Abstract. For problems of time-harmonic scattering by rational polygonal obstacles, embedding formulae express the far-field pattern induced by any incident plane wave in terms of the far-field patterns for a relatively small (frequency-independent) set of canonical incident angles. Although these remarkable formulae are exact in theory, here we demonstrate that (i) they are highly sensitive to numerical errors in practice, and (ii) direct calculation of the coefficients in these formulae may be impossible for particular sets of canonical incident angles, even in exact arithmetic. Only by overcoming these practical issues can embedding formulae provide a highly efficient approach to computing the far-field pattern induced by a large number of incident angles. Here we address challenges (i) and (ii), supporting our theory with numerical experiments. Challenge (i) is solved using techniques from computational complex analysis: we reformulate the embedding formula as a complex contour integral and prove that this is much less sensitive to numerical errors. In practice, this contour integral can be efficiently evaluated by residue calculus. Challenge (ii) is addressed using techniques from numerical linear algebra: we oversample, considering more canonical incident angles than are necessary, thus expanding the set of valid coefficient vectors. The coefficient vector can then be selected using either a least squares approach or column subset selection.
{"title":"An Efficient Frequency-Independent Numerical Method for Computing the Far-Field Pattern Induced by Polygonal Obstacles","authors":"Andrew Gibbs, Stephen Langdon","doi":"10.1137/23m1612160","DOIUrl":"https://doi.org/10.1137/23m1612160","url":null,"abstract":"SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2324-A2350, August 2024. <br/> Abstract. For problems of time-harmonic scattering by rational polygonal obstacles, embedding formulae express the far-field pattern induced by any incident plane wave in terms of the far-field patterns for a relatively small (frequency-independent) set of canonical incident angles. Although these remarkable formulae are exact in theory, here we demonstrate that (i) they are highly sensitive to numerical errors in practice, and (ii) direct calculation of the coefficients in these formulae may be impossible for particular sets of canonical incident angles, even in exact arithmetic. Only by overcoming these practical issues can embedding formulae provide a highly efficient approach to computing the far-field pattern induced by a large number of incident angles. Here we address challenges (i) and (ii), supporting our theory with numerical experiments. Challenge (i) is solved using techniques from computational complex analysis: we reformulate the embedding formula as a complex contour integral and prove that this is much less sensitive to numerical errors. In practice, this contour integral can be efficiently evaluated by residue calculus. Challenge (ii) is addressed using techniques from numerical linear algebra: we oversample, considering more canonical incident angles than are necessary, thus expanding the set of valid coefficient vectors. The coefficient vector can then be selected using either a least squares approach or column subset selection.","PeriodicalId":49526,"journal":{"name":"SIAM Journal on Scientific Computing","volume":"2 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141719237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page B502-B526, August 2024. Abstract. For numerical approximations to stochastic differential equations using the Euler–Maruyama scheme, we propose incorporating approximate random variables computed using low precisions, such as single and half precision. We propose and justify a model for the rounding error incurred, and produce an average case error bound for two and four way differences, appropriate for regular and nested multilevel Monte Carlo estimations. Our rounding error model recovers and extends the statistical model by Arciniega and Allen [Stoch. Anal. Appl., 21 (2003), pp. 281–300], while bounding the size that systematic and biased rounding errors are permitted to be. By considering the variance structure of multilevel Monte Carlo correction terms in various precisions with and without a Kahan compensated summation, we compute the potential speed ups offered from the various precisions. We find single precision offers the potential for approximate speed improvements by a factor of 7 across a wide span of discretization levels. Half precision offers comparable improvements for several levels of coarse simulations, and even offers improvements by a factor of 10–12 for the very coarsest few levels, which are likely to dominate higher order methods such as the Milstein scheme.
{"title":"Rounding Error Using Low Precision Approximate Random Variables","authors":"Michael B. Giles, Oliver Sheridan-Methven","doi":"10.1137/23m1552814","DOIUrl":"https://doi.org/10.1137/23m1552814","url":null,"abstract":"SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page B502-B526, August 2024. <br/> Abstract. For numerical approximations to stochastic differential equations using the Euler–Maruyama scheme, we propose incorporating approximate random variables computed using low precisions, such as single and half precision. We propose and justify a model for the rounding error incurred, and produce an average case error bound for two and four way differences, appropriate for regular and nested multilevel Monte Carlo estimations. Our rounding error model recovers and extends the statistical model by Arciniega and Allen [Stoch. Anal. Appl., 21 (2003), pp. 281–300], while bounding the size that systematic and biased rounding errors are permitted to be. By considering the variance structure of multilevel Monte Carlo correction terms in various precisions with and without a Kahan compensated summation, we compute the potential speed ups offered from the various precisions. We find single precision offers the potential for approximate speed improvements by a factor of 7 across a wide span of discretization levels. Half precision offers comparable improvements for several levels of coarse simulations, and even offers improvements by a factor of 10–12 for the very coarsest few levels, which are likely to dominate higher order methods such as the Milstein scheme.","PeriodicalId":49526,"journal":{"name":"SIAM Journal on Scientific Computing","volume":"54 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141719238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2298-A2323, August 2024. Abstract. The ParaDiag family of algorithms solves differential equations by using preconditioners that can be inverted in parallel through diagonalization. In the context of optimal control of linear parabolic PDEs, the state-of-the-art ParaDiag method is limited to solving self-adjoint problems with a tracking objective. We propose three improvements to the ParaDiag method: the use of alpha-circulant matrices to construct an alternative preconditioner, a generalization of the algorithm for solving non-self-adjoint equations, and the formulation of an algorithm for terminal-cost objectives. We present novel analytic results about the eigenvalues of the preconditioned systems for all discussed ParaDiag algorithms in the case of self-adjoint equations, which proves the favorable properties of the alpha-circulant preconditioner. We use these results to perform a theoretical parallel-scaling analysis of ParaDiag for self-adjoint problems. Numerical tests confirm our findings and suggest that the self-adjoint behavior, which is backed by theory, generalizes to the non-self-adjoint case. We provide a sequential, open-source reference solver in MATLAB for all discussed algorithms. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and data available,” as a recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://gitlab.kuleuven.be/numa/public/pintopt or in the supplementary materials (repro-generalized-paradiag.zip [86.4KB]).
{"title":"On Generalized Preconditioners for Time-Parallel Parabolic Optimal Control","authors":"Arne Bouillon, Giovanni Samaey, Karl Meerbergen","doi":"10.1137/23m1553194","DOIUrl":"https://doi.org/10.1137/23m1553194","url":null,"abstract":"SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2298-A2323, August 2024. <br/> Abstract. The ParaDiag family of algorithms solves differential equations by using preconditioners that can be inverted in parallel through diagonalization. In the context of optimal control of linear parabolic PDEs, the state-of-the-art ParaDiag method is limited to solving self-adjoint problems with a tracking objective. We propose three improvements to the ParaDiag method: the use of alpha-circulant matrices to construct an alternative preconditioner, a generalization of the algorithm for solving non-self-adjoint equations, and the formulation of an algorithm for terminal-cost objectives. We present novel analytic results about the eigenvalues of the preconditioned systems for all discussed ParaDiag algorithms in the case of self-adjoint equations, which proves the favorable properties of the alpha-circulant preconditioner. We use these results to perform a theoretical parallel-scaling analysis of ParaDiag for self-adjoint problems. Numerical tests confirm our findings and suggest that the self-adjoint behavior, which is backed by theory, generalizes to the non-self-adjoint case. We provide a sequential, open-source reference solver in MATLAB for all discussed algorithms. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and data available,” as a recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://gitlab.kuleuven.be/numa/public/pintopt or in the supplementary materials (repro-generalized-paradiag.zip [86.4KB]).","PeriodicalId":49526,"journal":{"name":"SIAM Journal on Scientific Computing","volume":"40 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141611967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2270-A2297, August 2024. Abstract. We present an extension of the summation-by-parts (SBP) framework to tensor-product spectral-element operators in collapsed coordinates. The proposed approach enables the construction of provably stable discretizations of arbitrary order which combine the geometric flexibility of unstructured triangular and tetrahedral meshes with the efficiency of sum-factorization algorithms. Specifically, a methodology is developed for constructing triangular and tetrahedral spectral-element operators of any order which possess the SBP property (i.e., satisfying a discrete analogue of integration by parts) as well as a tensor-product decomposition. Such operators are then employed within the context of discontinuous spectral-element methods based on nodal expansions collocated at the tensor-product quadrature nodes as well as modal expansions employing Proriol–Koornwinder–Dubiner polynomials, the latter approach resolving the time step limitation associated with the singularity of the collapsed coordinate transformation. Energy-stable formulations for curvilinear meshes are obtained using a skew-symmetric splitting of the metric terms, and a weight-adjusted approximation is used to efficiently invert the curvilinear modal mass matrix. The proposed schemes are compared to those using nontensorial multidimensional SBP operators and are found to offer comparable accuracy to such schemes in the context of smooth linear advection problems on curved meshes, but at a reduced computational cost for higher polynomial degrees. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and Data Available” as a recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://github.com/tristanmontoya/ReproduceSBPSimplex and in the supplementary materials (reproducibility_repository.zip [35.7MB]).
{"title":"Efficient Tensor-Product Spectral-Element Operators with the Summation-by-Parts Property on Curved Triangles and Tetrahedra","authors":"Tristan Montoya, David W. Zingg","doi":"10.1137/23m1573963","DOIUrl":"https://doi.org/10.1137/23m1573963","url":null,"abstract":"SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2270-A2297, August 2024. <br/> Abstract. We present an extension of the summation-by-parts (SBP) framework to tensor-product spectral-element operators in collapsed coordinates. The proposed approach enables the construction of provably stable discretizations of arbitrary order which combine the geometric flexibility of unstructured triangular and tetrahedral meshes with the efficiency of sum-factorization algorithms. Specifically, a methodology is developed for constructing triangular and tetrahedral spectral-element operators of any order which possess the SBP property (i.e., satisfying a discrete analogue of integration by parts) as well as a tensor-product decomposition. Such operators are then employed within the context of discontinuous spectral-element methods based on nodal expansions collocated at the tensor-product quadrature nodes as well as modal expansions employing Proriol–Koornwinder–Dubiner polynomials, the latter approach resolving the time step limitation associated with the singularity of the collapsed coordinate transformation. Energy-stable formulations for curvilinear meshes are obtained using a skew-symmetric splitting of the metric terms, and a weight-adjusted approximation is used to efficiently invert the curvilinear modal mass matrix. The proposed schemes are compared to those using nontensorial multidimensional SBP operators and are found to offer comparable accuracy to such schemes in the context of smooth linear advection problems on curved meshes, but at a reduced computational cost for higher polynomial degrees. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and Data Available” as a recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://github.com/tristanmontoya/ReproduceSBPSimplex and in the supplementary materials (reproducibility_repository.zip [35.7MB]).","PeriodicalId":49526,"journal":{"name":"SIAM Journal on Scientific Computing","volume":"15 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141611968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ziyuan Liu, Haifeng Wang, Hong Zhang, Kaijun Bao, Xu Qian, Songhe Song
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page C323-C348, August 2024. Abstract. By learning the mappings between infinite function spaces using carefully designed neural networks, the operator learning methodology has exhibited significantly more efficiency than traditional methods in solving differential equations, but faces concerns about their accuracy and reliability. To overcome these limitations through robustly enforcing boundary conditions (BCs), a general neural architecture named spectral operator learning is introduced by combining with the structures of the spectral numerical method. One variant called the orthogonal polynomial neural operator (OPNO) is proposed later, aiming at PDEs with Dirichlet, Neumann, and Robin BCs. The strict BC satisfaction properties and the universal approximation capacity of the OPNO are theoretically proven. A variety of numerical experiments with physical backgrounds demonstrate that the OPNO outperforms other existing deep learning methodologies, showcasing potential of comparable accuracy with the traditional second-order finite difference method that employs a considerably fine mesh (with relative errors on the order of [math]), and is up to almost 5 magnitudes faster over the traditional method. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and data available” as a recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://github.com/liu-ziyuan-math/spectral_operator_learning/tree/main/OPNO/Reproduce and in the supplementary materials (spectral_operator_learning-main.zip [669KB]).
{"title":"Render unto Numerics: Orthogonal Polynomial Neural Operator for PDEs with Nonperiodic Boundary Conditions","authors":"Ziyuan Liu, Haifeng Wang, Hong Zhang, Kaijun Bao, Xu Qian, Songhe Song","doi":"10.1137/23m1556320","DOIUrl":"https://doi.org/10.1137/23m1556320","url":null,"abstract":"SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page C323-C348, August 2024. <br/> Abstract. By learning the mappings between infinite function spaces using carefully designed neural networks, the operator learning methodology has exhibited significantly more efficiency than traditional methods in solving differential equations, but faces concerns about their accuracy and reliability. To overcome these limitations through robustly enforcing boundary conditions (BCs), a general neural architecture named spectral operator learning is introduced by combining with the structures of the spectral numerical method. One variant called the orthogonal polynomial neural operator (OPNO) is proposed later, aiming at PDEs with Dirichlet, Neumann, and Robin BCs. The strict BC satisfaction properties and the universal approximation capacity of the OPNO are theoretically proven. A variety of numerical experiments with physical backgrounds demonstrate that the OPNO outperforms other existing deep learning methodologies, showcasing potential of comparable accuracy with the traditional second-order finite difference method that employs a considerably fine mesh (with relative errors on the order of [math]), and is up to almost 5 magnitudes faster over the traditional method. Reproducibility of computational results. This paper has been awarded the “SIAM Reproducibility Badge: Code and data available” as a recognition that the authors have followed reproducibility principles valued by SISC and the scientific computing community. Code and data that allow readers to reproduce the results in this paper are available at https://github.com/liu-ziyuan-math/spectral_operator_learning/tree/main/OPNO/Reproduce and in the supplementary materials (spectral_operator_learning-main.zip [669KB]).","PeriodicalId":49526,"journal":{"name":"SIAM Journal on Scientific Computing","volume":"25 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141612071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lisa Gaedke-Merzhäuser, Elias Krainski, Radim Janalik, Håvard Rue, Olaf Schenk
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page B448-B473, August 2024. Abstract. Bayesian inference tasks continue to pose a computational challenge. This especially holds for spatiotemporal modeling, where high-dimensional latent parameter spaces are ubiquitous. The methodology of integrated nested Laplace approximations (INLA) provides a framework for performing Bayesian inference applicable to a large subclass of additive Bayesian hierarchical models. In combination with the stochastic partial differential equation (SPDE) approach, it gives rise to an efficient method for spatiotemporal modeling. In this work, we build on the INLA-SPDE approach by putting forward a performant distributed memory variant, INLADIST, for large-scale applications. To perform the arising computational kernel operations, consisting of Cholesky factorizations, solving linear systems, and selected matrix inversions, we present two numerical solver options: a sparse CPU-based library and a novel blocked GPU-accelerated approach which we propose. We leverage the recurring nonzero block structure in the arising precision (inverse covariance) matrices, which allows us to employ dense subroutines within a sparse setting. Both versions of INLADIST are highly scalable, capable of performing inference on models with millions of latent parameters. We demonstrate their accuracy and performance on synthetic as well as real-world climate dataset applications.
{"title":"Integrated Nested Laplace Approximations for Large-Scale Spatiotemporal Bayesian Modeling","authors":"Lisa Gaedke-Merzhäuser, Elias Krainski, Radim Janalik, Håvard Rue, Olaf Schenk","doi":"10.1137/23m1561531","DOIUrl":"https://doi.org/10.1137/23m1561531","url":null,"abstract":"SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page B448-B473, August 2024. <br/> Abstract. Bayesian inference tasks continue to pose a computational challenge. This especially holds for spatiotemporal modeling, where high-dimensional latent parameter spaces are ubiquitous. The methodology of integrated nested Laplace approximations (INLA) provides a framework for performing Bayesian inference applicable to a large subclass of additive Bayesian hierarchical models. In combination with the stochastic partial differential equation (SPDE) approach, it gives rise to an efficient method for spatiotemporal modeling. In this work, we build on the INLA-SPDE approach by putting forward a performant distributed memory variant, INLADIST, for large-scale applications. To perform the arising computational kernel operations, consisting of Cholesky factorizations, solving linear systems, and selected matrix inversions, we present two numerical solver options: a sparse CPU-based library and a novel blocked GPU-accelerated approach which we propose. We leverage the recurring nonzero block structure in the arising precision (inverse covariance) matrices, which allows us to employ dense subroutines within a sparse setting. Both versions of INLADIST are highly scalable, capable of performing inference on models with millions of latent parameters. We demonstrate their accuracy and performance on synthetic as well as real-world climate dataset applications.","PeriodicalId":49526,"journal":{"name":"SIAM Journal on Scientific Computing","volume":"32 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141612072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2248-A2269, August 2024. Abstract. This work considers charged systems described by the modified Poisson–Nernst–Planck (PNP) equations, which incorporate ionic steric effects and the Born solvation energy for dielectric inhomogeneity. Solving the equilibrium of modified PNP equations poses numerical challenges due to the emergence of sharp boundary layers caused by small Debye lengths, particularly when local ionic concentrations reach saturation. To address this, we first reformulate the problem as a constraint optimization, where the ionic concentrations on unstructured Delaunay nodes are treated as fractional particles moving along edges between nodes. The electric fields are then updated to minimize the objective free energy while satisfying the discrete Gauss law. We develop a local relaxation method on unstructured meshes that inherently respects the discrete Gauss law, ensuring curl-free electric fields. Numerical analysis demonstrates that the optimal mass of the moving fractional particles guarantees the positivity of both ionic and solvent concentrations. Additionally, the free energy of the charged system consistently decreases during successive updates of ionic concentrations and electric fields. We conduct numerical tests to validate the expected numerical accuracy, positivity, free-energy dissipation, and robustness of our method in simulating charged systems with sharp boundary layers.
{"title":"Local Structure-Preserving Relaxation Method for Equilibrium of Charged Systems on Unstructured Meshes","authors":"Zhonghua Qiao, Zhenli Xu, Qian Yin, Shenggao Zhou","doi":"10.1137/23m1607234","DOIUrl":"https://doi.org/10.1137/23m1607234","url":null,"abstract":"SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page A2248-A2269, August 2024. <br/> Abstract. This work considers charged systems described by the modified Poisson–Nernst–Planck (PNP) equations, which incorporate ionic steric effects and the Born solvation energy for dielectric inhomogeneity. Solving the equilibrium of modified PNP equations poses numerical challenges due to the emergence of sharp boundary layers caused by small Debye lengths, particularly when local ionic concentrations reach saturation. To address this, we first reformulate the problem as a constraint optimization, where the ionic concentrations on unstructured Delaunay nodes are treated as fractional particles moving along edges between nodes. The electric fields are then updated to minimize the objective free energy while satisfying the discrete Gauss law. We develop a local relaxation method on unstructured meshes that inherently respects the discrete Gauss law, ensuring curl-free electric fields. Numerical analysis demonstrates that the optimal mass of the moving fractional particles guarantees the positivity of both ionic and solvent concentrations. Additionally, the free energy of the charged system consistently decreases during successive updates of ionic concentrations and electric fields. We conduct numerical tests to validate the expected numerical accuracy, positivity, free-energy dissipation, and robustness of our method in simulating charged systems with sharp boundary layers.","PeriodicalId":49526,"journal":{"name":"SIAM Journal on Scientific Computing","volume":"56 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141611970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Scientific Computing, Ahead of Print. Abstract. Block floating point (BFP) arithmetic is currently seeing a resurgence in interest because it requires less power and less chip area and is less complicated to implement in hardware than standard floating point arithmetic. This paper explores the application of BFP to mixed- and progressive-precision multigrid methods, enabling the solution of linear elliptic partial differential equations (PDEs) in energy- and hardware-efficient integer arithmetic. While most existing applications of BFP arithmetic tend to use small block sizes, the block size here is chosen to be maximal such that matrices and vectors share a single exponent for all entries. This is sometimes also referred to as a scaled fixed point format. We provide algorithms for BLAS-like routines for BFP arithmetic that ensure exact vector-vector and matrix-vector computations up to a specified precision. Using these algorithms, we study the asymptotic precision requirements for achieving discretization-error-accuracy. We demonstrate that some computations can be performed using only 4-bit integers, while the number of bits required to attain a certain target accuracy is similar to that of standard floating point arithmetic. Finally, we present a heuristic for full multigrid in BFP arithmetic based on saturation and truncation that still achieves discretization-error-accuracy without the need for expensive normalization steps of intermediate results.
{"title":"Multigrid Methods Using Block Floating Point Arithmetic","authors":"Nils Kohl, Stephen F. McCormick, Rasmus Tamstorf","doi":"10.1137/23m1581819","DOIUrl":"https://doi.org/10.1137/23m1581819","url":null,"abstract":"SIAM Journal on Scientific Computing, Ahead of Print. <br/> Abstract. Block floating point (BFP) arithmetic is currently seeing a resurgence in interest because it requires less power and less chip area and is less complicated to implement in hardware than standard floating point arithmetic. This paper explores the application of BFP to mixed- and progressive-precision multigrid methods, enabling the solution of linear elliptic partial differential equations (PDEs) in energy- and hardware-efficient integer arithmetic. While most existing applications of BFP arithmetic tend to use small block sizes, the block size here is chosen to be maximal such that matrices and vectors share a single exponent for all entries. This is sometimes also referred to as a scaled fixed point format. We provide algorithms for BLAS-like routines for BFP arithmetic that ensure exact vector-vector and matrix-vector computations up to a specified precision. Using these algorithms, we study the asymptotic precision requirements for achieving discretization-error-accuracy. We demonstrate that some computations can be performed using only 4-bit integers, while the number of bits required to attain a certain target accuracy is similar to that of standard floating point arithmetic. Finally, we present a heuristic for full multigrid in BFP arithmetic based on saturation and truncation that still achieves discretization-error-accuracy without the need for expensive normalization steps of intermediate results.","PeriodicalId":49526,"journal":{"name":"SIAM Journal on Scientific Computing","volume":"37 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141611969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nick Wulbusch, Reinhild Roden, Matthias Blau, Alexey Chernov
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page B422-B447, August 2024. Abstract. We consider the problem of identifying the acoustic impedance of a wall surface from noisy pressure measurements in a closed room using a Bayesian approach. The room acoustics are modeled by the interior Helmholtz equation with impedance boundary conditions. The aim is to compute moments of the acoustic impedance to estimate a suitable density function of the impedance coefficient. For the computation of moments we use ratio estimators and Monte Carlo sampling. We consider two different experimental scenarios. In the first scenario, the noisy measurements correspond to a wall modeled by impedance boundary conditions. In this case, the Bayesian algorithm uses a model that is (up to the noise) consistent with the measurements and our algorithm is able to identify acoustic impedance with high accuracy. In the second scenario, the noisy measurements come from a coupled acoustic-structural problem, modeling a wall made of glass, whereas the Bayesian algorithm still uses a model with impedance boundary conditions. In this case, the parameter identification model is inconsistent with the measurements and therefore is not capable to represent them well. Nonetheless, for particular frequency bands the Bayesian algorithm identifies estimates with high likelihood. Outside these frequency bands the algorithm fails. We discuss the results of both examples and possible reasons for the failure of the latter case for particular frequency values.
{"title":"Bayesian Parameter Identification in Impedance Boundary Conditions for Helmholtz Problems","authors":"Nick Wulbusch, Reinhild Roden, Matthias Blau, Alexey Chernov","doi":"10.1137/23m1591517","DOIUrl":"https://doi.org/10.1137/23m1591517","url":null,"abstract":"SIAM Journal on Scientific Computing, Volume 46, Issue 4, Page B422-B447, August 2024. <br/> Abstract. We consider the problem of identifying the acoustic impedance of a wall surface from noisy pressure measurements in a closed room using a Bayesian approach. The room acoustics are modeled by the interior Helmholtz equation with impedance boundary conditions. The aim is to compute moments of the acoustic impedance to estimate a suitable density function of the impedance coefficient. For the computation of moments we use ratio estimators and Monte Carlo sampling. We consider two different experimental scenarios. In the first scenario, the noisy measurements correspond to a wall modeled by impedance boundary conditions. In this case, the Bayesian algorithm uses a model that is (up to the noise) consistent with the measurements and our algorithm is able to identify acoustic impedance with high accuracy. In the second scenario, the noisy measurements come from a coupled acoustic-structural problem, modeling a wall made of glass, whereas the Bayesian algorithm still uses a model with impedance boundary conditions. In this case, the parameter identification model is inconsistent with the measurements and therefore is not capable to represent them well. Nonetheless, for particular frequency bands the Bayesian algorithm identifies estimates with high likelihood. Outside these frequency bands the algorithm fails. We discuss the results of both examples and possible reasons for the failure of the latter case for particular frequency values.","PeriodicalId":49526,"journal":{"name":"SIAM Journal on Scientific Computing","volume":"23 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141585759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}