Pub Date: 2025-02-13 DOI: 10.1016/j.cpc.2025.109542
Lukas Merten, Sophie Aerdker
We present a new code that significantly extends CRPropa's capabilities to model the ensemble-averaged transport of charged cosmic rays in turbulent magnetic fields. Compared with previous implementations, the new version allows for spatially varying eigenvalues of the diffusion tensor and for the implementation of drifts associated with the magnetic background field. The software is based on solving a set of stochastic differential equations (SDEs).
In this work we give detailed instructions for transforming a transport equation, usually given as a partial differential equation, into a Fokker-Planck equation and further into the corresponding set of SDEs. We also test the algorithms in detail and compare different sources of uncertainty with each other. To some extent, this work therefore serves as a technical reference for existing and upcoming work using the new generalized SDE solver based on the CRPropa framework.
In addition, the new flexibility allowed us to implement first test cases on continuous particle injection and focused pitch angle diffusion.
Title: Modeling Cosmic-Ray Transport: A CRPropa based stochastic differential equation solver. Journal: Computer Physics Communications, volume 311, Article 109542.
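A minimal sketch of the PDE-to-SDE idea described in the abstract above, assuming a one-dimensional diffusion equation ∂n/∂t = ∂/∂x(κ(x) ∂n/∂x). Written in Fokker-Planck form, this corresponds to the Itô SDE dx = (dκ/dx) dt + sqrt(2κ) dW, which is integrated here with a plain Euler-Maruyama step. The function names, the 1D setting, and the integration scheme are illustrative assumptions; none of them is taken from CRPropa or from the paper.

```python
import math
import random


def simulate_pseudo_particles(kappa, dkappa_dx, x0, dt, n_steps, n_particles, seed=0):
    """Propagate pseudo-particles with Euler-Maruyama steps (illustrative only).

    kappa(x)      -- hypothetical diffusion coefficient
    dkappa_dx(x)  -- its spatial derivative, which acts as a drift term
    """
    rng = random.Random(seed)
    positions = []
    for _ in range(n_particles):
        x = x0
        for _ in range(n_steps):
            # Deterministic part: drift caused by the spatially varying kappa.
            drift = dkappa_dx(x) * dt
            # Stochastic part: Wiener increment scaled by sqrt(2 * kappa * dt).
            noise = math.sqrt(2.0 * kappa(x) * dt) * rng.gauss(0.0, 1.0)
            x += drift + noise
        positions.append(x)
    return positions


def variance(values):
    """Population variance of a list of floats."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)
```

For constant κ the drift vanishes and the ensemble variance should grow as 2κt, which gives a simple consistency check for such a solver.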
Pub Date: 2025-02-12 DOI: 10.1016/j.cpc.2025.109533
Yangshuai Wang
The accurate and efficient simulation of material systems with defects using atomistic-to-continuum (a/c) coupling methods is a significant focus in computational materials science. Achieving a balance between accuracy and computational cost requires the application of a posteriori error analysis and adaptive algorithms. In this paper, we provide a rigorous a posteriori error analysis for three common blended a/c methods: the blended energy-based quasi-continuum (BQCE) method, the blended force-based quasi-continuum (BQCF) method, and the atomistic/continuum blending with ghost force correction (BGFC) method. We discretize the Cauchy-Born model in the continuum region using first- and second-order finite element methods, with the potential for extending to higher-order schemes. The resulting error estimator provides both an upper bound on the true error and a reliable lower bound, subject to a controllable truncation term. Furthermore, we offer an a posteriori analysis of the energy error. We develop and implement an adaptive mesh refinement algorithm applied to two typical defect scenarios: a micro-crack and a Frenkel defect. In both cases, our numerical experiments demonstrate optimal convergence rates with respect to degrees of freedom, in agreement with a priori error estimates.
Title: A posteriori analysis and adaptive algorithms for blended type atomistic-to-continuum coupling with higher-order finite elements. Journal: Computer Physics Communications, volume 310, Article 109533.
Pub Date: 2025-02-12 DOI: 10.1016/j.cpc.2025.109543
Khodr Jaber, Ebenezer E. Essel, Pierre E. Sullivan
Adaptive Mesh Refinement (AMR) enables efficient computation of flows by providing high resolution in critical regions while allowing for coarsening in areas where fine detail is unnecessary. While early AMR software packages relied solely on CPU parallelization, the widespread adoption of heterogeneous computing systems has led to GPU-accelerated implementations. In these hybrid approaches, simulation data typically resides on the GPU, and mesh management and adaptation occur exclusively on the CPU, necessitating frequent data transfers between them. A more efficient strategy is to adapt and maintain the entire mesh structure exclusively on the GPU, eliminating these transfers. Because of its inherent parallelism, the Lattice Boltzmann Method (LBM) has been widely implemented in hybrid AMR frameworks. This work presents a GPU-native algorithm for AMR using a block-based forest of octrees approach, implemented in both two and three dimensions as open-source C++/CUDA code. The implementation includes a Lattice Boltzmann solver for weakly compressible flow, though the underlying grid refinement procedure is compatible with any solver operating on cell-centered block-based grids. The lid-driven cavity and flow past a square cylinder benchmarks validate the algorithm's effectiveness across multiple velocity sets in both single- and double-precision. Tests conducted on consumer and datacenter-grade GPUs demonstrate its versatility across different hardware platforms.
Link to repository: https://github.com/KhodrJ/AGAL.
Title: GPU-native adaptive mesh refinement with application to lattice Boltzmann simulations. Journal: Computer Physics Communications, volume 311, Article 109543.
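To make the block-based tree refinement mentioned in the abstract above more concrete, here is a serial, CPU-only sketch of quadtree refinement in 2D (the two-dimensional analogue of an octree). It is not the GPU-native algorithm from the paper; the names `Block`, `contains_point`, and `refine`, and the point-based refinement criterion, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A square block in the unit domain, addressed by level and integer indices."""
    level: int
    i: int
    j: int

    def size(self):
        # Edge length halves with every refinement level.
        return 1.0 / (2 ** self.level)

    def children(self):
        # Each block splits into four children on the next level.
        return [
            Block(self.level + 1, 2 * self.i + di, 2 * self.j + dj)
            for dj in (0, 1)
            for di in (0, 1)
        ]


def contains_point(block, x, y):
    """Hypothetical refinement criterion: does the block cover the point (x, y)?"""
    size = block.size()
    x0, y0 = block.i * size, block.j * size
    return x0 <= x < x0 + size and y0 <= y < y0 + size


def refine(blocks, needs_refinement, max_level):
    """One refinement pass: replace every flagged leaf block by its children."""
    refined = []
    for block in blocks:
        if block.level < max_level and needs_refinement(block):
            refined.extend(block.children())
        else:
            refined.append(block)
    return refined
```

Calling `refine` repeatedly on `[Block(0, 0, 0)]` with a criterion such as `lambda b: contains_point(b, 0.1, 0.1)` concentrates small blocks around the chosen point while the rest of the domain stays coarse.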
Pub Date: 2025-02-12 DOI: 10.1016/j.cpc.2025.109541
Isabel Nitzke, Gabriela Guevara-Carrion, Denis Saric, Simon Homes, Simon Stephan, Robin Fingerhut, Martin Bernreuther, Hans Hasse, Jadran Vrabec
A new version release (5.0) of the molecular simulation tool ms2 (Deublein et al. 2011; Glass et al. 2014; Rutkai et al. 2017; Fingerhut et al. 2021) is presented. Version 5.0 of ms2 features the eight statistical ensembles that are accessible via Monte Carlo simulation for pure fluids and mixtures. It introduces the Lustig formalism for all ensembles, which allows on-the-fly sampling of any time-independent thermodynamic property, such as isochoric and isobaric heat capacities, thermal expansion coefficient, isothermal compressibility, thermal pressure coefficient, speed of sound or Joule-Thomson coefficient. Through the introduction of an extended Axilrod-Teller-Muto potential, three-body interactions become available, also incorporating an improved parallelization scheme. In combination with an extension of the Tang-Toennies potential, this provides a highly accurate intermolecular potential for krypton. Moreover, a truncated and shifted Mie potential for arbitrary cutoff radii is implemented, transport property calculations are extended and an auxiliary tool for the determination of Brown's characteristic curves is introduced.
Title: ms2: A molecular simulation tool for thermodynamic properties, release 5.0. Journal: Computer Physics Communications, volume 310, Article 109541.
Pub Date: 2025-02-11 DOI: 10.1016/j.cpc.2025.109539
Sooncheol Hwang, Patrick J. Lynett, Sangyoung Son
A GPU-accelerated nearshore scalar transport model with a Boussinesq-type wave solver is introduced. The depth-integrated advection-diffusion equation is implemented in Celeris Advent, the first open-source Boussinesq wave model equipped with an interactive system supporting simultaneous visualization and data exchange between a user and the computing unit. A hybrid finite volume-finite difference scheme is adopted to discretize the governing equations, and a modified HLL Riemann solver that satisfies the conservation property of the scalar concentration is applied for an accurate approximation of the scalar numerical flux. A source-function wavemaker in conjunction with alongshore periodic boundary conditions and a wave-breaking model are implemented to replicate the nearshore hydrodynamic processes more precisely. Several numerical tests using analytical solutions and experimental data are performed to validate the model. Finally, field-scale dye release experiments are reproduced numerically, assessing the applicability of the proposed model in predicting nearshore scalar transport by dispersive hydrodynamics. The proposed model is expected to serve as an advanced tool for real-time assessment and mitigation of marine pollution incidents.
Program summary
Program Title: Celeris-with-scalar-transport
CPC Library link to program files: https://doi.org/10.17632/bk7v57wsxj.1
Developer's repository link: https://doi.org/10.5281/zenodo.10609197
Licensing provisions: GNU General Public License 3
Programming language: C++, HLSL
Supplementary material: Movies 1-4
Nature of problem: Nearshore scalar transport phenomena have generally been investigated with numerical models that solve the shallow water equations and the advection-diffusion equation, owing to their high computational efficiency. However, these models are incapable of simulating the dispersive effects of the waves, which are significant in nearshore hydrodynamics. A scalar transport model with a Boussinesq-type solver can precisely approximate the nearshore scalar transport processes, but its application has been limited by the heavy computational load, which hinders real-time simulations. Building on previous work (Celeris Advent), this software enables real-time numerical simulation of nearshore scalar transport as well as simultaneous visualization. It also supports an interactive environment, allowing the user to change the water surface, bathymetry, and scalar concentration while the model is running.
Solution method: A hybrid finite volume-finite difference scheme is used to solve the extended Boussinesq equations and the advection-diffusion equation. Various components, including the modi
Title: A GPU-accelerated numerical model for nearshore scalar transport by dispersive shallow water flows. Journal: Computer Physics Communications, volume 310, Article 109539.
Pub Date: 2025-02-10 DOI: 10.1016/j.cpc.2025.109535
Chengdi Ma, Jizu Huang, Hao Luo, Chao Yang
Compared with the remarkable progress made in parallel numerical solvers of partial differential equations, the development of algorithms for generating unstructured triangular/tetrahedral meshes has been relatively sluggish. In this paper, we propose a novel, consistent parallel advancing front technique (CPAFT) by combining the advancing front technique, the domain decomposition method based on space-filling curves, the distributed forest-of-overlapping-trees approach, and the consistent parallel maximal independent set algorithm. The newly proposed CPAFT algorithm can mathematically ensure that the generated unstructured triangular/tetrahedral meshes are independent of the number of processors and the implementation of domain decomposition. Several numerical tests are conducted to validate the parallel consistency and outstanding parallel efficiency of the proposed algorithm, which scales effectively up to two thousand processors. This is, as far as we know, the first parallel unstructured triangular/tetrahedral mesh generator with scalability to O(1,000) CPU processors.
Title: CPAFT: A consistent parallel advancing front technique for unstructured triangular/tetrahedral mesh generation. Journal: Computer Physics Communications, volume 310, Article 109535.
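The abstract above names domain decomposition based on space-filling curves as one ingredient. A rough sketch of that general idea, assuming a Morton (Z-order) curve on a 2D integer grid, follows. The abstract does not say which curve CPAFT uses, and `morton_key` and `partition_by_curve` are hypothetical names introduced only for this illustration.

```python
def morton_key(ix, iy, bits=16):
    """Interleave the bits of (ix, iy) to get a position along a Z-order curve."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)       # x bits go to even positions
        key |= ((iy >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return key


def partition_by_curve(cells, n_parts):
    """Sort cells along the curve and cut the ordering into contiguous chunks."""
    ordered = sorted(cells, key=lambda c: morton_key(c[0], c[1]))
    base, extra = divmod(len(ordered), n_parts)
    parts, start = [], 0
    for p in range(n_parts):
        # The first `extra` parts take one additional cell to balance the load.
        size = base + (1 if p < extra else 0)
        parts.append(ordered[start:start + size])
        start += size
    return parts
```

Because cells that are close along the curve tend to be close in space, each contiguous chunk forms a reasonably compact subdomain, and the ordering itself does not depend on how many parts are requested.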
Pub Date: 2025-02-07 DOI: 10.1016/j.cpc.2025.109540
Bob Zigon, Luoding Zhu
We introduce MoE-Bolt (Mixture of Experts for lattice Boltzmann), a novel neural network approach for predicting the unsteady state of fluid flow past a cylinder. We modeled the problem as a sequence prediction in which the 8 time steps preceding time t were used to predict the velocity fields at time t. With Reynolds numbers in the training set ranging from 138 to 196, the problem was difficult because the flow was unsteady. We used a mixture of experts (MoE) to work cooperatively on solving the problem. The advantage of this cooperation is that the computing domain was decomposed without human intervention. When 4 experts were used, our solution exhibited a 15 decibel improvement in the signal-to-noise ratio compared to the single-expert configuration. Our results and analyses show that MoE-Bolt is an effective approach for unsteady flows and a stepping stone toward predicting flow fields at all time instants without using data from the simulation.
Title: Modeling 2D unsteady flows at moderate Reynolds numbers using a 3D convolutional neural network and a mixture of experts. Journal: Computer Physics Communications, volume 310, Article 109540.
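The sequence-prediction setup in the abstract above (8 previous time steps as input, the field at time t as target) can be sketched as a sliding window over stored frames. The helper names below are hypothetical, and the frames are stand-ins for velocity fields. The second function only converts a decibel gain into a power ratio, assuming the usual definition of a decibel as 10 log10 of a power ratio; the abstract does not state how the paper computes its signal-to-noise ratio.

```python
def make_windows(frames, history=8):
    """Pair each frame with the `history` frames that precede it.

    Returns a list of (inputs, target) tuples, where `inputs` holds the
    frames at t-history .. t-1 and `target` is the frame at time t.
    """
    samples = []
    for t in range(history, len(frames)):
        samples.append((frames[t - history:t], frames[t]))
    return samples


def db_gain_to_power_ratio(gain_db):
    """Convert a gain in decibels to the corresponding power ratio."""
    return 10.0 ** (gain_db / 10.0)
```

Under that definition, a 15 decibel gain corresponds to a power ratio of roughly 31.6.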
Pub Date: 2025-02-07 DOI: 10.1016/j.cpc.2025.109538
Xin Guan, Xiao Liu, Yan-Qing Ma, Wen-Hao Wu
In this article, we present the package Blade as the first implementation of the block-triangular form improved Feynman integral reduction method. The block-triangular form has orders of magnitude fewer equations compared to the plain integration-by-parts system, allowing for strictly block-by-block solutions. This results in faster evaluations and reduced resource consumption. We elucidate the algorithms involved in obtaining the block-triangular form along with their implementations. Additionally, we introduce novel algorithms for finding the canonical form and symmetry relations of Feynman integrals, as well as for performing spanning-sector reduction. Our benchmarks for various state-of-the-art problems demonstrate that Blade is remarkably competitive among existing reduction tools. Furthermore, the Blade package offers several distinctive features, including support for complex kinematic variables or masses, user-defined Feynman prescriptions for each propagator, and general integrands.
Program summary
Program Title: Blade
CPC Library link to program files: https://doi.org/10.17632/rzfwjzmd26.1
Developer's repository link: https://gitee.com/multiloop-pku/blade
Licensing provisions: MIT
Programming language: Wolfram Mathematica 11.3 or higher
External routines/libraries used: Wolfram Mathematica [1], FiniteFlow [2]
Nature of problem: Automatically reducing dimensionally regularized Feynman integrals into a linear combination of master integrals.
Solution method: The program implements the recently proposed block-triangular form to significantly improve the reduction efficiency.
References
[1] http://www.wolfram.com/mathematica, commercial algebraic software.
[2] https://github.com/peraro/finiteflow, open source.
Title: Blade: A package for block-triangular form improved Feynman integrals decomposition. Journal: Computer Physics Communications, volume 310, Article 109538.
Pub Date: 2025-02-07 DOI: 10.1016/j.cpc.2025.109536
Dean Muir, Kenneth Duru, Matthew Hole, Stuart Hudson
We present a novel numerical method, accurate and provably stable, for solving the anisotropic diffusion equation in magnetic fields confined to a periodic box. We derive energy estimates for the solution of the continuous initial boundary value problem. A discrete formulation is presented that uses operator splitting in time, with a summation-by-parts finite difference approximation of the spatial derivatives in the perpendicular diffusion operator. Weak penalty procedures are derived for implementing both the boundary conditions and the parallel diffusion operator, which is obtained by field line tracing. We prove that the fully discrete approximation is unconditionally stable. The discrete energy estimates are shown to match the continuous energy estimate given the correct choice of penalty parameters. A nonlinear penalty parameter is shown to provide an effective means of tuning the parallel diffusion penalty and to significantly reduce rounding errors. Several numerical experiments, using manufactured solutions, the “NIMROD benchmark” problem and a single island problem, are presented to verify the numerical accuracy, convergence, and asymptotic-preserving properties of the method. Finally, we present a magnetic field with chaotic regions and islands and show that the contours of the solution of the anisotropic diffusion equation reproduce key features of the field.
Title: A provably stable numerical method for the anisotropic diffusion equation in confined magnetic fields. Journal: Computer Physics Communications, volume 310, Article 109536.
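The energy argument summarized in the abstract above can be illustrated with a generic one-dimensional textbook example. The sketch below is not the authors' scheme (which uses operator splitting and a field-line-traced parallel operator); it only assumes the classical second-order summation-by-parts (SBP) first-derivative pair `(H, D)`, a weakly imposed zero-flux boundary condition, and implicit Euler time stepping. The function names, the diffusivity profile, and the grid parameters are all hypothetical choices made for illustration.

```python
import numpy as np


def sbp_first_derivative(n, h):
    """Classical second-order SBP pair (H, D) on n points with spacing h.

    D = H^{-1} Q with Q + Q^T = diag(-1, 0, ..., 0, 1), the discrete
    analogue of integration by parts.
    """
    H = h * np.eye(n)
    H[0, 0] = H[-1, -1] = h / 2.0
    Q = np.zeros((n, n))
    for i in range(n - 1):
        Q[i, i + 1] = 0.5
        Q[i + 1, i] = -0.5
    Q[0, 0] = -0.5
    Q[-1, -1] = 0.5
    D = np.linalg.solve(H, Q)
    return H, D


def implicit_euler_energies(n=41, dt=10.0, steps=20):
    """Discrete energies u^T H u for u_t = (k u_x)_x with zero boundary flux.

    With the zero-flux condition imposed by a weak penalty, the semi-discrete
    problem reduces to H u_t = -A u, where A = D^T H K D is symmetric
    positive semi-definite. Implicit Euler then solves
    (H + dt * A) u_new = H u_old at each step.
    """
    x = np.linspace(0.0, 1.0, n)
    h = x[1] - x[0]
    H, D = sbp_first_derivative(n, h)
    # Hypothetical, strictly positive diffusivity profile (illustration only).
    K = np.diag(1.0 + 0.5 * np.sin(2.0 * np.pi * x))
    A = D.T @ H @ K @ D
    u = np.exp(-100.0 * (x - 0.5) ** 2)  # Gaussian initial condition
    energies = [u @ H @ u]
    for _ in range(steps):
        u = np.linalg.solve(H + dt * A, H @ u)
        energies.append(u @ H @ u)
    return np.array(energies)


if __name__ == "__main__":
    print(implicit_euler_energies())
```

In this toy setting the discrete energy cannot grow for any positive `dt`, because `A` is positive semi-definite; the deliberately large default time step is there to make that point. Whether the same structure carries over to a given multi-dimensional discretization depends on the penalty terms used there.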
Pub Date: 2025-02-06 DOI: 10.1016/j.cpc.2025.109529
Song-En Chen, Eugene Su, Chih-Chung Wang, Jia-Han Li, Chao-Ching Ho
In this paper, we propose a singular value decomposition (SVD)-based deep learning model to investigate the inverse problem between simulated near-field electromagnetic data and the geometric parameters of a through-silicon via (TSV) array. This problem is of great importance for predicting the critical dimensions of TSVs in the semiconductor industry, and it becomes more challenging as TSVs decrease in size. The electromagnetic field data for various TSV arrays are simulated with the finite-difference time-domain method. We analyze the near-field electromagnetic intensity distribution for different geometric parameters, including critical dimensions such as depth, top diameter, bottom diameter, sidewall roughness, and bottom ellipsoid radius. Because of the sub-micron scale of the critical dimensions and the high aspect ratios, single-wavelength electric field data are insufficient for accurate predictions. Multi-wavelength electric field data, however, present a significant computational challenge because of their size. To overcome this, we employ SVD to compress the multi-wavelength electric field data. By analyzing the dominant singular value components, we reduce the data volume to 4.56 % of its original size while preserving predictive accuracy. The compressed data are subsequently integrated with deep learning models for critical dimension prediction. We compare three model architectures and demonstrate that using the largest singular values from 30-wavelength electric field data substantially improves the prediction of vertical critical dimensions, such as TSV depth and bottom ellipsoid depth. Specifically, the SVD-based deep learning model that incorporates the largest singular values from 5-wavelength electric field data reduces computation time by 34.88 % and decreases the mean absolute percentage error for TSV depth and bottom ellipsoid depth by 2.78 % and 6.60 %, respectively. The SVD-based deep learning model that uses the largest singular values from 30-wavelength data further reduces the mean absolute percentage error for TSV depth and bottom ellipsoid depth by 2.86 % and 10.60 %, respectively. These findings underscore the efficacy of SVD-based compression of multi-wavelength electric field data combined with deep learning, offering an efficient approach for managing large-scale electromagnetic simulations in TSV design. Our source code is available at https://github.com/AOI-Laboratory/EMDataSVD.
Title: Singular value decomposition of near-field electromagnetic data for compressing and accelerating deep neural networks in the prediction of geometric parameters for through silicon via array. Journal: Computer Physics Communications, volume 310, Article 109529.
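The compression idea described in the abstract above can be sketched with NumPy. This is a minimal illustration, not the authors' pipeline (their code is in the repository linked in the abstract): it assumes the multi-wavelength field is stored as an array of shape `(n_wavelengths, ny, nx)`, and the function names, array shapes, and synthetic data are hypothetical.

```python
import numpy as np


def top_singular_values(fields, k):
    """Return the k largest singular values of each wavelength slice.

    fields: array of shape (n_wavelengths, ny, nx) (assumed layout).
    Result: array of shape (n_wavelengths, k), sorted in descending order,
    usable as a compact feature vector for a downstream model.
    """
    s = np.linalg.svd(fields, compute_uv=False)  # batched over axis 0
    return s[:, :k]


def rank_k_approximation(field, k):
    """Best rank-k approximation of one 2-D field in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(field, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


def storage_fraction(ny, nx, k):
    """Fraction of the original ny*nx values needed to store k SVD triplets."""
    return k * (ny + nx + 1) / (ny * nx)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for simulated field data: low rank plus small noise.
    base = rng.normal(size=(30, 64, 3)) @ rng.normal(size=(30, 3, 48))
    fields = base + 1e-3 * rng.normal(size=(30, 64, 48))
    features = top_singular_values(fields, k=5)
    print(features.shape, storage_fraction(64, 48, k=5))
```

Keeping only the singular values, as `top_singular_values` does, discards the spatial patterns in `U` and `Vt`, so it is a feature extraction rather than a reversible compression. `rank_k_approximation` shows the reversible variant, whose storage cost is estimated by `storage_fraction`.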