Pub Date: 2024-05-22 DOI: 10.1016/j.cpc.2024.109258
K.A. Bala, M.R. Omar, John Y.H. Soo, W.M.H. Wan Mokhtar
It is essential to precisely determine the evolving concentrations of radioactive nuclides within transmutation problems. It is also a crucial aspect of nuclear physics with widespread applications in nuclear waste management and energy production. This paper introduces CNUCTRAN, a novel computer program that employs a probabilistic approach to estimate nuclide concentrations in transmutation problems. CNUCTRAN directly simulates nuclei transformations arising from various nuclear reactions, diverging from the traditional deterministic methods that solve the Bateman equations using matrix exponential approximations. This approach effectively addresses numerical challenges associated with solving the Bateman equations, thereby circumventing the need for matrix exponential approximations that risk producing nonphysical concentrations. Our sample calculations show that the concentration predictions of CNUCTRAN have a relative error of less than 0.001% compared to the state-of-the-art method, CRAM, in different test cases. This makes CNUCTRAN a valuable alternative tool for transmutation analysis.
Program summary
Program Title: CNUCTRAN
CPC Library link to program files: https://doi.org/10.17632/b484w2vx52.1
Nature of problem: CNUCTRAN simulates the transmutation of various nuclides, such as decays, fissions, and neutron-induced reactions, using a direct simulation approach. It can predict the final concentrations of a large system of nuclides after a specified time step, t_f.
Solution method: CNUCTRAN is based on a novel probabilistic method that does not compute the final nuclide concentrations by solving the Bateman equations. Instead, it statistically tracks nuclide transformations into one another in a transmutation problem. The technique encapsulates the possible nuclide transformations in a sparse transfer matrix, T, whose elements are made up of various nuclear reaction probabilities. T then serves as a matrix operator acting on the initial nuclide concentrations, y(0), producing the final nuclide concentrations, y.
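The transfer-matrix idea in the solution method can be illustrated with a small sketch. This is not CNUCTRAN's implementation: it assumes a pure decay chain A -> B -> C with a stable last member, allows at most one transformation per substep, and uses hypothetical function names (`transfer_matrix`, `evolve`). The matrix for one short substep is built from decay probabilities and then squared repeatedly to reach the final time.

```python
import math

def transfer_matrix(decay_constants, dt):
    """Column-stochastic matrix for a linear chain 0 -> 1 -> ... over one substep dt.

    T[i][j] is the probability that a nucleus of species j is found as
    species i after dt, allowing at most one transformation per substep.
    Assumption of this sketch: the last species is stable (decay constant 0).
    """
    n = len(decay_constants)
    T = [[0.0] * n for _ in range(n)]
    for j, lam in enumerate(decay_constants):
        p = -math.expm1(-lam * dt)  # 1 - exp(-lam * dt): decay probability in dt
        T[j][j] = 1.0 - p           # nucleus survives the substep
        if j + 1 < n:
            T[j + 1][j] = p         # nucleus transforms into its daughter
    return T

def matmul(A, B):
    """Dense product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def evolve(y0, decay_constants, t_final, squarings=20):
    """Return y(t_final) = T^(2**squarings) y(0) using repeated squaring."""
    dt = t_final / 2 ** squarings
    T = transfer_matrix(decay_constants, dt)
    for _ in range(squarings):
        T = matmul(T, T)
    n = len(y0)
    return [sum(T[i][j] * y0[j] for j in range(n)) for i in range(n)]

# Hypothetical chain: A (lambda = 1.0) -> B (lambda = 0.5) -> C (stable)
y = evolve([1.0, 0.0, 0.0], [1.0, 0.5, 0.0], t_final=1.0)
```

Because every entry of the matrix is a probability and each column sums to one, the resulting concentrations stay non-negative and the total number of nuclei is conserved, which is the property that motivates avoiding matrix exponential approximations.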
Developer's repository link: https://github.com/rabieomar92/cnuctran/releases
Licensing provisions: MIT
Programming language: C++
Article title: CNUCTRAN: A program for computing final nuclide concentrations using a direct simulation approach
Journal: Computer Physics Communications
Pub Date: 2024-05-21 DOI: 10.1016/j.cpc.2024.109251
Junyan Zhu, Jiang Cao, Chen Song, Bo Li, Zhengsheng Han
We present a Python-based open-source library named Jiezi, which provides the means of simulating the electronic transport properties of nanoscale devices at the atomistic level. The key feature of Jiezi lies in its core algorithm, i.e., the self-consistent coupling between the non-equilibrium Green's function (NEGF) method and a Poisson equation solver. Beyond the construction of the tight-binding (TB) Hamiltonian with empirical parameters for conventional materials, the package offers a comprehensive framework for constructing the Wannier-based Hamiltonian matrix, enabling the investigation of novel materials and their heterostructures. To expedite the solution of NEGF systems, a methodology based on renormalization theory is proposed for reducing the dimension of the Hamiltonian matrix. Additionally, the software adopts a non-linear Poisson equation solver with no analytical approximation. The software integrates with external tools for geometry and mesh generation and for post-processing. In this paper, we present the main capabilities and workflow through a simulation of a carbon nanotube field-effect transistor (CNTFET).
Program summary
Program Title: Jiezi
CPC Library link to program files: https://doi.org/10.17632/nk79kbtww4.1
Nature of problem: Simulates the quantum transport property of nano-scaled transistors based on the predefined device structure and the material composition.
Solution method: Solves the coupled Schrödinger equation and Poisson equation by NEGF and finite element method.
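As a rough illustration of the NEGF part of the solution method, the sketch below computes the transmission through a single tight-binding site coupled to two semi-infinite one-dimensional leads. It does not use the Jiezi API and omits the Poisson coupling entirely. The model (zero lead on-site energy, uniform hopping), the analytic lead self-energy, and the function names are assumptions chosen so that the whole calculation stays scalar.

```python
import math

def lead_self_energy(energy, hopping=1.0):
    """Retarded self-energy of a semi-infinite 1D chain (on-site energy 0).

    Assumption of this sketch: the energy lies inside the lead band,
    |E| < 2|t|, where Sigma = (E - i*sqrt(4 t^2 - E^2)) / 2.
    """
    if abs(energy) >= 2.0 * abs(hopping):
        raise ValueError("energy must lie inside the lead band")
    s = math.sqrt(4.0 * hopping ** 2 - energy ** 2)
    return complex(energy, -s) / 2.0

def transmission(energy, onsite, hopping=1.0):
    """Landauer transmission T = Gamma_L * Gamma_R * |G|^2 for one device site."""
    sigma_l = lead_self_energy(energy, hopping)
    sigma_r = lead_self_energy(energy, hopping)
    green = 1.0 / (energy - onsite - sigma_l - sigma_r)  # retarded Green's function
    gamma_l = -2.0 * sigma_l.imag                        # broadening from left lead
    gamma_r = -2.0 * sigma_r.imag                        # broadening from right lead
    return gamma_l * gamma_r * abs(green) ** 2

t_clean = transmission(0.3, onsite=0.0)     # device site identical to the leads
t_barrier = transmission(0.0, onsite=2.0)   # on-site energy acts as a scatterer
```

In a self-consistent scheme of the kind the abstract describes, the on-site energies would be updated from the electrostatic potential returned by the Poisson solver, and the Green's function recomputed until both agree.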
Developer's repository link: https://github.com/Jiezi-negf/Jiezi
Licensing provisions: GPLv3
Programming language: Python
Article title: Jiezi: an open-source Python software for simulating quantum transport based on non-equilibrium Green's function formalism
Journal: Computer Physics Communications
Pub Date: 2024-05-20 DOI: 10.1016/j.cpc.2024.109249
D. Massaro, A. Peplinski, R. Stanly, S. Mirzareza, V. Lupi, T. Mukha, P. Schlatter
A framework is presented for the spectral-element code Nek5000, which has been, and still is, widely used in the computational fluid dynamics (CFD) community to perform high-fidelity numerical simulations of transitional and high Reynolds number flows. Despite this widespread usage, a comprehensive set of tools specifically designed for conducting simulations with Nek5000 has been lacking. To address this issue, we have created a unique framework that allows users, inter alia, to perform stability analysis and compute statistics of a turbulent flow. The framework encapsulates modules that provide tools, run-time parameters and memory structures, defining interfaces and performing different tasks. First, the framework architecture is described, showing its non-intrusive approach. Then, the modules are presented, explaining the main tools that have been implemented and describing some of the test cases. The code is open-source and available online, with documentation, instructions for running it, and related examples.
Article title: A comprehensive framework to enhance numerical simulations in the spectral-element code Nek5000
Journal: Computer Physics Communications
Pub Date: 2024-05-20 DOI: 10.1016/j.cpc.2024.109250
S. Westerbeek, S. Hulshoff, H. Schuttelaars, M. Kotsonis
A nonlinear Harmonic Navier-Stokes (HNS) framework is introduced for simulating instabilities in laminar spanwise-invariant shear layers, featuring sharp and smooth wall surface protuberances. While such cases play a critical role in the process of laminar-to-turbulent transition, classical stability theory analyses such as parabolized or local stability methods fail to provide (accurate) results, due to their underlying assumptions. The generalized incompressible Navier-Stokes (NS) equations are expanded in perturbed form, using a spanwise and temporal Fourier ansatz for flow perturbations. The resulting equations are discretized using spectral collocation in the wall-normal direction and finite-difference methods in the streamwise direction. The equations are then solved using a direct sparse-matrix solver. The nonlinear mode interaction terms are converged iteratively. The solution implementation makes use of a generalized domain transformation to account for geometrical smooth surface features, such as humps. No-slip conditions can be embedded in the interior domain to account for the presence of sharp surface features such as forward- or backward-facing steps. Common difficulties with Navier-Stokes solvers, such as the treatment of the outflow boundary and convergence of nonlinear terms, are considered in detail. The performance of the developed solver is evaluated against several cases of representative boundary layer instability growth, including linear and nonlinear growth of Tollmien-Schlichting waves in a Blasius boundary layer and stationary crossflow instabilities in a swept flat-plate boundary layer. The latter problem is also treated in the presence of a geometrical smooth hump and a sharp forward-facing step at the wall. 
HNS simulation results, such as perturbation amplitudes, growth rates, and shape functions, are compared to benchmark flow stability analysis methods such as Parabolized Stability Equations (PSE), Adaptive Harmonic Linearized Navier-Stokes (AHLNS), or Direct Numerical Simulations (DNS). Good agreement is observed in all cases. The HNS solver is subjected to a grid convergence study and a simple performance benchmark, namely memory usage and computational cost. The computational cost is found to be considerably lower than high-fidelity DNS at comparable grid resolutions.
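The abstract mentions spectral collocation in the wall-normal direction. A minimal sketch of that ingredient is the classical Chebyshev differentiation matrix on Gauss-Lobatto points, shown here in Python rather than the solver's Matlab. It is not taken from DeHNSSo; the function name `cheb` and the test function are illustrative, and the actual solver additionally maps the points to the physical domain.

```python
import math

def cheb(n):
    """Chebyshev differentiation matrix D and Gauss-Lobatto points x on [-1, 1].

    For values f[j] = f(x[j]), the product D f approximates f'(x[j]) and is
    exact (up to rounding) for polynomials of degree <= n.
    """
    x = [math.cos(math.pi * j / n) for j in range(n + 1)]
    # Endpoint weights of 2 and alternating signs, as in the standard formula
    c = [(2.0 if j in (0, n) else 1.0) * (-1.0) ** j for j in range(n + 1)]
    D = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i != j:
                D[i][j] = (c[i] / c[j]) / (x[i] - x[j])
        # Diagonal from the "negative row sum" identity (derivative of a constant is 0)
        D[i][i] = -sum(D[i][j] for j in range(n + 1) if j != i)
    return D, x

D, x = cheb(8)
f = [xi ** 3 for xi in x]
df = [sum(D[i][j] * f[j] for j in range(len(x))) for i in range(len(x))]
```

Here `df` holds the collocation derivative of x^3 at the nine points. In a solver of the kind described, such a matrix would replace wall-normal derivatives, while finite-difference stencils would handle the streamwise direction.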
Program summary
Program Title: DeHNSSo
CPC Library link to program files: https://doi.org/10.17632/9bnms99kk2.1
Developer's repository link: https://github.com/SvenWesterbeek/DeHNSSo
Licensing provisions: GPLv3
Programming language: Matlab
Supplementary material: The supplementary material contains the code as well as a user manual.
Nature of problem: Fluid flows are subject to laminar-to-turbulent transition following the growth of instabilities.
Article title: DeHNSSo: The delft harmonic Navier-Stokes solver for nonlinear stability problems with complex geometric features
Journal: Computer Physics Communications
Pub Date: 2024-05-17 DOI: 10.1016/j.cpc.2024.109248
Yuan Chen, Mahmut Sait Okyay, Bryan M. Wong
We present an open-source software package, MISTER-T (Manipulating an Interacting System of Total Electrons in Real-Time), for the quantum optimal control of interacting electrons within a time-dependent Kohn-Sham formalism. In contrast to other implementations restricted to simple models on rectangular domains, our method enables quantum optimal control calculations for multi-electron systems (in the effective mass formulation) on nonuniform meshes with arbitrary two-dimensional cross-sectional geometries. Our approach is enabled by forward and backward propagator integration methods to evolve the Kohn-Sham equations with a pseudoskeleton decomposition algorithm for enhanced computational efficiency. We provide several examples of the versatility and efficiency of the MISTER-T code in handling complex geometries and quantum control mechanisms. The capabilities of the MISTER-T code provide insight into the implications of varying propagation times and local control mechanisms to understand a variety of strategies for manipulating electron dynamics in these complex systems.
Program summary
Program Title: MISTER-T
CPC Library link to program files: https://doi.org/10.17632/psymy4ddnw.1
Licensing provisions: GNU General Public License 3
Programming language: MATLAB
Supplementary material: animated movies of total electron densities under the influence of optimal control fields for (1) an asymmetric double-well potential for long propagation times, (2) an asymmetric double-well potential for short propagation times, and (3) a triple-well potential with a position-dependent effective mass.
Nature of problem: The MISTER-T code solves quantum optimal control problems for interacting electrons within a time-dependent Kohn-Sham formalism. It can handle two-dimensional systems with arbitrary cross-sectional geometries within the effective mass formulation. The user-friendly code uses forward and backward propagator integration methods to evolve the Kohn-Sham equations with a pseudoskeleton decomposition algorithm for enhanced computational efficiency.
Solution method: iterative solution of the quantum optimal control equations using finite element methods, effective mass formulation, pseudoskeleton decomposition, sparse matrix linear algebra, and nonuniform fast Fourier transforms.
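The pseudoskeleton decomposition named in the solution method approximates a matrix from a few of its own rows and columns. The sketch below shows one common way to build such an approximation, greedy cross approximation with full pivoting, on a small dense matrix. It is an illustration under stated assumptions, not the algorithm as implemented in MISTER-T, and it is written in Python rather than MATLAB. The function name and the test matrix are hypothetical.

```python
def cross_approximation(A, rank, tol=1e-14):
    """Build a low-rank approximation of A from selected rows and columns.

    Each step picks the largest entry of the current residual as a pivot and
    subtracts the rank-1 cross formed by the pivot's column and row.
    Returns the approximation and the chosen row and column indices.
    """
    m, n = len(A), len(A[0])
    residual = [row[:] for row in A]
    approx = [[0.0] * n for _ in range(m)]
    rows, cols = [], []
    for _ in range(rank):
        # Full pivoting: search the whole residual (fine for a small example)
        i, j = max(((a, b) for a in range(m) for b in range(n)),
                   key=lambda ab: abs(residual[ab[0]][ab[1]]))
        pivot = residual[i][j]
        if abs(pivot) < tol:
            break  # residual is numerically zero, nothing left to capture
        col = [residual[a][j] for a in range(m)]
        row = residual[i][:]
        for a in range(m):
            for b in range(n):
                update = col[a] * row[b] / pivot
                approx[a][b] += update
                residual[a][b] -= update
        rows.append(i)
        cols.append(j)
    return approx, rows, cols

# Hypothetical rank-2 test matrix built from two outer products
u1, v1 = [1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0, 2.0, 0.5]
u2, v2 = [0.0, 1.0, -1.0, 2.0], [2.0, 1.0, 0.0, -1.0, 3.0]
A = [[u1[a] * v1[b] + u2[a] * v2[b] for b in range(5)] for a in range(4)]
approx, rows, cols = cross_approximation(A, rank=2)
```

For a matrix of exact rank 2, two crosses reproduce it up to rounding, so only two rows and two columns need to be stored. Practical codes avoid forming the full residual and use partial pivoting instead.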
Article title: MISTER-T: An open-source software package for quantum optimal control of multi-electron systems on arbitrary geometries
Journal: Computer Physics Communications
Pub Date: 2024-05-16 DOI: 10.1016/j.cpc.2024.109247
Dong-Yeop Na, Fernando L. Teixeira, Yuri A. Omelchenko
A novel electromagnetic particle-in-cell algorithm has been developed for fully kinetic plasma simulations on unstructured (irregular) meshes in complex body-of-revolution geometries. The algorithm, implemented in the BORPIC++ code, utilizes a set of field scalings and a coordinate mapping, reducing the Maxwell field problem in a cylindrical system to a Cartesian finite element Maxwell solver in the meridian plane. The latter obviates the cylindrical coordinate singularity on the symmetry axis. The choice of an unstructured finite element discretization enhances the geometrical flexibility of the BORPIC++ solver compared to the more traditional finite difference solvers. Symmetries in Maxwell's equations are exploited to decompose the problem into two dual polarization states with isomorphic representations that enable code reuse. The particle-in-cell scatter and gather steps preserve charge conservation at the discrete level. Our previous algorithm (BORPIC+) discretized the E and B field components of the TE^ϕ and TM^ϕ polarizations on the finite element (primal) mesh [1], [2]. Here, we employ a new field-update scheme. Using the same finite element (primal) mesh, this scheme advances two sets of field components independently: (1) E and B of TE^ϕ polarized fields, (E_z, E_ρ, B_ϕ), and (2) D and H of TM^ϕ polarized fields, (D_ϕ, H_z, H_ρ). Since these field updates are not explicitly coupled, the new field solver obviates the coordinate singularity, which otherwise arises at the cylindrical symmetry axis, ρ = 0, when defining the discrete Hodge matrices (generalized finite element mass matrices). A cylindrical perfectly matched layer is implemented as a boundary condition in the radial direction to simulate open space problems, with periodic boundary conditions in the axial direction. We investigate effects of charged particles moving next to the cylindrical perfectly matched layer. We model azimuthal currents arising from rotational motion of charged rings, which produce TM^ϕ polarized fields. Several numerical examples are provided to illustrate the first application of the algorithm.
Article title: An unstructured body-of-revolution electromagnetic particle-in-cell algorithm with radial perfectly matched layers and dual polarizations
Journal: Computer Physics Communications
Pub Date: 2024-05-16 DOI: 10.1016/j.cpc.2024.109246
Marcin Rogowski , Brandon C.Y. Yeung , Oliver T. Schmidt , Romit Maulik , Lisandro Dalcin , Matteo Parsani , Gianmarco Mengaldo
We propose a parallel (distributed) version of the spectral proper orthogonal decomposition (SPOD) technique. The parallel SPOD algorithm distributes the spatial dimension of the dataset while leaving the time dimension undistributed. This approach is adopted to preserve the non-distributed fast Fourier transform of the data in time, thereby avoiding the associated bottlenecks. The parallel SPOD algorithm is implemented in the PySPOD library and makes use of the standard message passing interface (MPI) library, implemented in Python via mpi4py. An extensive performance evaluation of the parallel package is provided, including strong and weak scalability analyses. The open-source library allows the analysis of large datasets of interest across the scientific community. Here, we present applications in fluid dynamics and geophysics that are extremely difficult (if not impossible) to achieve without a parallel algorithm. This work opens the path toward modal analyses of big quasi-stationary data, helping to uncover new unexplored spatio-temporal patterns.
Program summary
Program Title: PySPOD
CPC Library link to program files: https://doi.org/10.17632/jf5bf26jcj.1
Developer's repository link: https://github.com/MathEXLab/PySPOD
Licensing provisions: MIT License
Programming language: Python
Nature of problem: Large spatio-temporal datasets may contain coherent patterns that can be leveraged to better understand, model, and possibly predict the behavior of complex dynamical systems. To this end, modal decomposition methods, such as the proper orthogonal decomposition (POD) and its spectral counterpart (SPOD), constitute powerful tools. The SPOD algorithm allows the systematic identification of space-time coherent patterns. This can be used to better understand the physics of the process of interest and to provide a path for mathematical modeling, including reduced order modeling. The SPOD algorithm has been successfully applied to fluid dynamics, geophysics and other domains. However, the existing open-source implementations are serial and cannot run on the increasingly large datasets that are becoming available, especially in computational physics. The inability to analyze large datasets via SPOD in turn prevents unlocking novel mechanisms and dynamical behaviors in complex systems.
Solution method: We provide an open-source parallel (MPI distributed) code, namely PySPOD, that is able to run on large datasets (the ones considered in the present paper reach about 200 Terabytes). The code is built on the previous serial open-source code PySPOD that was published in https://joss.theoj.org/papers/10.21105/joss.02862.pdf. The new parallel implementation is able to scale on multiple nodes (we show both weak and strong scalability) and resolves some of the bottlenecks commonly encountered at the I/O stage. The current parallel code can run on datasets that are difficult or impossible to analyze with serial SPOD algorithms, hence providing a path towards unlocking novel findings in computational physics.
Additional comments including restrictions and unusual features: The code comes with a set of built-in postprocessing tools for visualizing the results. In addition to the associated GitHub repository, it comes with extensive continuous integration, documentation, and tutorials, as well as a dedicated website. Within the package, we also provide a parallel implementation of the proper orthogonal decomposition (POD), which leverages the I/O parallel capabilities of the SPOD algorithm.
Title: Unlocking massively parallel spectral proper orthogonal decompositions in the PySPOD package. Journal: Computer Physics Communications.
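The idea of distributing space while keeping time local can be illustrated with a minimal NumPy sketch. It is not PySPOD's API: the function names are hypothetical, the loop over spatial chunks stands in for MPI ranks, and windowing, block overlap, and spatial weights are omitted for brevity. Under those assumptions, each chunk computes its own FFT in time, and only the small blocks-by-blocks matrix has to be summed across chunks (the step an MPI sum-reduction would perform).

```python
import numpy as np


def block_fft(q, n_fft):
    """FFT in time of non-overlapping blocks.

    q has shape (n_time, n_space); the result has shape
    (n_blocks, n_fft, n_space). No window is applied (simplification).
    """
    n_blocks = q.shape[0] // n_fft
    blocks = q[: n_blocks * n_fft].reshape(n_blocks, n_fft, q.shape[1])
    return np.fft.fft(blocks, axis=1)


def spod_energies_serial(q, n_fft, freq_index):
    """Modal energies at one frequency using the full spatial domain."""
    q_hat = block_fft(q, n_fft)[:, freq_index, :]  # (n_blocks, n_space)
    # Small Hermitian matrix Q^H Q / n_blocks (method of snapshots).
    m = q_hat.conj() @ q_hat.T / q_hat.shape[0]
    return np.linalg.eigvalsh(m)[::-1]  # descending order


def spod_energies_distributed(q, n_fft, freq_index, n_ranks):
    """Same energies, with space split into n_ranks chunks.

    Each chunk needs only its own columns for the time FFT, so no
    data is exchanged until the small matrix is accumulated.
    """
    n_blocks = q.shape[0] // n_fft
    m = np.zeros((n_blocks, n_blocks), dtype=complex)
    for local_q in np.array_split(q, n_ranks, axis=1):
        local_hat = block_fft(local_q, n_fft)[:, freq_index, :]
        # This accumulation stands in for a sum-reduction across ranks.
        m += local_hat.conj() @ local_hat.T
    return np.linalg.eigvalsh(m / n_blocks)[::-1]


rng = np.random.default_rng(0)
data = rng.standard_normal((64, 12))  # 64 snapshots, 12 spatial points
serial = spod_energies_serial(data, n_fft=16, freq_index=3)
distributed = spod_energies_distributed(data, n_fft=16, freq_index=3, n_ranks=3)
print(np.allclose(serial, distributed))
```

Because the time axis is never split, the per-chunk FFT is identical to the serial one, and the only communication in this sketch is the reduction of a matrix whose size depends on the number of blocks rather than on the number of spatial points.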
Pub Date: 2024-05-15 DOI: 10.1016/j.cpc.2024.109243
Jean-Marco Alameddine , Johannes Albrecht , Hans Dembinski , Pascal Gutjahr , Karl-Heinz Kampert , Wolfgang Rhode , Maximilian Sackel , Alexander Sandrock , Jan Soedingrekso
Accurate particle simulations are essential for the next generation of experiments in astroparticle physics. The Monte Carlo simulation library PROPOSAL is a flexible tool to efficiently propagate high-energy leptons and photons through large volumes of media, for example in the context of underground observatories. It is written as a C++ library, including a Python interface. In this paper, the most recent updates of PROPOSAL are described, including the addition of electron, positron, and photon propagation, for which new interaction types have been implemented. This allows the usage of PROPOSAL to simulate electromagnetic particle cascades, for example in the context of air shower simulations. The precision of the propagation has been improved by including rare interaction processes, new photonuclear parametrizations, deflections in stochastic interactions, and the possibility of propagating in inhomogeneous density distributions. Additional technical improvements regarding the interpolation routine and the propagation algorithm are described.
New version program summary
Program Title: PROPOSAL.
CPC Library link to program files: https://doi.org/10.17632/g478pjdcxy.2.
Developer's repository link: https://github.com/tudo-astroparticlephysics/PROPOSAL.
Licensing provisions: LGPL.
Programming language: C++, Python.
Journal reference of previous version: Comput. Phys. Commun. 242 (2019) 132.
Does the new version supersede the previous version?: Yes.
Reasons for the new version: Substantial addition of features. Various bugfixes.
Summary of revisions: The library now also treats photons and has the corresponding processes implemented. New parametrizations for photonuclear interaction have been implemented. The angular deflection in stochastic energy losses has been implemented in addition to the already existing multiple scattering implementation, which has been improved to reduce the runtime. The implementation of the Landau-Pomeranchuk-Migdal effect has been corrected. The propagation algorithm has been improved, including the support of inhomogeneous density distributions.
Nature of problem: Three-dimensional propagation of charged leptons and photons through different media. Particles lose energy stochastically by ionization, bremsstrahlung, pair production, and photonuclear interaction for charged leptons (including annihilation with atomic electrons for positrons) and Compton scattering, pair production, photoelectric effect and photohadronic interaction for photons. Additionally, they are deflected while propagating through the medium due to both multiple elastic Coulomb scattering and deflections in individual stochastic interactions. Unstable particles eventually decay, pro…
Title: Improvements in charged lepton and photon propagation for the software PROPOSAL. Journal: Computer Physics Communications.
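The general pattern of propagating a particle through a medium with stochastic energy losses can be sketched with a toy one-dimensional Monte Carlo loop. This is an illustrative assumption, not PROPOSAL's interface or physics: the function name, the constant continuous loss rate, the exponential free path, and the uniform fractional loss are all hypothetical placeholders for real cross sections.

```python
import random


def propagate(energy, distance, mean_free_path, loss_rate, max_fraction, rng):
    """Toy 1D propagation with continuous and stochastic energy losses.

    Between stochastic interactions the particle loses energy at a
    constant rate (loss_rate, energy per unit length). The distance to
    the next stochastic interaction is drawn from an exponential
    distribution, and each interaction removes a uniformly sampled
    fraction (up to max_fraction) of the current energy.

    Returns (final_energy, distance_travelled, n_stochastic).
    """
    travelled = 0.0
    n_stochastic = 0
    while travelled < distance and energy > 0.0:
        free_path = rng.expovariate(1.0 / mean_free_path)
        remaining = distance - travelled
        step = min(free_path, remaining)
        range_left = energy / loss_rate
        if step >= range_left:
            # The particle stops before finishing this step.
            return 0.0, travelled + range_left, n_stochastic
        energy -= loss_rate * step
        if free_path >= remaining:
            # The end of the propagation volume is reached first.
            return energy, distance, n_stochastic
        travelled += step
        energy *= 1.0 - rng.uniform(0.0, max_fraction)
        n_stochastic += 1
    return energy, travelled, n_stochastic


rng = random.Random(1)
final_energy, path, n_losses = propagate(
    energy=1000.0, distance=50.0, mean_free_path=10.0,
    loss_rate=2.0, max_fraction=0.3, rng=rng,
)
print(final_energy, path, n_losses)
```

A production code replaces each placeholder with tabulated or interpolated cross sections per process and tracks direction changes in three dimensions; the sketch only shows how continuous and stochastic losses alternate along the track.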
Pub Date: 2024-05-15 DOI: 10.1016/j.cpc.2024.109244
Uladzimir Khasianevich , Wojciech Kotlarski , Dominik Stöckinger , Alexander Voigt
FlexibleSUSY is a framework for the automated computation of physical quantities (observables) in models beyond the Standard Model (BSM). This paper describes an extension of FlexibleSUSY that allows users to define and add new observables, which can then be enabled and computed in applicable user-defined BSM models. The extension has already been used to include Charged Lepton Flavor Violation (CLFV) observables, but further observables can now be added straightforwardly. The paper is split into two parts. The first part is non-technical and describes from the user's perspective how to enable the calculation of predefined observables, in particular CLFV observables. The second part of the paper explains how to define new observables such that their automatic computation in any applicable BSM model becomes possible. A key ingredient is the new NPointFunctions extension, which allows the use of tree-level and loop calculations in the model-independent setup of observables. Three examples of increasing complexity are fully worked out. These illustrate the features and provide code snippets that may be used as a starting point for the implementation of further observables.
Program summary
Program title: NPointFunctions
CPC Library link to program files: https://doi.org/10.17632/kf7m8gn8vp.2
Developer's repository link: https://github.com/FlexibleSUSY/FlexibleSUSY
Licensing provisions: GPLv3
Programming language: C++, Wolfram Language, Fortran, Bourne shell
Journal reference of previous version: Comput. Phys. Commun. 230 (2018) 145–217; PoS CompTools2021 (2022) 036
Does the new version supersede the previous version?: Yes
Reasons for the new version: Program extension including new observables and file structures
Nature of problem: Determining observables for an arbitrary extension of the Standard Model supported by FlexibleSUSY, input by the user.
Solution method: Generation of the code from automated algebraic manipulations. Automatic filling and compiling of predefined template files.
Additional comments including restrictions and unusual features: Vertices with a direct product of Lorentz and color structures are supported. Settings of the advanced NPointFunctions mode rely on explicit specification of topologies.
Title: FlexibleSUSY extended to automatically compute physical quantities in any beyond the standard model theory: Charged lepton flavor violation processes, Higgs decays, and user-defined observables. Journal: Computer Physics Communications.
Pub Date: 2024-05-13 DOI: 10.1016/j.cpc.2024.109238
Anand Radhakrishnan , Henry Le Berre , Benjamin Wilfong , Jean-Sebastien Spratt , Mauro Rodriguez Jr. , Tim Colonius , Spencer H. Bryngelson
Multiphase compressible flows are often characterized by a broad range of space and time scales, entailing large grids and small time steps. Simulations of these flows on CPU-based clusters can thus take several wall-clock days. Offloading the compute kernels to GPUs appears attractive but is memory-bound for many finite-volume and -difference methods, damping speedups. Even when realized, GPU-based kernels lead to more intrusive communication and I/O times owing to lower computation costs. We present a strategy for GPU acceleration of multiphase compressible flow solvers that addresses these challenges and obtains large speedups at scale. We use OpenACC for directive-based offloading of all compute kernels while maintaining low-level control when needed. An established Fortran preprocessor and metaprogramming tool, Fypp, enables otherwise hidden compile-time optimizations. This strategy exposes compile-time optimizations and high memory reuse while retaining readable, maintainable, and compact code. Remote direct memory access realized via CUDA-aware MPI and GPUDirect reduces halo-exchange communication time. We implement this approach in the open-source solver MFC [1]. Metaprogramming results in an 8-times speedup of the most expensive kernels compared to a statically compiled program, reaching 46% of peak FLOPs on modern NVIDIA GPUs and high arithmetic intensity (about 10 FLOPs/byte). In representative simulations, a single NVIDIA A100 GPU is 7 times faster than an Intel Xeon Cascade Lake (6248) CPU die, or about 300 times faster than a single such CPU core. At the same time, near-ideal (97%) weak scaling is observed for at least 13824 GPUs on OLCF Summit. A strong scaling efficiency of 84% is retained for an 8-times increase in GPU count. Collective I/O, implemented via MPI3, helps ensure the negligible contribution of data transfers (<1% of the wall time for a typical, large simulation). Large many-GPU simulations of compressible (solid-)liquid-gas flows demonstrate the practical utility of this strategy.
Title: Method for scalable and performant GPU-accelerated simulation of multiphase compressible flow. Journal: Computer Physics Communications.
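The scaling figures quoted above can be related to wall-clock times with a short sketch. It assumes the conventional definitions of weak and strong scaling efficiency, which the abstract does not spell out, and the function names are illustrative only.

```python
def strong_scaling_efficiency(t_base, n_base, t_scaled, n_scaled):
    """Efficiency for a fixed total problem size.

    Ideal behavior: wall time falls in proportion to device count,
    so t_base * n_base == t_scaled * n_scaled gives 1.0.
    """
    return (t_base * n_base) / (t_scaled * n_scaled)


def weak_scaling_efficiency(t_base, t_scaled):
    """Efficiency for a fixed problem size per device.

    Ideal behavior: wall time stays constant as devices are added.
    """
    return t_base / t_scaled


def speedup_from_efficiency(efficiency, device_ratio):
    """Observed speedup implied by a strong scaling efficiency."""
    return efficiency * device_ratio


# An 84% strong scaling efficiency over an 8-times increase in GPU
# count corresponds to a speedup of about 6.7 rather than the ideal 8.
print(round(speedup_from_efficiency(0.84, 8), 2))
```

Under the same assumed definition, 97% weak scaling means that the wall time at the largest device count is only about 3% longer than at the baseline when the work per GPU is held fixed.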