Pub Date: 2024-05-13, DOI: 10.1016/j.cpc.2024.109240
Christian Guzman Ruiz, Mario Acosta, Oriol Jorba, Eduardo Cesar Galobardes, Matthew Dawson, Guillermo Oyarzun, Carlos Pérez García-Pando, Kim Serradell
Earth system models (ESMs) demand significant hardware resources and energy to solve atmospheric chemistry processes. Recent studies have shown improved performance from running these models on GPU accelerators. Nonetheless, there is room to exploit GPU resources more fully.
This study proposes an optimized distribution of the chemical solver's computational load on the GPU, named Block-cells. Additionally, we evaluate different configurations for distributing the computational load on an NVIDIA GPU.
We use the linear solver from the Chemistry Across Multiple Phases (CAMP) framework as our test bed, with an intermediate-complexity chemical mechanism under typical atmospheric conditions. Results demonstrate a 35× speedup compared to the single-threaded CPU reference case. Even when the reference case uses the full resources of the node (40 physical cores), the Block-cells version outperforms it by 50%. The Block-cells approach shows promise in alleviating the computational burden of chemical solvers on GPU architectures.
Title: Optimized thread-block arrangement in a GPU implementation of a linear solver for atmospheric chemistry mechanisms
Journal: Computer Physics Communications
Pub Date: 2024-05-11, DOI: 10.1016/j.cpc.2024.109242
Qi Min, Ziyang Xu, Siqi He, Haidong Lu, Xingbang Liu, Ruizi Shen, Yanhong Wu, Qikun Pan, Chongxiao Zhao, Fei Chen, Maogen Su, Chenzhong Dong
In this paper, we introduce the RHDLPP, a flux-limited multigroup radiation hydrodynamics numerical code designed for simulating laser-produced plasmas in diverse environments. The code bifurcates into two packages: RHDLPP-LTP for low-temperature plasmas generated by moderate-intensity nanosecond lasers, and RHDLPP-HTP for high-temperature, high-density plasmas formed by high-intensity laser pulses. The core radiation hydrodynamic equations are resolved in the Eulerian frame, employing an operator-split method. This method decomposes the solution into two substeps: first, the explicit resolution of the hyperbolic subsystems integrating radiation and fluid dynamics; second, the implicit treatment of the parabolic part comprising stiff radiation diffusion, heat conduction, and energy exchange. Laser propagation and energy deposition are modeled through a hybrid approach, combining geometrical-optics ray-tracing in sub-critical plasma regions with a one-dimensional solution of the Helmholtz wave equation in super-critical areas. The thermodynamic states are ascertained using an equation of state, based on either the real gas approximation or the quotidian equation of state (QEOS). For ionization calculations, the code employs a steady-state collisional-radiation (CR) model using the screened-hydrogenic approximation. Additionally, RHDLPP includes RHDLPP-SpeIma3D, a three-dimensional spectral simulation post-processing module, for generating both temporally-spatially resolved and time-integrated spectra and imaging, facilitating direct comparisons with experimental data. The paper showcases a series of verification tests to establish the code's accuracy and efficiency, followed by application cases, including simulations of laser-produced aluminium (Al) plasmas, pre-pulse-induced target deformation of tin (Sn) microdroplets relevant to extreme ultraviolet lithography light sources, and varied imaging and spectroscopic simulations. 
These simulations highlight RHDLPP's effectiveness and applicability in fields such as laser-induced breakdown spectroscopy, extreme ultraviolet lithography sources, and high-energy-density physics.
Title: RHDLPP: A multigroup radiation hydrodynamics code for laser-produced plasmas
Journal: Computer Physics Communications
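The operator-split idea described in the abstract (an explicit hyperbolic substep followed by an implicit parabolic substep) can be illustrated on a much simpler problem. The sketch below is not RHDLPP code: it assumes a scalar 1D advection-diffusion equation on a periodic grid, first-order upwind advection, and backward Euler diffusion, and the name `split_step` and all parameter values are hypothetical.

```python
import numpy as np

def split_step(u, velocity, diffusivity, dx, dt):
    """Advance u by one time step with a two-substep operator split.

    Assumes a periodic grid and velocity > 0 (upwind stencil looks left).
    """
    # Substep 1 (hyperbolic part): explicit first-order upwind advection.
    u_star = u - velocity * dt / dx * (u - np.roll(u, 1))

    # Substep 2 (parabolic part): implicit backward Euler diffusion,
    # solving (I - dt * D * Laplacian) u_new = u_star.
    n = u.size
    identity = np.eye(n)
    laplacian = (np.roll(identity, 1, axis=1) + np.roll(identity, -1, axis=1)
                 - 2.0 * identity) / dx**2
    return np.linalg.solve(identity - dt * diffusivity * laplacian, u_star)

# Illustrative setup: a Gaussian pulse on the unit interval.
n = 100
dx = 1.0 / n
x = np.arange(n) * dx
u0 = np.exp(-((x - 0.3) / 0.05) ** 2)

u = u0.copy()
for _ in range(50):
    u = split_step(u, velocity=1.0, diffusivity=0.01, dx=dx, dt=0.005)
```

In this toy setting both substeps conserve the discrete sum of `u`, while the peak value decays because of upwind dissipation and diffusion. The implicit substep is what allows a time step set by the advection limit rather than by the stiff diffusion term.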
Pub Date: 2024-05-10, DOI: 10.1016/j.cpc.2024.109237
Qifeng Zhang, Tong Yan, Dinghua Xu, Yong Chen
The Fourier pseudo-spectral method is well suited to solving PDEs with periodic boundary conditions because of its high-order accuracy and ease of implementation. In this paper, we explore and compare four classes of Fourier pseudo-spectral schemes for the rotation-two-component Camassa–Holm system, which may admit peakon solitons. By exploiting inherent structural properties of the system, we reformulate it into two different equivalent forms and then apply the Fourier pseudo-spectral method to derive two spatial semi-discrete systems, both of which are proved to preserve the corresponding invariants, including mass, momentum and energy. Subsequently, we construct two linearly implicit schemes based on the Strang splitting technique and two nonlinear schemes for the two semi-discrete systems. Owing to the different equivalent forms, one of the nonlinear schemes preserves discrete mass and momentum, while the other is shown to preserve all three invariants. Numerical results with smooth and nonsmooth initial values are provided for distinct types of solutions to test the accuracy in long-time simulation and to verify the capacity to predict water wave propagation, as well as the advantages in preserving these invariants. For instance, the present schemes are shown to be accurate to at least 14 significant digits, improving upon the 10 digits reported in previous references.
Title: Direct/split invariant-preserving Fourier pseudo-spectral methods for the rotation-two-component Camassa–Holm system with peakon solitons
Journal: Computer Physics Communications
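To make the building block behind this entry concrete, here is a minimal sketch of Fourier pseudo-spectral differentiation on a periodic grid. It is not the authors' scheme for the Camassa–Holm system; it only shows the spatial operator such schemes rely on, and the function name `spectral_derivative` and the test function are assumptions chosen for illustration.

```python
import numpy as np

def spectral_derivative(u, length):
    """First derivative of periodic samples u on [0, length) via the FFT."""
    n = u.size
    # Angular wavenumbers in the ordering used by numpy.fft.fft.
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    # Differentiation is multiplication by i*k in Fourier space.
    return np.real(np.fft.ifft(1j * k * np.fft.fft(u)))

n = 64
length = 2.0 * np.pi
x = np.arange(n) * length / n

u = np.exp(np.sin(x))        # smooth periodic test function
du_exact = np.cos(x) * u     # its analytic derivative

du = spectral_derivative(u, length)
max_error = np.max(np.abs(du - du_exact))

# The zero-wavenumber mode of a derivative vanishes, so the discrete
# integral of du is zero up to round-off; this is the mechanism behind
# discrete mass conservation in flux-form semi-discretizations.
mass_change = du.sum() * length / n
```

For a smooth periodic function the error decays faster than any power of the grid spacing, which is the "high-order accuracy" the abstract refers to. Non-smooth data such as peakons lose this property, which is one reason the structure-preserving formulation matters.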
Pub Date: 2024-05-10, DOI: 10.1016/j.cpc.2024.109239
Abhishek, Paul Stevenson, Yue Shi, Esra Yüksel, A.S. Umar
The Sky3D code has been widely used to describe nuclear ground states, collective vibrational excitations, and heavy-ion collisions. The approach is based on Skyrme forces or related energy density functionals. The static and dynamic equations are solved on a three-dimensional grid, and pairing is implemented in the BCS approximation. This updated version of the code aims to facilitate the calculation of nuclear strength functions in the regime of linear response theory, while retaining all existing functionality and use cases. The strength functions are benchmarked against available RPA codes, and the user is free to select the nature of the external excitation (from monopole to hexadecapole and beyond). Some utility programs are also provided that calculate the strength function from the time-dependent output of the dynamic calculations of the Sky3D code.
New version program summary
Program Title: Sky3D
CPC Library link to program files: https://doi.org/10.17632/vzbrzvyrn4.2
Programming language: Fortran, with one post-processing utility in Python.
Journal reference of previous version: Schuetrumpf, B., Reinhard, P.G., Stevenson, P.D., Umar, A.S., and Maruhn, J.A. (2018). The TDHF code Sky3D version 1.1. Comput. Phys. Commun. 229 (2018) 211–213.
Does the new version supersede the previous version?: Yes.
Reasons for the new version: The capability of reproducing the nuclear strength function for a variety of newly-coded external boosts has been added.
Nature of problem: Nuclear multipole strength functions are a crucial probe that can help model the nuclear system and its structure properties. A variety of models exist for calculating them, such as QRPA (Quasiparticle Random Phase Approximation) and its variants, but such approaches are often limited by symmetry constraints. Time-dependent Hartree–Fock (TDHF) has been used to simulate nuclear vibrations and low-energy collisions between nuclei without assuming any symmetry in the system. This code extends TDHF to calculate the multipole strength functions of atomic nuclei. We showcase its reliability by comparing it with established RPA codes for the calculation of such strength functions.
Solution method: We extended previous versions of the Sky3D code [1,2] to include an external boost of multipole type, where custom user input decides the nature of the multipole boost (monopole, quadrupole, octupole, and so on). The principal aim is to calculate the multipole strength function, which is the Fourier transform of the time-dependent expectation value of the multipole operator that has the same form as the external boost. With appropriate unit conversions, the calculated strength function is obtained in units comparable with the existing literature in the field. The boundary conditions are chosen so that a Woods-Saxon-like function cuts off the external field, making it zero at the boundary.
The new version implements a more general external excitation field, with a choice of multipolarity and isospin character. The moments for both isospin types and for multipolarities up to L=5 are then tracked as a function of time. A new analysis program is included to compute the strength functions and energy-weighted sum rules. The pairing force now accepts a mixing parameter, so that mixed surface-volume pairing can be selected in a way that is more consistent with QRPA codes for comparison. An updated Makefile is also included to make compilation on Apple silicon computers easier.
Developer's repository link: https://github.com/manybody/sky3d
Licensing provisions: GPLv3
References:
[1] J. Maruhn, P.-G. Reinhard, P. Stevenson, A. Umar, The TDHF code Sky3D, Comput. Phys. Commun. 185 (7) (2014) 2195–2216, https://doi.org/10.1016/j.cpc.2014.04.008
[2] B. Schuetrumpf, P.-G. Reinhard, P. Stevenson, A. Umar, J. Maruhn, The TDHF code Sky3D version 1.1, Comput. Phys. Commun. 229 (2018) 211–213, https://doi.org/10.1016/j.cpc.2018.03.012
Title: The TDHF code Sky3D version 1.2
Journal: Computer Physics Communications
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S0010465524001620/pdfft?md5=4a533df4e435e10a51c66bd31cf8a2bd&pid=1-s2.0-S0010465524001620-main.pdf
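The program summary describes the strength function as a Fourier transform of a time-dependent multipole moment. The sketch below illustrates that post-processing step on synthetic data only. It is not one of the Sky3D utility programs: the function name `strength_function`, the exponential smoothing window, the sign convention, and the normalization by the boost amplitude are all assumptions, and the input signal is a single artificial mode at 15 MeV.

```python
import numpy as np

HBARC = 197.327  # MeV fm, converts times in fm/c to energies in MeV

def strength_function(t, moment, energies, boost=1.0, gamma=1.0):
    """Imaginary part of the damped Fourier transform of moment(t).

    t        : uniformly spaced times in fm/c
    moment   : time-dependent multipole moment sampled at t
    energies : energies in MeV at which to evaluate the transform
    boost    : boost amplitude used to normalize (assumed convention)
    gamma    : smoothing width in MeV (hypothetical parameter)
    """
    signal = moment - moment[0]                   # response relative to t = 0
    window = np.exp(-gamma * t / (2.0 * HBARC))   # damps the finite-time cutoff
    dt = t[1] - t[0]
    phase = np.exp(1j * np.outer(energies, t) / HBARC)
    transform = (phase * (signal * window)).sum(axis=1) * dt
    return transform.imag / (np.pi * boost)

# Synthetic stand-in for a TDHF moment: one undamped mode at 15 MeV.
t = np.arange(0.0, 3000.0, 0.5)
moment = np.sin(15.0 * t / HBARC)
energies = np.linspace(0.0, 40.0, 401)

strength = strength_function(t, moment, energies)
peak_energy = energies[np.argmax(strength)]
```

With a real calculation, `moment` would be read from the time-dependent output, and the window width trades energy resolution against ringing caused by the finite simulation time.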
Pub Date: 2024-05-10, DOI: 10.1016/j.cpc.2024.109236
Tian Liang, Lin Fu
High-order shock-capturing schemes are one of the main building blocks for simulating compressible flows characterized by strong shockwaves and broadband length scales. However, classical shock-capturing schemes cannot perform long-time stable, non-dissipative simulations, because their inherent numerical dissipation prevents the quadratic invariants associated with the conservation equations from being conserved. In addition, their overall computational cost is high because of the time-consuming local characteristic decomposition and the computation of nonlinear weights. In this work, we propose a paradigm of high-order shock-capturing scheme built on a new efficient discontinuity indicator, which distinguishes non-smooth high-wavenumber fluctuations and discontinuities from smooth scales in wavenumber space. The scheme recasts the non-dissipative skew-symmetric-splitting method, with newly optimized dispersion properties, for smooth flow scales and invokes the nonlinear targeted ENO (TENO) schemes for non-smooth discontinuities. The resulting TENO-S scheme not only performs long-time stable computations for smooth flows without numerical dissipation, but also recovers robust shock-capturing capabilities with adaptive numerical dissipation. Without case-by-case parameter tuning, extensive benchmark simulations involving a wide range of flow length scales and strong discontinuities demonstrate that the proposed TENO-S scheme performs significantly better than the straightforward deployment of WENO/TENO-family schemes, with better spectral properties and higher computational efficiency.
Title: A new high-order shock-capturing TENO scheme combined with skew-symmetric-splitting method for compressible gas dynamics and turbulence simulation
Journal: Computer Physics Communications
Pub Date: 2024-05-10, DOI: 10.1016/j.cpc.2024.109234
Tiago S. Farias, Vitor V. Schultz, José C.M. Mombach, Jonas Maziero
We introduce a novel framework for simulating spin models using differentiable programming, an approach that leverages advances in machine learning tooling and computational efficiency. We focus on three distinct spin systems: the Ising model, the Potts model, and the Cellular Potts model, demonstrating the practicality and scalability of our framework in modeling these complex systems. Additionally, the framework allows spin models to be optimized by adjusting the parameters of a system according to a defined objective function. To simulate these models, we adapt the Metropolis-Hastings algorithm to a differentiable programming paradigm, employing batched tensors to simulate spin lattices. This adaptation not only facilitates integration with existing deep learning tools but also significantly enhances computational speed through parallel processing, as it can be implemented on different hardware architectures, including GPUs and TPUs.
Title: A differentiable programming framework for spin models
Journal: Computer Physics Communications
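The "batched tensors" formulation of Metropolis updates mentioned in the abstract can be sketched with plain array operations. The example below is an assumption-laden illustration, not the authors' framework: it uses NumPy as a stand-in for a deep learning library, treats only the 2D Ising model, uses a checkerboard update so that same-colored sites can be flipped simultaneously, and does not attempt the differentiable part. The name `metropolis_sweep` is hypothetical.

```python
import numpy as np

def metropolis_sweep(spins, beta, rng, coupling=1.0):
    """One checkerboard Metropolis sweep over a batch of Ising lattices.

    spins : array of shape (batch, L, L) with entries +1 or -1, L even,
            periodic boundaries in both lattice directions.
    """
    _, rows, cols = spins.shape
    ii, jj = np.indices((rows, cols))
    for color in (0, 1):
        # Sites of one color do not neighbor each other, so they can be
        # updated in parallel without violating detailed balance.
        mask = (ii + jj) % 2 == color
        neighbors = (np.roll(spins, 1, axis=1) + np.roll(spins, -1, axis=1)
                     + np.roll(spins, 1, axis=2) + np.roll(spins, -1, axis=2))
        delta_e = 2.0 * coupling * spins * neighbors   # energy cost of a flip
        accept = rng.random(spins.shape) < np.exp(-beta * delta_e)
        spins = np.where(accept & mask, -spins, spins)
    return spins

rng = np.random.default_rng(0)
spins = np.ones((4, 16, 16), dtype=int)   # four lattices, all spins up
for _ in range(50):
    spins = metropolis_sweep(spins, beta=1.0, rng=rng)

magnetization = spins.mean(axis=(1, 2))   # one value per lattice in the batch
```

Every operation acts on the whole `(batch, L, L)` array at once, which is what lets the same code map onto GPU or TPU tensor libraries. At `beta=1.0`, well below the critical temperature, the lattices stay almost fully magnetized.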
Pub Date: 2024-05-09, DOI: 10.1016/j.cpc.2024.109241
Guang Yang, Tong Liu, Xukang Lu, Moran Wang
We present Fast-QSGS, a GPU-accelerated program for granular disordered media generation. Built on vectorization, Fast-QSGS runs on modern GPUs through the NumPy-compatible API provided by CuPy. We also introduce a variable growth probability function and seed spacing control to improve the speed and accuracy of the original QSGS method. Computational performance benchmarks are conducted on both consumer-grade and professional-grade GPUs. Disordered media of size 400³ can be generated in 30 s on an A100 and 110 s on an RTX4060, a speedup of over 400 compared with the serial version. Physical benchmarks are conducted on the reconstruction of Fontainebleau sandstone and hydrated cement. Our results demonstrate that the permeability of the reconstructed Fontainebleau sandstone falls within the range of experimental values. Additionally, the average relative errors of the volume fraction of unhydrated cement and of the capillary porosity of hydrated cement are 1.9% and 3.4%, respectively, compared with Powers' law.
Title: Fast-QSGS: A GPU accelerated program for structure generation of granular disordered media
Journal: Computer Physics Communications
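A vectorized seed-and-grow loop of the kind the abstract describes can be sketched in a few lines. This is a simplified 2D illustration under stated assumptions, not the Fast-QSGS program: it uses NumPy rather than CuPy, a single constant growth probability for the four face neighbors (the paper introduces a variable growth probability and seed spacing control, neither of which is reproduced here), and periodic boundaries. The function name `qsgs_2d` and all parameter values are hypothetical.

```python
import numpy as np

def qsgs_2d(shape, target_fraction, seed_prob=0.01, growth_prob=0.2, rng=None):
    """Grow a random solid phase until it fills target_fraction of the grid."""
    rng = np.random.default_rng() if rng is None else rng

    # Step 1: scatter seed cells at random.
    solid = rng.random(shape) < seed_prob
    if not solid.any():
        solid.flat[rng.integers(solid.size)] = True   # guarantee one seed

    # Step 2: repeatedly let solid cells grow into empty neighbors.
    shifts = [(1, 0), (-1, 0), (1, 1), (-1, 1)]       # (step, axis) pairs
    while solid.mean() < target_fraction:
        grown = solid.copy()
        for step, axis in shifts:
            # Empty cells that have a solid neighbor in this direction.
            candidate = np.roll(solid, step, axis=axis) & ~solid
            grown |= candidate & (rng.random(shape) < growth_prob)
        solid = grown
    return solid

rng = np.random.default_rng(1)
media = qsgs_2d((64, 64), target_fraction=0.4, rng=rng)
solid_fraction = media.mean()
```

Because each growth pass touches the whole array with elementwise operations and no per-cell Python loop, it is the sort of code that can in principle run on a GPU through a NumPy-compatible array library. The loop stops at the first pass that reaches the target, so the final fraction slightly overshoots it.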
Pub Date: 2024-05-08, DOI: 10.1016/j.cpc.2024.109235
Liliya R. Safina, Elizaveta A. Rozhnova, Karina A. Krylova, Ramil T. Murzaev, Julia A. Baimova
Graphene-reinforced metal matrix composites represent a promising class of materials for surface coatings because of their high strength and ductility. This study applies different interatomic potentials to describe the interaction between graphene and metals (Al, Cu, Ni, and Ti) in molecular dynamics. Both simple pair potentials, such as Lennard-Jones and Morse, and many-body potentials, such as the bond-order potential, are applied to simulate graphene/metal systems at room temperature. Three different structures are considered: (i) graphene interacting with one metal atom; (ii) graphene interacting with a metal nanoparticle; and (iii) a three-dimensional graphene network filled with metal nanoparticles. We first determine the potential energy that each graphene/metal system can reach during exposure at 300 K; then, we analyze the interaction dynamics for all considered systems and all potentials. A considerable difference was found between the interaction of metal nanoparticles with planar graphene and with folded graphene. For graphene/Ni, graphene/Cu, and graphene/Ti, the Lennard-Jones and Morse potentials provide accurate energetic and structural properties of the studied structures; they also describe the interaction in the graphene/metal system in a similar way, in contrast to the bond-order potential. For graphene/Al, the Tersoff and Morse potentials describe the interaction better than Lennard-Jones. For simulations of graphene/metal (Me) systems, the optimal choice of potential for each structure is of crucial importance. The presented analysis of the interatomic potentials appears promising for realistic and accurate simulations of graphene-reinforced metal composites.
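For readers unfamiliar with the two pair potentials compared in this abstract, the sketch below writes out their standard functional forms. The parameter values are illustrative placeholders only, not the graphene/metal parameters used in the paper, and the function names are hypothetical.

```python
import numpy as np

def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones pair energy: 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def morse(r, depth, alpha, r0):
    """Morse pair energy: D*[exp(-2a(r-r0)) - 2*exp(-a(r-r0))]."""
    x = np.exp(-alpha * (r - r0))
    return depth * (x * x - 2.0 * x)

# Placeholder parameters (energies in eV, distances in angstrom).
epsilon, sigma = 0.03, 3.0
depth, alpha, r0 = 0.1, 1.5, 2.5

r_min_lj = 2.0 ** (1.0 / 6.0) * sigma       # analytic Lennard-Jones minimum
e_min_lj = lennard_jones(r_min_lj, epsilon, sigma)
e_min_morse = morse(r0, depth, alpha, r0)

r = np.linspace(2.0, 8.0, 601)
lj_curve = lennard_jones(r, epsilon, sigma)
morse_curve = morse(r, depth, alpha, r0)
```

Lennard-Jones has two parameters (well depth and size), while Morse adds a third that sets the width of the well independently of its position. Both depend only on the pair distance, unlike bond-order potentials, whose pair terms depend on the local environment.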
{"title":"Interatomic potentials for graphene reinforced metal composites: Optimal choice","authors":"Liliya R. Safina , Elizaveta A. Rozhnova , Karina A. Krylova , Ramil T. Murzaev , Julia A. Baimova","doi":"10.1016/j.cpc.2024.109235","DOIUrl":"https://doi.org/10.1016/j.cpc.2024.109235","url":null,"abstract":"<div><p>Graphene reinforced metal matrix composites represent a promising class of materials for high-strength surface coatings because of their high strength and ductility. This study reports the application of different interatomic potentials to correctly describe the interaction between graphene and metals (Al, Cu, Ni, and Ti) by molecular dynamics. Both simple pair potentials, such as Lennard-Jones and Morse, and many-body potentials, such as bond order potential are applied for the simulation of a graphene/metal system at room temperature. Three different structures are considered: (i) graphene interacting with one metal atom; (ii) graphene interacting with a metal nanoparticle, and (iii) three-dimensional graphene network filled with metal nanoparticles. We first determine the potential energy that any graphene/metal system can reach during exposure at 300 K; then, we analyze the interaction dynamics for all considered systems and all potentials. A considerable difference in the interaction between metal nanoparticles with planar and folded graphene was found. For graphene/Ni, graphene/Cu, and graphene/Ti, the Lennard-Jones and Morse potentials provide accurate energetic and structural properties of the studied structures; they also describe interaction in the graphene/metal system in a similar way, at variance with bond-order potential. For graphene/Al, the Tersoff and Morse potentials describe the interaction better than Lennard-Jones. For the simulation of graphene/Me system, the optimal choice of the potential for different structures is of crucial importance. 
The presented analysis of the interatomic potentials appears to be promising for realistic and accurate simulations of graphene reinforced metal composites.</p></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":null,"pages":null},"PeriodicalIF":6.3,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140918562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-08 DOI: 10.1016/j.cpc.2024.109233
Kejia Pan , Jin Li , Zhilin Li
This paper is devoted to fourth-order compact schemes and fast algorithms for numerically solving the stationary Stokes equations with different boundary conditions. One of the main ideas is to decouple the Stokes equations into three Poisson equations for the pressure and the velocity via the pressure Poisson equation (PPE). An augmented strategy is used to provide numerical boundary conditions for the pressure. Different velocity boundary conditions require different interpolation strategies for the augmented methods. The augmented variable is solved for by the GMRES method. A new, simple, and efficient preconditioning strategy has also been developed to accelerate the convergence of the GMRES iteration. Numerical examples presented in this paper confirm the designed convergence order and the efficiency of the new methods.
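The decoupling idea can be illustrated with a short derivation. This is a sketch under assumptions not stated in the abstract: two space dimensions, unit viscosity, velocity (u, v), pressure p, and body force (f_1, f_2). The paper's own formulation and sign conventions may differ.

```latex
\begin{aligned}
&\text{Stokes system:} && -\Delta \mathbf{u} + \nabla p = \mathbf{f}, \qquad \nabla\cdot\mathbf{u} = 0, \qquad \mathbf{u} = (u, v),\ \mathbf{f} = (f_1, f_2). \\
&\text{Take the divergence and use } \nabla\cdot\mathbf{u} = 0: && \Delta p = \nabla\cdot\mathbf{f} = \partial_x f_1 + \partial_y f_2. \\
&\text{Momentum equations with } p \text{ known:} && \Delta u = \partial_x p - f_1, \qquad \Delta v = \partial_y p - f_2.
\end{aligned}
```

Under these assumptions the system reduces to three Poisson equations, one for p and one for each velocity component. The pressure equation comes without a physical boundary condition, which is the gap the augmented strategy described in the abstract is meant to fill.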
{"title":"High order compact augmented methods for Stokes equations with different boundary conditions","authors":"Kejia Pan , Jin Li , Zhilin Li","doi":"10.1016/j.cpc.2024.109233","DOIUrl":"https://doi.org/10.1016/j.cpc.2024.109233","url":null,"abstract":"<div><p>This paper is devoted to fourth order compact schemes and fast algorithms for solving stationary Stokes equations with different boundary conditions numerically. One of the main ideas is to decouple the Stokes equations into three Poisson equations for the pressure and the velocity via the pressure Poisson equation (PPE). The augmented strategy is utilized to provide numerical boundary conditions for the pressure. Different velocity boundary conditions require different interpolation strategies for the augmented methods. The augmented variable is solved by the GMRES method. A new simple and efficient preconditioning strategy has also been developed to accelerate the convergence of the GMRES iteration. Numerical examples presented in this paper confirmed the designed convergence order and the efficiency of the new methods.</p></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":null,"pages":null},"PeriodicalIF":6.3,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140952156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-06 DOI: 10.1016/j.cpc.2024.109232
A.D. Sanchez , S. Chaitanya Kumar , M. Ebrahim-Zadeh
We present an object-oriented programming (OOP) CUDA-based package for fast and accurate simulation of second-harmonic generation (SHG) efficiency using focused Gaussian beams. The model includes linear as well as two-photon absorption, which can ultimately lead to thermal lensing due to self-heating effects. Our approach speeds up calculations by nearly 40x without temperature profiles and 11x with them, relative to an equivalent CPU implementation. The package offers a valuable tool for experimental design and for the study of 3D field propagation in nonlinear three-wave interactions. It is useful for optimizing SHG-based experiments and for mitigating undesired thermal effects, enabling improved oven designs and advanced device architectures, leading to stable, efficient high-power SHG.
Program summary
Program Title: cuSHG
CPC Library link to program files: https://doi.org/10.17632/hn76s7x848.1
Nature of problem: This work addresses the degradation of second-harmonic generation (SHG) performance in a nonlinear crystal pumped with focused Gaussian beams, caused by thermal effects. With the nonlinear crystal placed in a temperature-controlled oven, the package computes the electric fields involved along the medium. The implemented model includes linear and nonlinear absorption, which can lead to self-heating and degrade the performance of the SHG.
Solution method: The coupled differential equations for three-wave interactions, which describe the field evolution along the crystal, are solved using the well-known Split-Step Fourier method. The temperature profiles are estimated using the finite-element method. The field evolution and thermal effects are embedded in a self-consistent algorithm that sequentially and separately solves the electromagnetic and thermal problems until the system reaches the steady state. Because some problems can be computationally demanding, we chose to implement the coupled equations in the /CUDA programming language. This allows us to significantly speed up simulations, thanks to the computing power provided by a graphics processing unit (GPU) card. The output files contain the interacting electric fields and the temperature profile, which have to be analyzed during post-processing.
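The Split-Step Fourier idea mentioned in the solution method can be sketched in a few lines. The example below is a toy illustration and not the cuSHG implementation: it assumes a single transverse dimension, normalized units, no absorption or thermal effects, a first-order (Euler) nonlinear sub-step, and a naive DFT standing in for a GPU FFT library. All function names and parameter values are hypothetical.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT library)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    twiddle = [cmath.exp(sign * 2j * math.pi * k / n) for k in range(n)]
    out = [sum(values[j] * twiddle[(m * j) % n] for j in range(n)) for m in range(n)]
    return [v / n for v in out] if inverse else out


def diffract(field, k, kx, dz):
    """Linear sub-step: paraxial diffraction applied exactly in Fourier space."""
    spectrum = dft(field)
    spectrum = [s * cmath.exp(-1j * q * q * dz / (2.0 * k)) for s, q in zip(spectrum, kx)]
    return dft(spectrum, inverse=True)


def shg_split_step(a1, a2, k1, kappa, dk, dx, dz, steps):
    """Propagate fundamental a1 and second harmonic a2 over steps * dz."""
    n = len(a1)
    kx = [2.0 * math.pi * (m if m < n // 2 else m - n) / (n * dx) for m in range(n)]
    z = 0.0
    for _ in range(steps):
        # Linear sub-step for each field (the second harmonic has wavenumber 2*k1).
        a1 = diffract(a1, k1, kx, dz)
        a2 = diffract(a2, 2.0 * k1, kx, dz)
        # Nonlinear sub-step in real space (first-order Euler update).
        phase = cmath.exp(1j * dk * z)
        new_a1 = [u + dz * 1j * kappa * v * u.conjugate() / phase for u, v in zip(a1, a2)]
        new_a2 = [v + dz * 1j * kappa * u * u * phase for u, v in zip(a1, a2)]
        a1, a2 = new_a1, new_a2
        z += dz
    return a1, a2


def power(field, dx):
    """Integrated intensity of a sampled field."""
    return sum(abs(v) ** 2 for v in field) * dx


# Gaussian fundamental beam, no second harmonic at the input, phase matched (dk = 0).
n, dx = 64, 0.25
x = [(j - n // 2) * dx for j in range(n)]
a1_in = [complex(math.exp(-xi * xi)) for xi in x]
a2_in = [0j] * n
a1_out, a2_out = shg_split_step(a1_in, a2_in, k1=10.0, kappa=1.0, dk=0.0,
                                dx=dx, dz=0.01, steps=50)
p_in = power(a1_in, dx)
p_out = power(a1_out, dx) + power(a2_out, dx)
efficiency = power(a2_out, dx) / p_in
print(efficiency, p_out / p_in)
```

In this lossless toy model the total power should stay nearly constant while power moves from the fundamental to the second harmonic, which makes a convenient sanity check. A production code would typically use a higher-order nonlinear step and a library FFT, and would add the absorption and temperature coupling described above.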
{"title":"CUDA-based focused Gaussian beams second-harmonic generation efficiency calculator","authors":"A.D. Sanchez , S. Chaitanya Kumar , M. Ebrahim-Zadeh","doi":"10.1016/j.cpc.2024.109232","DOIUrl":"10.1016/j.cpc.2024.109232","url":null,"abstract":"<div><p>We present an object-oriented programming (OOP) CUDA-based package for fast and accurate simulation of second-harmonic generation (SHG) efficiency using focused Gaussian beams. The model includes linear as well as two-photon absorption that can ultimately lead to thermal lensing due to self-heating effects. Our approach speeds up calculations by nearly 40x (11x) without (with) temperature profiles with respect to an equivalent implementation using CPU. The package offers a valuable tool for experimental design and study of 3D field propagation in nonlinear three-wave interactions. It is useful for optimization of SHG-based experiments and mitigates undesired thermal effects, enabling improved oven designs and advanced device architectures, leading to stable, efficient high-power SHG.</p></div><div><h3>Program summary</h3><p><em>Program Title:</em> <span>cuSHG</span></p><p><em>CPC Library link to program files:</em> <span>https://doi.org/10.17632/hn76s7x848.1</span><svg><path></path></svg></p><p><em>Developer's repository link:</em> <span>https://github.com/alfredos84/cuSHG</span><svg><path></path></svg></p><p><em>Licensing provisions:</em> MIT</p><p><em>Programming language:</em> <figure><img></figure>, CUDA</p><p><em>Nature of problem:</em> The problem which is solved in this work is that of second-harmonic generation (SHG) performance degradation in a nonlinear crystal with focused Gaussian beams due to thermal effects. By placing the nonlinear crystal in an oven that controls temperature, the package computes the involved electric fields along the medium. 
The implemented model includes the linear and nonlinear absorption which occasionally lead to self-heating effect, degrading the performance of the SHG.</p><p><em>Solution method:</em> The coupled differential equations for three-wave interactions, which describe the field evolution along the crystal, are solved using the well-known Split-Step Fourier method. The temperature profiles are estimated using the finite-elements method. The field evolution and thermal effects are embedded in a self-consistent algorithm that sequentially and separately solves the electromagnetic and thermal problems until the system reaches the steady state. Due to the eventual computational demand that some problems may have, we chose to implement the coupled equations in the <figure><img></figure>/CUDA programming language. This allows us to significantly speed up simulations, thanks to the computing power provided by a graphics processing unit (GPU) card. The output files obtained are the interacting electric fields and the temperature profile, which have to be analyzed during post-processing.</p></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":null,"pages":null},"PeriodicalIF":6.3,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141035826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}