
Computer Physics Communications: Latest Articles

Optimized thread-block arrangement in a GPU implementation of a linear solver for atmospheric chemistry mechanisms
IF 6.3; Zone 2 (Physics and Astrophysics); Q1 Physics and Astronomy; Pub Date: 2024-05-13; DOI: 10.1016/j.cpc.2024.109240
Christian Guzman Ruiz, Mario Acosta, Oriol Jorba, Eduardo Cesar Galobardes, Matthew Dawson, Guillermo Oyarzun, Carlos Pérez García-Pando, Kim Serradell

Earth system models (ESM) demand significant hardware resources and energy consumption to solve atmospheric chemistry processes. Recent studies have shown improved performance from running these models on GPU accelerators. Nonetheless, there is room for improvement in exploiting even more GPU resources.

This study proposes an optimized distribution of the chemical solver's computational load on the GPU, named Block-cells. Additionally, we evaluate different configurations for distributing the computational load in an NVIDIA GPU.

We use the linear solver from the Chemistry Across Multiple Phases (CAMP) framework as our test bed, with an intermediate-complexity chemical mechanism under typical atmospheric conditions. Results demonstrate a 35× speedup compared to the single-threaded CPU reference case. Even when the reference case uses the full resources of the node (40 physical cores), the Block-cells version outperforms it by 50%. The Block-cells approach shows promise in alleviating the computational burden of chemical solvers on GPU architectures.
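The two speedup figures quoted above can be related with a little arithmetic. The sketch below is only an illustration: it assumes that "outperforms by 50%" means the GPU run is 1.5 times faster than the 40-core run, which the abstract does not state explicitly, and the function name is hypothetical.

```python
def implied_multicore_speedup(gpu_vs_single_thread: float,
                              gpu_vs_full_node: float) -> float:
    """Speedup of the full-node CPU run over one CPU thread, implied by the two GPU ratios."""
    return gpu_vs_single_thread / gpu_vs_full_node


gpu_speedup = 35.0    # GPU vs. single CPU thread (from the abstract)
gpu_over_node = 1.5   # ASSUMPTION: "outperforms by 50%" read as 1.5x faster
cores = 40            # physical cores in the node (from the abstract)

node_speedup = implied_multicore_speedup(gpu_speedup, gpu_over_node)
efficiency = node_speedup / cores

print(f"implied 40-core speedup: {node_speedup:.1f}x")
print(f"implied parallel efficiency: {efficiency:.0%}")
```

Under that reading, the 40-core reference run would be roughly 23 times faster than a single thread. A different reading of "50%" would change these derived numbers, so they should not be taken as results from the paper.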

Citations: 0
RHDLPP: A multigroup radiation hydrodynamics code for laser-produced plasmas
IF 6.3; Zone 2 (Physics and Astrophysics); Q1 Physics and Astronomy; Pub Date: 2024-05-11; DOI: 10.1016/j.cpc.2024.109242
Qi Min, Ziyang Xu, Siqi He, Haidong Lu, Xingbang Liu, Ruizi Shen, Yanhong Wu, Qikun Pan, Chongxiao Zhao, Fei Chen, Maogen Su, Chenzhong Dong

In this paper, we introduce RHDLPP, a flux-limited multigroup radiation hydrodynamics numerical code designed for simulating laser-produced plasmas in diverse environments. The code comprises two packages: RHDLPP-LTP for low-temperature plasmas generated by moderate-intensity nanosecond lasers, and RHDLPP-HTP for high-temperature, high-density plasmas formed by high-intensity laser pulses. The core radiation hydrodynamic equations are solved in the Eulerian frame using an operator-split method. This method decomposes the solution into two substeps: first, the explicit solution of the hyperbolic subsystems integrating radiation and fluid dynamics; second, the implicit treatment of the parabolic part comprising stiff radiation diffusion, heat conduction, and energy exchange. Laser propagation and energy deposition are modeled through a hybrid approach, combining geometrical-optics ray-tracing in sub-critical plasma regions with a one-dimensional solution of the Helmholtz wave equation in super-critical areas. The thermodynamic states are determined using an equation of state, based on either the real gas approximation or the quotidian equation of state (QEOS). For ionization calculations, the code employs a steady-state collisional-radiative (CR) model using the screened-hydrogenic approximation. Additionally, RHDLPP includes RHDLPP-SpeIma3D, a three-dimensional spectral simulation post-processing module, for generating both temporally-spatially resolved and time-integrated spectra and imaging, facilitating direct comparisons with experimental data. The paper showcases a series of verification tests to establish the code's accuracy and efficiency, followed by application cases, including simulations of laser-produced aluminium (Al) plasmas, pre-pulse-induced target deformation of tin (Sn) microdroplets relevant to extreme ultraviolet lithography light sources, and varied imaging and spectroscopic simulations. These simulations highlight RHDLPP's effectiveness and applicability in fields such as laser-induced breakdown spectroscopy, extreme ultraviolet lithography sources, and high-energy-density physics.
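The two-substep operator splitting described above (explicit hyperbolic step, then implicit parabolic step) can be illustrated on a toy problem. The sketch below is not RHDLPP code: it assumes a 1D periodic advection-diffusion equation as a stand-in for the radiation hydrodynamics system, uses first-order upwind and backward Euler as the simplest possible substep solvers, and all names and parameter values are hypothetical.

```python
import numpy as np


def split_step(u, velocity, diffusivity, dx, dt):
    """One first-order (Lie) split step on a periodic 1D grid."""
    # Substep 1 (hyperbolic part, explicit): first-order upwind advection.
    # Assumes velocity > 0 and a CFL number velocity*dt/dx <= 1.
    u_star = u - velocity * dt / dx * (u - np.roll(u, 1))

    # Substep 2 (parabolic part, implicit): backward Euler diffusion,
    # i.e. solve (I - dt * diffusivity * Laplacian) u_new = u_star.
    n = u.size
    r = diffusivity * dt / dx**2
    A = (1.0 + 2.0 * r) * np.eye(n) - r * (np.eye(n, k=1) + np.eye(n, k=-1))
    A[0, -1] -= r   # periodic wrap-around entries
    A[-1, 0] -= r
    return np.linalg.solve(A, u_star)


n = 64
dx = 1.0 / n
x = np.arange(n) * dx
u0 = np.exp(-200.0 * (x - 0.3) ** 2)   # initial Gaussian pulse
dt = 0.5 * dx                          # advective CFL number of 0.5

u = u0.copy()
for _ in range(100):
    # diffusivity chosen so r = 1.6, beyond the explicit diffusion limit of 0.5
    u = split_step(u, velocity=1.0, diffusivity=0.05, dx=dx, dt=dt)

print("mass before:", u0.sum() * dx, "mass after:", u.sum() * dx)
```

The point of the split is that the time step is set by the advective CFL condition only; the stiff diffusion term is handled implicitly and so does not impose its own, much smaller, stability limit.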

Citations: 0
Direct/split invariant-preserving Fourier pseudo-spectral methods for the rotation-two-component Camassa–Holm system with peakon solitons
IF 6.3; Zone 2 (Physics and Astrophysics); Q1 Physics and Astronomy; Pub Date: 2024-05-10; DOI: 10.1016/j.cpc.2024.109237
Qifeng Zhang, Tong Yan, Dinghua Xu, Yong Chen

The Fourier pseudo-spectral method is well suited to solving PDEs under periodic boundary conditions owing to its high-order accuracy and ease of implementation. In this paper, we explore and compare four classes of Fourier pseudo-spectral schemes for solving the rotation-two-component Camassa–Holm system, which may admit peakon solitons. By exploiting inherent structural properties of the system, we reformulate it into two different equivalent forms and then apply the Fourier pseudo-spectral method to derive two spatial semi-discrete systems, both of which are proved to preserve the corresponding invariants, including mass, momentum and energy. Subsequently, we construct two linearly implicit schemes based on the Strang splitting technique and two nonlinear schemes for the two semi-discrete systems. Owing to the different equivalent forms, one of the nonlinear schemes preserves discrete mass and momentum, while the other is shown to preserve all three invariants. Numerical results with smooth and nonsmooth initial values are provided for distinct types of solutions to test the accuracy in long-time simulation and to verify the capacity for predicting water wave propagation, as well as the advantages in preserving these invariants. For instance, the present schemes are shown to be accurate to at least 14 significant digits, improving upon the 10 digits of schemes in previous references.
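The high-order accuracy attributed to the Fourier pseudo-spectral method above can be seen in a few lines. The sketch below is not taken from the paper: it only differentiates a smooth periodic test function with NumPy's FFT, and the function name and test function are hypothetical choices for illustration.

```python
import numpy as np


def spectral_derivative(u, length):
    """First derivative of periodic samples u on [0, length) via the FFT."""
    n = u.size
    # Angular wavenumbers matching NumPy's FFT ordering.
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    return np.real(np.fft.ifft(1j * k * np.fft.fft(u)))


n = 32
length = 2.0 * np.pi
x = np.arange(n) * length / n

u = np.exp(np.sin(x))          # smooth periodic test function
exact = np.cos(x) * u          # its analytic derivative
du = spectral_derivative(u, length)

err = np.max(np.abs(du - exact))
print(f"max error with {n} points: {err:.2e}")
```

With only 32 grid points the error is already close to rounding level for this smooth function, which is the spectral accuracy that makes the method attractive on periodic domains. Non-smooth data such as peakons converge more slowly, which is one reason the choice of scheme matters in that setting.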

Citations: 0
The TDHF code Sky3D version 1.2
IF 6.3; Zone 2 (Physics and Astrophysics); Q1 Physics and Astronomy; Pub Date: 2024-05-10; DOI: 10.1016/j.cpc.2024.109239
Abhishek, Paul Stevenson, Yue Shi, Esra Yüksel, A.S. Umar

The Sky3D code has been widely used to describe nuclear ground states, collective vibrational excitations, and heavy-ion collisions. The approach is based on Skyrme forces or related energy density functionals. The static and dynamic equations are solved on a three-dimensional grid, and pairing is implemented in the BCS approximation. This updated version of the code aims to facilitate the calculation of nuclear strength functions in the regime of linear response theory, while retaining all existing functionality and use cases. The strength functions are benchmarked against available RPA codes, and the user is free to choose the nature of the external excitation (from monopole to hexadecapole and more). Some utility programs are also provided that calculate the strength function from the time-dependent output of the dynamic calculations of the Sky3D code.
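The program summary for this entry describes the strength function as the Fourier transform of a time-dependent expectation value. The sketch below illustrates that idea on a synthetic signal; it is not one of the Sky3D utility programs. It assumes units with ħ = 1, an arbitrary normalisation, an exponential damping window, and an e^{+iEt} sign convention, and the function name and all numbers are hypothetical.

```python
import numpy as np


def strength_function(times, moment, energies, damping):
    """Imaginary part of the damped Fourier transform of moment(t) at each energy.

    Assumes hbar = 1 and an arbitrary overall normalisation.
    """
    dt = times[1] - times[0]
    window = np.exp(-damping * times)              # smooths truncation artefacts
    phase = np.exp(1j * np.outer(energies, times))
    transform = phase @ (moment * window) * dt     # Riemann-sum approximation
    return transform.imag


# Synthetic stand-in for a multipole moment oscillating after a boost.
omega0 = 15.0
times = np.arange(0.0, 40.0, 0.01)
moment = np.sin(omega0 * times)

energies = np.linspace(0.0, 30.0, 301)
strength = strength_function(times, moment, energies, damping=0.2)

peak_energy = energies[np.argmax(strength)]
print("peak of the strength function at E =", peak_energy)
```

For this single-mode signal the result is a Lorentzian-like peak centred on the oscillation frequency, with a width set by the damping parameter. A real calculation would use the recorded multipole moment in place of the synthetic sine and would apply the physical units and normalisation described in the paper.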

New version program summary

Program Title: Sky3D

CPC Library link to program files: https://doi.org/10.17632/vzbrzvyrn4.2

Developer's repository link: https://github.com/manybody/sky3d

Licensing provisions: GPLv3

Programming language: Fortran, with one post-processing utility in Python.

Journal reference of previous version: Schuetrumpf, B., Reinhard, P.G., Stevenson, P.D., Umar, A.S., and Maruhn, J.A. (2018). The TDHF code Sky3D version 1.1. Comput. Phys. Commun. 229, 211–213.

Does the new version supersede the previous version?: Yes.

Reasons for the new version: The capability of reproducing the nuclear strength function for a variety of newly-coded external boosts has been added.

Nature of problem: Nuclear multipole strength functions are a crucial probe that can help model the nuclear system and its structure properties. A variety of models exist for calculating them, such as QRPA (Quasiparticle Random Phase Approximation) and its variants, but such approaches are often limited by symmetry constraints. Time-dependent Hartree-Fock (TDHF) has been used to simulate nuclear vibrations and low-energy collisions between nuclei without assuming any symmetry in the system. This code extends TDHF to calculate the multipole strength functions of atomic nuclei. We showcase its reliability by comparing it with established RPA codes for the calculation of such strength functions.

Solution method: We extended previous versions of the Sky3D code [1,2] to include an external boost of multipole type where the user can provide custom input that decides the nature of the multipole (monopole, quadrupole, octupole, and so on) boost. The principal aim is to calculate the multipole strength function, which is the Fourier transform of the time-dependent expectation value of the multipole operator, taken in the same form as the external boost. With appropriate unit conversions, the calculated strength function can be expressed in exact units comparable with the existing literature in the field. The boundary conditions are chosen so that a Woods-Saxon-like function cuts off the external field, making it zero at the boundary.

The ability to supply a more general external excitation field has been implemented, with a choice of multipolarity and isospin character. The moments of both isospins and of multipolarities up to L=5 are then tracked as a function of time. A new analysis program is included to calculate the strength function and the energy-weighted sum rule. The mixing parameter in the pairing force can now also be specified, which allows mixed surface-volume pairing to be selected in a way that is more consistent with QRPA codes for comparison. An updated Makefile is also included to make compilation on Apple silicon computers easier.

[1] J. Maruhn, P.-G. Reinhard, P. Stevenson, A. Umar, The TDHF code Sky3D, Comput. Phys. Commun. 185 (7) (2014) 2195–2216, https://doi.org/10.1016/j.cpc.2014.04.008.

[2] B. Schuetrumpf, P.-G. Reinhard, P. Stevenson, A. Umar, J. Maruhn, The TDHF code Sky3D version 1.1, Comput. Phys. Commun. 229 (2018) 211–213, https://doi.org/10.1016/j.cpc.2018.03.012.
Citations: 0
A new high-order shock-capturing TENO scheme combined with skew-symmetric-splitting method for compressible gas dynamics and turbulence simulation
IF 6.3; Zone 2 (Physics and Astrophysics); Q1 Physics and Astronomy; Pub Date: 2024-05-10; DOI: 10.1016/j.cpc.2024.109236
Tian Liang, Lin Fu

The high-order shock-capturing scheme is one of the main building blocks for simulating compressible flows characterized by strong shockwaves and broadband length scales. However, classical shock-capturing schemes cannot perform long-time stable and non-dissipative simulations, because their inherent numerical dissipation prevents the quadratic invariants associated with the conservation equations from being conserved. In addition, classical shock-capturing schemes are computationally expensive owing to the time-consuming local characteristic decomposition and the computation of nonlinear weights. This work proposes a paradigm of high-order shock-capturing scheme built on a new efficient discontinuity indicator, which distinguishes non-smooth high-wavenumber fluctuations and discontinuities from smooth scales in wavenumber space. The scheme recasts the non-dissipative skew-symmetric-splitting method, with a newly optimized dispersion property, for smooth flow scales and invokes the nonlinear targeted ENO (TENO) schemes for non-smooth discontinuities. The resulting TENO-S scheme not only performs long-time stable computations for smooth flows without numerical dissipation, but also recovers robust shock-capturing capabilities with adaptive numerical dissipation. Without the need for case-by-case parameter tuning, extensive benchmark simulations involving a wide range of flow length scales and strong discontinuities demonstrate that the proposed TENO-S scheme performs significantly better than the straightforward deployment of WENO/TENO-family schemes, with better spectral properties and higher computational efficiency.
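The link between skew-symmetric splitting and the conservation of quadratic invariants, mentioned above, can be checked numerically on a much simpler model. The sketch below is not the TENO-S scheme: it assumes the inviscid Burgers equation on a periodic grid as a stand-in for the compressible equations, uses second-order central differences, and all function names are hypothetical.

```python
import numpy as np


def central_diff(f, dx):
    """Second-order central difference on a periodic grid."""
    return (np.roll(f, -1) - np.roll(f, 1)) / (2.0 * dx)


def rhs_divergence(u, dx):
    """Burgers right-hand side in divergence form: -(u^2 / 2)_x."""
    return -central_diff(0.5 * u * u, dx)


def rhs_skew_symmetric(u, dx):
    """Burgers right-hand side in skew-symmetric form: -[(u^2)_x + u u_x] / 3."""
    return -(central_diff(u * u, dx) + u * central_diff(u, dx)) / 3.0


rng = np.random.default_rng(0)
n = 64
dx = 2.0 * np.pi / n
u = rng.standard_normal(n)      # deliberately rough, under-resolved data

# Semi-discrete rate of change of the quadratic invariant sum(u^2 / 2) * dx.
energy_rate_div = np.sum(u * rhs_divergence(u, dx)) * dx
energy_rate_skew = np.sum(u * rhs_skew_symmetric(u, dx)) * dx

print("divergence form     d(energy)/dt =", energy_rate_div)
print("skew-symmetric form d(energy)/dt =", energy_rate_skew)
```

Because the periodic central-difference operator is antisymmetric, the two terms of the skew-symmetric form cancel exactly in the discrete energy balance, so the quadratic invariant is conserved to rounding error without any numerical dissipation. The divergence form has no such guarantee on under-resolved data.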

Citations: 0
A differentiable programming framework for spin models
IF 6.3; Zone 2 (Physics and Astrophysics); Q1 Physics and Astronomy; Pub Date: 2024-05-10; DOI: 10.1016/j.cpc.2024.109234
Tiago S. Farias, Vitor V. Schultz, José C.M. Mombach, Jonas Maziero

We introduce a novel framework for simulating spin models using differentiable programming, an approach that leverages advances in machine learning and computational efficiency. We focus on three distinct spin systems: the Ising model, the Potts model, and the Cellular Potts model, demonstrating the practicality and scalability of our framework in modeling these complex systems. Additionally, this framework allows for the optimization of spin models, in which the parameters of a system are adjusted according to a defined objective function. To simulate these models, we adapt the Metropolis-Hastings algorithm to a differentiable programming paradigm, employing batched tensors for simulating spin lattices. This adaptation not only facilitates integration with existing deep learning tools but also significantly enhances computational speed through parallel processing, as it can be implemented on different hardware architectures, including GPUs and TPUs.
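The idea of simulating many spin lattices at once with batched tensors can be sketched with plain array operations. The code below is not the authors' framework and is not differentiable: it assumes a 2D Ising model with coupling J = 1, periodic boundaries, an even lattice size, and a checkerboard update order, written in NumPy for illustration only. All names and parameter values are hypothetical.

```python
import numpy as np


def metropolis_sweep(spins, beta, rng):
    """One checkerboard Metropolis sweep over a batch of 2D Ising lattices.

    spins has shape (batch, L, L) with entries +1 or -1; L must be even.
    """
    size = spins.shape[-1]
    rows, cols = np.indices((size, size))
    for parity in (0, 1):
        mask = (rows + cols) % 2 == parity
        # Sum of the four nearest neighbours, for every lattice in the batch.
        neighbours = (np.roll(spins, 1, axis=-1) + np.roll(spins, -1, axis=-1)
                      + np.roll(spins, 1, axis=-2) + np.roll(spins, -1, axis=-2))
        delta_e = 2.0 * spins * neighbours            # energy cost of a flip
        # Metropolis rule min(1, exp(-beta * dE)), written without overflow.
        accept_prob = np.exp(-beta * np.maximum(delta_e, 0.0))
        flip = (rng.random(spins.shape) < accept_prob) & mask
        spins = np.where(flip, -spins, spins)
    return spins


rng = np.random.default_rng(0)
lattices = rng.choice(np.array([-1, 1], dtype=np.int8), size=(8, 16, 16))
for _ in range(10):
    lattices = metropolis_sweep(lattices, beta=0.6, rng=rng)

print("magnetisation per lattice:", lattices.mean(axis=(1, 2)))
```

The checkerboard ordering matters here: sites of one colour have no neighbours of the same colour, so a whole sublattice can be updated in one vectorised operation without the updates interfering. Replacing NumPy with a tensor library that runs on GPUs or TPUs would keep the same structure.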

Citations: 0
Fast-QSGS: A GPU accelerated program for structure generation of granular disordered media
IF 6.3; Zone 2 (Physics and Astrophysics); Q1 Physics and Astronomy; Pub Date: 2024-05-09; DOI: 10.1016/j.cpc.2024.109241
Guang Yang, Tong Liu, Xukang Lu, Moran Wang

We present Fast-QSGS, a GPU-accelerated program for generating granular disordered media. Fast-QSGS is built on vectorization and runs on modern GPUs through the NumPy-compatible API provided by CuPy. We also introduce a variable growth probability function and seed spacing control to improve the speed and accuracy of the original QSGS method. Computational performance benchmarks are conducted on both consumer-grade and professional-grade GPUs. Generation of disordered media of size 400^3 can be completed in 30 s on an A100 and 110 s on an RTX4060, a speedup of over 400 compared with the serial version. Physical benchmarks are conducted on the reconstruction of Fontainebleau sandstone and hydrated cement. Our results demonstrate that the permeability of the reconstructed Fontainebleau sandstone falls within the range of experimental values. Additionally, the average relative errors of the volume fraction of unhydrated cement and the capillary porosity of hydrated cement are 1.9 % and 3.4 %, respectively, compared with Powers' law.
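
As a rough illustration of the vectorized growth idea mentioned in the abstract, the following NumPy sketch grows a 2D binary structure from random seeds until a target solid fraction is reached. It is not the Fast-QSGS code: the function name `qsgs_2d`, its parameters, the constant growth probability, and the periodic boundaries implied by `np.roll` are all assumptions made for illustration, and the variable growth probability function and seed spacing control described above are omitted.

```python
import numpy as np

def qsgs_2d(shape, target_fraction, seed_prob=0.01, grow_prob=0.2,
            rng=None, max_iter=10_000):
    """Hypothetical QSGS-style growth on a 2D periodic grid.

    Returns a boolean array in which True marks the solid phase.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Step 1: scatter seed cells with probability seed_prob.
    solid = rng.random(shape) < seed_prob
    if not solid.any():
        # Guarantee at least one seed so growth can start.
        solid.flat[rng.integers(solid.size)] = True

    # (shift, axis) pairs for the four nearest neighbours.
    shifts = [(1, 0), (-1, 0), (1, 1), (-1, 1)]

    # Step 2: grow all cells at once, one array operation per direction.
    for _ in range(max_iter):
        if solid.mean() >= target_fraction:
            break
        grown = solid.copy()
        for shift, axis in shifts:
            neighbour = np.roll(solid, shift, axis=axis)
            # An empty cell next to a solid one becomes solid
            # with probability grow_prob.
            grown |= neighbour & (rng.random(shape) < grow_prob)
        solid = grown
    return solid
```

Because a whole growth sweep is applied before the fraction is checked again, the final solid fraction can overshoot the target slightly. Since the loop body consists only of array operations, it is the kind of code that could plausibly be moved to a GPU by substituting CuPy arrays for NumPy arrays, although that is not tested here.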

Citations: 0
Interatomic potentials for graphene reinforced metal composites: Optimal choice
IF 6.3 Zone 2 (Physics and Astrophysics) Q1 Physics and Astronomy Pub Date: 2024-05-08 DOI: 10.1016/j.cpc.2024.109235
Liliya R. Safina, Elizaveta A. Rozhnova, Karina A. Krylova, Ramil T. Murzaev, Julia A. Baimova

Graphene reinforced metal matrix composites represent a promising class of materials for high-strength surface coatings because of their high strength and ductility. This study reports the application of different interatomic potentials to correctly describe the interaction between graphene and metals (Al, Cu, Ni, and Ti) by molecular dynamics. Both simple pair potentials, such as Lennard-Jones and Morse, and many-body potentials, such as the bond-order potential, are applied to simulate a graphene/metal system at room temperature. Three different structures are considered: (i) graphene interacting with one metal atom; (ii) graphene interacting with a metal nanoparticle; and (iii) a three-dimensional graphene network filled with metal nanoparticles. We first determine the potential energy that any graphene/metal system can reach during exposure at 300 K; then, we analyze the interaction dynamics for all considered systems and all potentials. A considerable difference was found between the interaction of metal nanoparticles with planar graphene and with folded graphene. For graphene/Ni, graphene/Cu, and graphene/Ti, the Lennard-Jones and Morse potentials provide accurate energetic and structural properties of the studied structures; they also describe the interaction in the graphene/metal system in a similar way, in contrast to the bond-order potential. For graphene/Al, the Tersoff and Morse potentials describe the interaction better than Lennard-Jones. For the simulation of graphene/metal systems, the optimal choice of potential for each structure is of crucial importance. The presented analysis of the interatomic potentials appears promising for realistic and accurate simulations of graphene reinforced metal composites.
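
For readers unfamiliar with the two pair potentials compared above, the sketch below writes out their standard textbook forms. The function and parameter names are hypothetical, and no parameter values from the paper are reproduced; any numbers passed in are placeholders.

```python
import numpy as np

def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones pair energy.

    epsilon is the well depth and sigma the distance at which the
    energy crosses zero; the minimum lies at r = 2**(1/6) * sigma.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6**2 - sr6)

def morse(r, depth, alpha, r0):
    """Morse pair energy, shifted so that it tends to zero at large r.

    depth is the well depth, alpha controls the well width, and r0 is
    the equilibrium distance, where the energy equals -depth.
    """
    return depth * ((1.0 - np.exp(-alpha * (r - r0))) ** 2 - 1.0)
```

Both functions accept scalars or NumPy arrays of distances. The Morse form has one more free parameter than Lennard-Jones (the width `alpha`), which is one general reason the two can fit the same metal/carbon interaction differently.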

Citations: 0
High order compact augmented methods for Stokes equations with different boundary conditions
IF 6.3 Zone 2 (Physics and Astrophysics) Q1 Physics and Astronomy Pub Date: 2024-05-08 DOI: 10.1016/j.cpc.2024.109233
Kejia Pan, Jin Li, Zhilin Li

This paper is devoted to fourth-order compact schemes and fast algorithms for numerically solving stationary Stokes equations with different boundary conditions. One of the main ideas is to decouple the Stokes equations into three Poisson equations for the pressure and the velocity via the pressure Poisson equation (PPE). The augmented strategy is used to provide numerical boundary conditions for the pressure. Different velocity boundary conditions require different interpolation strategies for the augmented methods. The augmented variable is solved by the GMRES method. A new simple and efficient preconditioning strategy has also been developed to accelerate the convergence of the GMRES iteration. Numerical examples presented in this paper confirm the designed convergence order and the efficiency of the new methods.
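
The decoupling step can be written out as follows. This is a sketch under stated assumptions rather than the paper's own formulation: a two-dimensional domain (which is what yields three Poisson equations), a constant viscosity \mu, velocity components (u, v), and forcing (f_1, f_2).

```latex
\begin{aligned}
-\mu \Delta \mathbf{u} + \nabla p &= \mathbf{f}, \qquad \nabla \cdot \mathbf{u} = 0, \\
\Delta p &= \nabla \cdot \mathbf{f}
  \quad \text{(divergence of the momentum equation, using }
  \nabla \cdot \Delta \mathbf{u} = \Delta (\nabla \cdot \mathbf{u}) = 0 \text{)}, \\
\mu \Delta u &= \partial_x p - f_1, \qquad
\mu \Delta v = \partial_y p - f_2 .
\end{aligned}
```

The Stokes problem supplies boundary conditions for the velocity but none for the pressure, so the Poisson equation for p cannot be solved on its own. That missing pressure boundary data is what the augmented variable stands in for.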

Citations: 0
CUDA-based focused Gaussian beams second-harmonic generation efficiency calculator
IF 6.3 Zone 2 (Physics and Astrophysics) Q1 Physics and Astronomy Pub Date: 2024-05-06 DOI: 10.1016/j.cpc.2024.109232
A.D. Sanchez, S. Chaitanya Kumar, M. Ebrahim-Zadeh

We present an object-oriented programming (OOP) CUDA-based package for fast and accurate simulation of second-harmonic generation (SHG) efficiency using focused Gaussian beams. The model includes linear as well as two-photon absorption, which can ultimately lead to thermal lensing due to self-heating effects. Our approach speeds up calculations by nearly 40x without temperature profiles and 11x with them, relative to an equivalent CPU implementation. The package offers a valuable tool for experimental design and for the study of 3D field propagation in nonlinear three-wave interactions. It is useful for optimizing SHG-based experiments and mitigating undesired thermal effects, enabling improved oven designs and advanced device architectures that lead to stable, efficient high-power SHG.

Program summary

Program Title: cuSHG

CPC Library link to program files: https://doi.org/10.17632/hn76s7x848.1

Developer's repository link: https://github.com/alfredos84/cuSHG

Licensing provisions: MIT

Programming language: […], CUDA

Nature of problem: This work addresses the degradation of second-harmonic generation (SHG) performance caused by thermal effects in a nonlinear crystal pumped with focused Gaussian beams. With the nonlinear crystal placed in a temperature-controlled oven, the package computes the electric fields involved along the medium. The implemented model includes linear and nonlinear absorption, which can lead to self-heating and degrade the performance of the SHG.

Solution method: The coupled differential equations for three-wave interactions, which describe the field evolution along the crystal, are solved using the well-known Split-Step Fourier method. The temperature profiles are estimated using the finite-elements method. The field evolution and thermal effects are embedded in a self-consistent algorithm that sequentially and separately solves the electromagnetic and thermal problems until the system reaches the steady state. Because some problems can be computationally demanding, we chose to implement the coupled equations in the […]/CUDA programming language. This allows us to significantly speed up simulations, thanks to the computing power provided by a graphics processing unit (GPU) card. The output files obtained are the interacting electric fields and the temperature profile, which have to be analyzed during post-processing.
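
The Split-Step Fourier idea named in the solution method can be sketched in a few lines of NumPy. This is not the cuSHG implementation: it uses one transverse dimension, normalized units, no absorption or thermal effects, and an assumed sign convention for the coupled equations. The function name `shg_split_step` and all parameters are hypothetical.

```python
import numpy as np

def shg_split_step(a1, a2, x, length, steps, k_fund, k_sh, kappa, delta_k=0.0):
    """Symmetric split-step propagation of fundamental (a1) and
    second-harmonic (a2) envelopes along z.

    Assumed model (paraxial, lossless):
      da1/dz = i/(2 k_fund) d2a1/dx2 + i kappa a2 conj(a1) exp(-i delta_k z)
      da2/dz = i/(2 k_sh)   d2a2/dx2 + i kappa a1**2       exp(+i delta_k z)
    """
    dz = length / steps
    kx = 2.0 * np.pi * np.fft.fftfreq(x.size, d=x[1] - x[0])
    # Diffraction is diagonal in Fourier space; these are half-step propagators.
    half1 = np.exp(-1j * kx**2 * dz / (4.0 * k_fund))
    half2 = np.exp(-1j * kx**2 * dz / (4.0 * k_sh))

    def nonlinear_rhs(z, b1, b2):
        phase = np.exp(1j * delta_k * z)
        return 1j * kappa * b2 * np.conj(b1) / phase, 1j * kappa * b1**2 * phase

    z = 0.0
    for _ in range(steps):
        # Half step of diffraction.
        a1 = np.fft.ifft(half1 * np.fft.fft(a1))
        a2 = np.fft.ifft(half2 * np.fft.fft(a2))
        # Full step of the nonlinear coupling, integrated with classical RK4.
        p1, q1 = nonlinear_rhs(z, a1, a2)
        p2, q2 = nonlinear_rhs(z + dz / 2, a1 + dz / 2 * p1, a2 + dz / 2 * q1)
        p3, q3 = nonlinear_rhs(z + dz / 2, a1 + dz / 2 * p2, a2 + dz / 2 * q2)
        p4, q4 = nonlinear_rhs(z + dz, a1 + dz * p3, a2 + dz * q3)
        a1 = a1 + dz / 6 * (p1 + 2 * p2 + 2 * p3 + p4)
        a2 = a2 + dz / 6 * (q1 + 2 * q2 + 2 * q3 + q4)
        # Second half step of diffraction.
        a1 = np.fft.ifft(half1 * np.fft.fft(a1))
        a2 = np.fft.ifft(half2 * np.fft.fft(a2))
        z += dz
    return a1, a2
```

A typical call would start from a Gaussian fundamental, for example `a1 = np.exp(-x**2 / 4).astype(complex)` on a uniform grid `x`, with `a2` initialized to zeros. With the normalization assumed here, the sum of `|a1|**2 + |a2|**2` over the grid should stay constant, which gives a quick sanity check on the step size. Each step consists of FFTs and element-wise array operations, the kind of workload that generally maps well onto a GPU.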

Citations: 0