Enabling FEM-based absolute permeability estimation in giga-voxel porous media with a single GPU

IF 7.3 · Zone 1 (Engineering Technology) · JCR Q1 (Engineering, Multidisciplinary) · Computer Methods in Applied Mechanics and Engineering · Pub Date: 2025-02-01 · Epub Date: 2024-11-29 · DOI: 10.1016/j.cma.2024.117559
Pedro Cortez Fetter Lopes, Federico Semeraro, André Maués Brabo Pereira, Ricardo Leiderman
Volume 434, Article 117559 (journal article). Full text: https://www.sciencedirect.com/science/article/pii/S0045782524008132

Abstract

The characterization of porous media via digital testing usually relies on intensive numerical computations that can be parallelized on GPUs. For absolute permeability estimation, Stokes flow simulations are carried out at the micro-structure scale to recover velocity fields that are used in upscaling with Darcy's law. Digital models of samples can be obtained via micro-computed tomography (μCT) scans. As μCT data is three-dimensional, meshes grow cubically with image dimensions, causing the numerical problem at hand to become compute- and memory-bound as either resolution improves or larger fields-of-view are considered. While the usual focus is on accelerating solvers, memory usage continues to be a significant limitation for analyses of representative volumes on relatively accessible hardware. In this work, we explore the possibility of implementing MINRES solvers on GPU that favor a reduction in memory allocation. These solvers are applied to matrix-free FEM-based permeability characterization of μCT images. Our goal is to enable the study of 1000³ voxel images on single-GPU machines. Implementations that only require five, three, or two n-sized vectors of variables are presented, with n being the number of unknowns. Further, we employ a mesh numbering strategy that enables node-by-node massively parallel operations within a non-monolithic voxel-based pore space without storing connectivity tables. The proposed solvers, available through the open-source chfem software, are verified against analytical models for simple three-dimensional micro-structures, and then validated against numerical Digital Petrophysics benchmarks. A consumer-grade graphics card with 12 GB of RAM is employed for the characterization of images with up to roughly 540 million voxels in a matter of tens of minutes. Stokes flow FEM-based simulations in meshes with 449 million degrees-of-freedom (DOFs) are carried out in 9 to 15 min, allocating less than 10 GB in global memory. Finally, simulations on three 1000³ carbon fiber domains, amounting to more than 3.7 billion DOFs, were run on a high-end GPU with 80 GB of RAM in under 2.5 h, achieving very close agreement with flow-tube permeability experiments.
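
To illustrate why the number of n-sized solver vectors matters at this scale, the following is a rough back-of-the-envelope sketch, not a reproduction of the authors' accounting. It assumes 8-byte (double-precision) values and counts only the solver vectors, ignoring the image data and any other allocations; the function name `solver_vector_memory_gb` and the byte width are illustrative assumptions, and the abstract does not state the precision used.

```python
def solver_vector_memory_gb(n_dofs: int, n_vectors: int, bytes_per_value: int = 8) -> float:
    """Decimal gigabytes needed to hold n_vectors arrays of length n_dofs.

    bytes_per_value=8 is an assumption (double precision); the source
    abstract does not say which floating-point width is used.
    """
    return n_dofs * n_vectors * bytes_per_value / 1e9


# DOF counts quoted in the abstract (the second is a lower bound there).
cases = {
    "449 million DOFs": 449_000_000,
    "3.7 billion DOFs": 3_700_000_000,
}

# Vector counts of the three solver variants mentioned in the abstract.
for label, n_dofs in cases.items():
    for n_vectors in (5, 3, 2):
        gb = solver_vector_memory_gb(n_dofs, n_vectors)
        print(f"{label}, {n_vectors} vectors: {gb:.1f} GB")
```

Under these assumptions, two vectors take about 7.2 GB for 449 million DOFs and about 59.2 GB for 3.7 billion DOFs, whereas five vectors would take roughly 18 GB and 148 GB respectively. This gives a sense of scale for the 12 GB and 80 GB cards mentioned above, though the actual allocations depend on details not given in the abstract.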
Source journal
CiteScore: 12.70
Self-citation rate: 15.30%
Articles published: 719
Review time: 44 days
About the journal: Computer Methods in Applied Mechanics and Engineering stands as a cornerstone in the realm of computational science and engineering. With a history spanning over five decades, the journal has been a key platform for disseminating papers on advanced mathematical modeling and numerical solutions. Interdisciplinary in nature, these contributions encompass mechanics, mathematics, computer science, and various scientific disciplines. The journal welcomes a broad range of computational methods addressing the simulation, analysis, and design of complex physical problems, making it a vital resource for researchers in the field.