Abhinav Singh, Landfried Kraatz, Pietro Incardona, Ivo F. Sbalzarini
We present a distributed algebra system for efficient and compact implementation of numerical time integration schemes on parallel computers and graphics processing units (GPUs). The software implementation combines the time integration library Odeint from Boost with the OpenFPM framework for scalable scientific computing. Implementing multi-stage, multi-step, or adaptive time integration methods in distributed-memory parallel codes or on GPUs is challenging. The present algebra system addresses this by making the time integration methods from Odeint available in a concise template-expression language for numerical simulations distributed and parallelized using OpenFPM. This allows using state-of-the-art time integration schemes, or switching between schemes, by changing one line of code, while maintaining parallel scalability. This enables scalable time integration with compact code and facilitates rapid rewriting and deployment of simulation algorithms. We benchmark the present software for exponential and sigmoidal dynamics and present an application example to the 3D Gray-Scott reaction-diffusion problem on both CPUs and GPUs in only 60 lines of code.
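The one-line scheme switch described above can be illustrated with a minimal, self-contained sketch. This is plain Python with hypothetical stepper names, not the actual Odeint/OpenFPM template-expression API, and it integrates the paper's exponential test dynamics dx/dt = -x:

```python
import math

def euler_step(f, x, t, dt):
    # explicit Euler: first-order accurate
    return x + dt * f(t, x)

def rk4_step(f, x, t, dt):
    # classical Runge-Kutta 4: fourth-order accurate
    k1 = f(t, x)
    k2 = f(t + dt / 2, x + dt / 2 * k1)
    k3 = f(t + dt / 2, x + dt / 2 * k2)
    k4 = f(t + dt, x + dt * k3)
    return x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(stepper, f, x0, t0, t1, dt):
    # fixed-step driver; counting steps avoids floating-point drift in t
    n = round((t1 - t0) / dt)
    x = x0
    for i in range(n):
        x = stepper(f, x, t0 + i * dt, dt)
    return x

decay = lambda t, x: -x  # exponential dynamics dx/dt = -x
exact = math.exp(-1.0)

# Switching the scheme means changing one name on one line:
x_euler = integrate(euler_step, decay, 1.0, 0.0, 1.0, 1e-3)
x_rk4 = integrate(rk4_step, decay, 1.0, 0.0, 1.0, 1e-3)
```

In the actual library the state is a distributed OpenFPM data structure and the stepper is an Odeint type, but the separation of stepper from driver is the same idea.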
"A Distributed Algebra System for Time Integration on Parallel Computers" (arXiv:2309.05331, arXiv - CS - Mathematical Software, published 2023-09-11)
Dealing with time series with missing values, including those afflicted by low quality or over-saturation, presents a significant signal processing challenge. The task of recovering these missing values, known as imputation, has led to the development of several algorithms. However, we have observed that the efficacy of these algorithms tends to diminish when the time series exhibit non-stationary oscillatory behavior. In this paper, we introduce a novel algorithm, coined Harmonic Level Interpolation (HaLI), which enhances the performance of existing imputation algorithms for oscillatory time series. After running any chosen imputation algorithm, HaLI leverages the harmonic decomposition based on the adaptive nonharmonic model of the initial imputation to improve the imputation accuracy for oscillatory time series. Experimental assessments conducted on synthetic and real signals consistently highlight that HaLI enhances the performance of existing imputation algorithms. The algorithm is made publicly available as readily employable MATLAB code for other researchers to use.
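The two-stage idea, run any baseline imputation first, then refit the gap with a harmonic model estimated from that initial imputation, can be sketched on a toy signal. This is not the actual HaLI algorithm (which uses an adaptive nonharmonic model and a full harmonic decomposition); it fits a single dominant harmonic:

```python
import numpy as np

n = 200
t = np.arange(n) / n
x_true = np.sin(2 * np.pi * 5 * t)           # oscillatory signal, 5 cycles
missing = np.arange(80, 120)                  # a contiguous gap of samples
known = np.setdiff1d(np.arange(n), missing)

# Stage 1: any baseline imputation; here, linear interpolation.
x = x_true.copy()
x[missing] = np.interp(t[missing], t[known], x_true[known])

# Stage 2: estimate the dominant frequency from the initial imputation,
# then refit cos/sin coefficients on the *known* samples only.
f0 = np.fft.rfftfreq(n, d=1 / n)[np.argmax(np.abs(np.fft.rfft(x)))]
G = np.column_stack([np.cos(2 * np.pi * f0 * t[known]),
                     np.sin(2 * np.pi * f0 * t[known])])
coef, *_ = np.linalg.lstsq(G, x[known], rcond=None)
x_refined = x.copy()
x_refined[missing] = (coef[0] * np.cos(2 * np.pi * f0 * t[missing])
                      + coef[1] * np.sin(2 * np.pi * f0 * t[missing]))
```

On an oscillatory gap spanning a full period, the linear baseline misses the oscillation entirely, while the harmonic refit recovers it, which is the failure mode of generic imputation that the paper targets.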
"Enhancing Missing Data Imputation of Non-stationary Signals with Harmonic Decomposition" — Joaquin Ruiz, Hau-tieng Wu, Marcelo A. Colominas (arXiv:2309.04630, published 2023-09-08)
Edward Caunt, Rhodri Nelson, Fabio Luporini, Gerard Gorman
Irregular terrain has a pronounced effect on the propagation of seismic and acoustic wavefields but is not straightforwardly reconciled with structured finite-difference (FD) methods used to model such phenomena. Methods currently detailed in the literature are generally limited in scope application-wise or non-trivial to apply to real-world geometries. With this in mind, a general immersed boundary treatment capable of imposing a range of boundary conditions in a relatively equation-agnostic manner has been developed, alongside a framework implementing this approach, intending to complement emerging code-generation paradigms. The approach is distinguished by the use of N-dimensional Taylor-series extrapolants constrained by boundary conditions imposed at some suitably distributed set of surface points. The extrapolation process is encapsulated in modified derivative stencils applied in the vicinity of the boundary, utilizing hyperspherical support regions. This method ensures boundary representation is consistent with the FD discretization: both must be considered in tandem. Furthermore, high-dimensional and vector boundary conditions can be applied without approximation prior to discretization. A consistent methodology can thus be applied across free and rigid surfaces with the first and second-order acoustic wave equation formulations. Application to both equations is demonstrated, and numerical examples based on analytic and real-world topography implementing free and rigid surfaces in 2D and 3D are presented.
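The core mechanism, a Taylor-series extrapolant constrained by a boundary condition at an off-grid surface point, supplying values at exterior (ghost) nodes for the FD stencil, can be shown in a hedged 1-D sketch (the paper's method is N-dimensional with hyperspherical supports; this toy uses a quadratic polynomial for u(x) = sin(x)):

```python
import numpy as np

h = 0.1
x_b = 0.037                      # boundary lies between grid points (off-grid)
xs = np.array([x_b, 0.1, 0.2])   # boundary point + two interior grid nodes
us = np.sin(xs)                  # Dirichlet value at x_b + interior field values

# Quadratic extrapolant constrained by all three conditions (exact fit,
# since 3 constraints determine a degree-2 polynomial):
p = np.polyfit(xs, us, 2)

# Evaluate at the ghost node x = 0, which lies outside the domain x > x_b;
# an FD stencil near the boundary would read this value instead of real data.
ghost = np.polyval(p, 0.0)
```

The ghost value agrees with the analytic field to the order of the extrapolant, which is what makes the boundary representation consistent with the FD discretization.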
"A Novel Immersed Boundary Approach for Irregular Topography with Acoustic Wave Equations" (arXiv:2309.03600, published 2023-09-07)
Antony Della Vecchia, Michael Joswig, Benjamin Lorenz
We describe a generic JSON-based file format which is suitable for computations in computer algebra. This format is implemented in the computer algebra system OSCAR, but we also indicate how it can be used in a different context.
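A type-tagged JSON encoding of exact algebraic data can be sketched as follows. The schema here (the `_type` key, the `"QQ"` tag for rationals, integers stored as strings so arbitrary-precision values survive the round trip) is a hypothetical illustration, not the actual OSCAR file format:

```python
import json
from fractions import Fraction

def encode(obj):
    # Tag exact rationals with a type name; store big integers as strings
    # so no precision is lost to JSON's native number type.
    if isinstance(obj, Fraction):
        return {"_type": "QQ", "data": [str(obj.numerator), str(obj.denominator)]}
    raise TypeError(f"cannot encode {obj!r}")

def decode(d):
    # object_hook runs on every JSON object; reconstruct tagged types.
    if d.get("_type") == "QQ":
        num, den = d["data"]
        return Fraction(int(num), int(den))
    return d

x = Fraction(22, 7)
s = json.dumps(x, default=encode)          # serialize to plain JSON text
y = json.loads(s, object_hook=decode)      # lossless round trip
```

Keeping the payload as plain JSON is what makes such a format FAIR-friendly: any system with a JSON parser can at least read the structure, even without knowing the type tags.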
"A FAIR File Format for Mathematical Software" (arXiv:2309.00465, published 2023-09-01)
João V. C. Mazzochin, Gustavo Tiecker, Erick O. Rodrigues
Counting objects in images is a pattern recognition problem that focuses on identifying an element to determine its incidence, and it is approached in the literature as Visual Object Counting (VOC). In this work, we propose a methodology to count wood logs. First, wood logs are segmented from the image background. This segmentation step is performed using the Pix2Pix framework, which implements Conditional Generative Adversarial Networks (CGANs). Second, the resulting clusters are counted using connected components. The average segmentation accuracy exceeds 89%, while the average proportion of wood logs identified relative to the total count is over 97%.
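The counting stage, connected-component labeling of the binary segmentation mask, can be sketched with a plain flood fill (a toy stand-in for the pipeline's second step; the mask below is a made-up miniature, not real segmentation output):

```python
from collections import deque

def count_components(mask):
    """Count 4-connected components of 1-pixels in a binary mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                count += 1                      # new component: flood-fill it
                q = deque([(r, c)])
                seen[r][c] = True
                while q:
                    i, j = q.popleft()
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < rows and 0 <= nj < cols \
                                and mask[ni][nj] and not seen[ni][nj]:
                            seen[ni][nj] = True
                            q.append((ni, nj))
    return count

# Three separate "log cross-sections" in a toy segmentation mask:
mask = [[1, 1, 0, 0],
        [0, 0, 0, 1],
        [1, 0, 0, 1]]
```

Each connected cluster of foreground pixels is counted once, so the log count is simply the number of components in the segmented mask.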
"Segmentação e contagem de troncos de madeira utilizando deep learning e processamento de imagens" [Segmentation and counting of wood logs using deep learning and image processing] (arXiv:2309.00123, published 2023-08-31)
Partial differential equations (PDEs) are used to describe a variety of physical phenomena. Often these equations do not have analytical solutions, and numerical approximations are used instead. One of the common methods for solving PDEs is the finite element method. Computing derivative information of the solution with respect to the input parameters is important in many tasks in scientific computing. We extend the JAX automatic differentiation library with an interface to the Firedrake finite element library. The high-level symbolic representation of PDEs allows bypassing differentiation through the possibly many low-level iterations of the underlying nonlinear solvers. Differentiating through Firedrake solvers is done using tangent-linear and adjoint equations. This enables the efficient composition of finite element solvers with arbitrary differentiable programs. The code is available at github.com/IvanYashchuk/jax-firedrake.
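The key trick, attaching an adjoint rule to a solver so JAX never differentiates through its iterations, can be shown on a scalar residual. This is a hedged sketch with `jax.custom_vjp`, not the actual jax-firedrake interface: the residual F(u, θ) = u³ + u - θ = 0 defines u(θ) implicitly, and the backward pass applies the implicit function theorem instead of unrolling Newton:

```python
import jax

def residual(u, theta):
    return u**3 + u - theta

@jax.custom_vjp
def solve(theta):
    # Newton's method; these iterations are NOT differentiated through.
    u = theta
    for _ in range(50):
        u = u - residual(u, theta) / (3 * u**2 + 1)
    return u

def solve_fwd(theta):
    u = solve(theta)
    return u, (u, theta)

def solve_bwd(res, g):
    u, theta = res
    # Implicit function theorem: dF/du * du/dtheta = -dF/dtheta,
    # so du/dtheta = 1 / (3u^2 + 1) -- one linear (adjoint) solve,
    # regardless of how many Newton iterations the forward pass took.
    return (g / (3 * u**2 + 1),)

solve.defvjp(solve_fwd, solve_bwd)

g = jax.grad(solve)(2.0)   # u(2) = 1, so du/dtheta = 1/4
```

For a PDE, `residual` becomes the discretized weak form and the backward rule becomes the adjoint equation, but the structure of the composition is identical.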
"Bringing PDEs to JAX with forward and reverse modes automatic differentiation" — Ivan Yashchuk (arXiv:2309.07137, published 2023-08-31)
Pierre-David Letourneau, Dalton Jones, Matthew Morse, M. Harper Langston
We present a novel, efficient theoretical and numerical framework for solving global non-convex polynomial optimization problems. We analytically demonstrate that such problems can be efficiently reformulated using a non-linear objective over a convex set; further, these reformulated problems possess no spurious local minima (i.e., every local minimum is a global minimum). We introduce an algorithm for solving these resulting problems using the augmented Lagrangian and the method of Burer and Monteiro. We show through numerical experiments that polynomial scaling in dimension and degree is achievable for computing the optimal value and location of previously intractable global polynomial optimization problems in high dimension.
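The augmented Lagrangian component of such a scheme can be illustrated on a deliberately tiny convex toy problem (this sketch shows only the multiplier iteration, not the paper's reformulation or the Burer-Monteiro factorization): minimize (x-2)² + (y-1)² subject to x + y = 2, whose solution is (1.5, 0.5) with multiplier λ = 1.

```python
import numpy as np

rho, lam = 10.0, 0.0
x = y = 0.0
for _ in range(20):
    # The inner augmented-Lagrangian subproblem is quadratic here, so we
    # minimize it exactly via its stationarity (normal) equations:
    #   d/dx: (2+rho)x + rho*y = 4 - lam + 2*rho
    #   d/dy: rho*x + (2+rho)y = 2 - lam + 2*rho
    A = np.array([[2 + rho, rho], [rho, 2 + rho]])
    b = np.array([4 - lam + 2 * rho, 2 - lam + 2 * rho])
    x, y = np.linalg.solve(A, b)
    # Multiplier update proportional to the constraint violation:
    lam += rho * (x + y - 2.0)
```

In the paper's setting the inner minimization is non-trivial and handled via the Burer-Monteiro low-rank parameterization, but the outer multiplier loop has this same shape.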
"An Efficient Framework for Global Non-Convex Polynomial Optimization over the Hypercube" — Pierre-David Letourneau, Dalton Jones, Matthew Morse, M. Harper Langston (arXiv:2308.16731, published 2023-08-31)
Mai Peng, Zeneng She, Delaram Yazdani, Danial Yazdani, Wenjian Luo, Changhe Li, Juergen Branke, Trung Thanh Nguyen, Amir H. Gandomi, Yaochu Jin, Xin Yao
Many real-world optimization problems possess dynamic characteristics. Evolutionary dynamic optimization algorithms (EDOAs) aim to tackle the challenges associated with dynamic optimization problems. In the existing literature, the results reported for a given EDOA can sometimes differ considerably. This issue occurs because the source codes of many EDOAs, which are usually very complex algorithms, have not been made publicly available. Indeed, the complexity of the components and mechanisms used in many EDOAs makes their re-implementation error-prone. In this paper, to assist researchers in performing experiments and comparing their algorithms against several EDOAs, we develop an open-source MATLAB platform for EDOAs, called the Evolutionary Dynamic Optimization LABoratory (EDOLAB). The platform also contains an education module that can be used for educational purposes. In the education module, the user can observe a) a 2-dimensional problem space and how its morphology changes after each environmental change, b) the behaviors of individuals over time, and c) how the EDOA reacts to environmental changes and tries to track the moving optimum. In addition to being useful for research and education purposes, EDOLAB can also be used by practitioners to solve their real-world problems. The current version of EDOLAB includes 25 EDOAs and three fully-parametric benchmark generators. The MATLAB source code for EDOLAB is publicly available and can be accessed from https://github.com/EDOLAB-platform/EDOLAB-MATLAB.
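The core loop of a dynamic benchmark, an optimum that shifts at each environmental change, and an optimizer that must track it, can be sketched minimally. This hypothetical 1-D "moving peak" is far simpler than EDOLAB's fully-parametric generators (which are multi-peak, multi-dimensional, and stochastic), and the tracker is a plain local grid search rather than an EDOA:

```python
class MovingPeak:
    """A single 1-D peak whose optimum shifts at each environmental change."""
    def __init__(self, shifts):
        self.opt = 0.0
        self.shifts = shifts
    def evaluate(self, x):
        return -(x - self.opt) ** 2        # peak of height 0 at self.opt
    def change(self, k):
        self.opt += self.shifts[k]         # environmental change k

peak = MovingPeak([0.7, -1.2, 0.4, 0.9])
best, errors = 0.0, []
for env in range(4):
    # React to the change: re-optimize in a window around the previous best,
    # i.e. exploit the old solution instead of restarting from scratch.
    cands = [best + (i - 150) * 0.01 for i in range(301)]
    best = max(cands, key=peak.evaluate)
    errors.append(abs(best - peak.opt))    # tracking error in this environment
    peak.change(env)
```

Reusing the previous best as the search center is the basic "tracking the moving optimum" behavior that the education module visualizes for full EDOAs.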
"Evolutionary Dynamic Optimization Laboratory: A MATLAB Optimization Platform for Education and Experimentation in Dynamic Environments" (arXiv:2308.12644, published 2023-08-24)
We present a technical enhancement within the p4est software for parallel adaptive mesh refinement. In p4est, primitives are stored as octants in three dimensions and quadrants in two. While, classically, they are encoded natively by their spatial position and refinement level, any other mathematically equivalent encoding might be used instead. Recognizing this, we add two alternative representations to the classical, explicit version, based on a long monotonic index and 128-bit AVX quad integers, respectively. The first requires changes in logic for low-level quadrant-manipulating algorithms, while the other exploits data-level parallelism and requires algorithms to be adapted to SIMD instructions. The resulting algorithms and data structures lead to higher performance and lower memory usage in comparison with the standard baseline. We benchmark selected algorithms on a cluster with two Intel(R) Xeon(R) Gold 6130 Skylake family CPUs per node, which provide support for AVX2 extensions, 192 GB RAM per node, and up to 512 computational cores in total.
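The "long monotonic index" idea rests on the Morton (z-order) encoding: interleaving the bits of a quadrant's coordinates yields a single integer whose ordering follows the space-filling curve. A hedged sketch in Python (p4est itself is C and also folds in the refinement level; this shows only the 2-D coordinate interleave with a fixed 16-bit range per axis):

```python
def interleave2(x, y):
    """Morton (z-order) index from 2-D integer coordinates (16 bits each)."""
    code = 0
    for bit in range(16):
        code |= ((x >> bit) & 1) << (2 * bit)        # x bits -> even positions
        code |= ((y >> bit) & 1) << (2 * bit + 1)    # y bits -> odd positions
    return code

def deinterleave2(code):
    """Recover (x, y) from a Morton index."""
    x = y = 0
    for bit in range(16):
        x |= ((code >> (2 * bit)) & 1) << bit
        y |= ((code >> (2 * bit + 1)) & 1) << bit
    return x, y
```

Because the two representations are mathematically equivalent, algorithms can compare or sort quadrants directly on the single index, which is what the alternative encodings in the paper exploit.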
"Alternative quadrant representations with Morton index and AVX2 vectorization for AMR algorithms within the p4est software library" — Mikhail Kirilin, Carsten Burstedde (INS, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany) (arXiv:2308.13615, published 2023-08-24)
With lowrank approximation, the storage requirements for dense data are reduced to linear complexity, and with the addition of hierarchy this also works for data without global lowrank properties. However, the lowrank factors themselves are often still stored using double-precision numbers. Newer approaches exploit the different IEEE 754 floating-point formats available nowadays in a mixed-precision approach. However, these formats show significant gaps in storage (and accuracy), e.g., between half, single, and double precision. We therefore look beyond these standard formats, use adaptive compression for storing the lowrank and dense data, and investigate how that affects the arithmetic of such matrices.
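The mixed-precision starting point, truncate to a numerical rank for a given tolerance, then store the factors in the cheapest standard format whose roundoff stays below that tolerance, can be sketched as follows. This is a toy stand-in using only the standard IEEE formats (the paper goes beyond them with adaptive binary compression), and the precision thresholds chosen here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# A numerically low-rank matrix: rapidly decaying singular values.
U0, _ = np.linalg.qr(rng.standard_normal((100, 20)))
V0, _ = np.linalg.qr(rng.standard_normal((100, 20)))
A = (U0 * 10.0 ** -np.arange(20)) @ V0.T

def compress(A, tol):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = int(np.sum(s > tol * s[0]))          # truncation rank for tolerance
    U, s, Vt = U[:, :k], s[:k], Vt[:k]
    # Pick the cheapest standard format whose unit roundoff is below tol
    # (illustrative cutoffs; half ~1e-3, single ~1e-7):
    dtype = (np.float16 if tol > 1e-3 else
             np.float32 if tol > 1e-7 else np.float64)
    return (U * s).astype(dtype), Vt.astype(dtype)

B, Ct = compress(A, 1e-4)                    # rank-k factors in float32
err = (np.linalg.norm(B.astype(np.float64) @ Ct.astype(np.float64) - A)
       / np.linalg.norm(A))
```

The reconstruction error stays near the requested tolerance while the factors occupy half (or a quarter) of the double-precision storage, and the gaps between these few fixed formats are exactly what motivates the adaptive compression studied in the paper.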
"Hierarchical Lowrank Arithmetic with Binary Compression" — Ronald Kriemann (arXiv:2308.10960, published 2023-08-21)