
arXiv - CS - Mathematical Software: Latest Papers

A Framework for Self-Intersecting Surfaces (SOS): Symmetric Optimisation for Stability
Pub Date : 2023-12-04 DOI: arxiv-2312.02113
Christian Amend, Tom Goertzen
In this paper, we give a stable and efficient method for fixing self-intersections and non-manifold parts in a given embedded simplicial complex. In addition, we show how symmetric properties can be used for further optimisation. We prove an initialisation criterion for computation of the outer hull of an embedded simplicial complex. To regularise the outer hull of the retriangulated surface, we present a method to remedy non-manifold edges and points. We also give a modification of the outer hull algorithm to determine chambers of complexes which gives rise to many new insights. All of these methods have applications in many areas, for example in 3D-printing, artistic realisations of 3D models or fixing errors introduced by scanning equipment applied for tomography. Implementations of the proposed algorithms are given in the computer algebra system GAP4. For verification of our methods, we use a data-set of highly self-intersecting symmetric icosahedra.
Scale Ratio Tuning of Group Based Job Scheduling in HPC Systems
Pub Date : 2023-11-29 DOI: arxiv-2311.17889
Lyakhovets D. S., Baranov A. V., Telegin P. N
During the initialization of a supercomputer job, no useful calculations are performed. A high proportion of initialization time results in idle computing resources and lower computational efficiency. Certain methods and algorithms that combine jobs into groups are used to optimize the scheduling of jobs with a high initialization proportion. The article considers the influence of the scale ratio setting in the job group formation algorithm on the performance metrics of the workload manager. The study was carried out on an Alea-based workload manager model developed by the authors. The model makes it possible to conduct a large number of experiments in reasonable time without losing simulation accuracy. We performed a series of experiments involving various characteristics of the workload. The article presents the results of a study of the influence of the scale ratio on efficiency metrics for different initialization time proportions and input workflows with varying intensity and homogeneity. The presented results allow workload manager administrators to set a scale ratio that provides an appropriate balance between conflicting efficiency metrics.
Lineax: unified linear solves and linear least-squares in JAX and Equinox
Pub Date : 2023-11-28 DOI: arxiv-2311.17283
Jason Rader, Terry Lyons, Patrick Kidger
We introduce Lineax, a library bringing linear solves and linear least-squares to the JAX+Equinox scientific computing ecosystem. Lineax uses general linear operators, and unifies linear solves and least-squares into a single, autodifferentiable API. Solvers and operators are user-extensible, without requiring the user to implement any custom derivative rules to get differentiability. Lineax is available at https://github.com/google/lineax.
Mathematical Modelling and a Numerical Solution for High Precision Satellite Ephemeris Determination
Pub Date : 2023-11-25 DOI: arxiv-2311.15028
Aravind Gundakaram, Abhirath Sangala, Aditya Sai Ellendula, Prachi Kansal, Lanii Lakshitaa, Suchir Reddy Punuru, Nethra Naveen, Sanjitha Jaggumantri
In this paper, we develop a high-precision satellite orbit determination model for satellites orbiting the Earth. Solving this model entails numerically integrating the differential equation of motion governing a two-body system, employing Fehlberg's formulation and the Runge-Kutta class of embedded integrators with adaptive stepsize control. Relevant primary perturbing forces included in this mathematical model are the full force gravitational field model, Earth's atmospheric drag, third body gravitational effects and solar radiation pressure. Development of the high-precision model required accounting for the perturbing influences of Earth radiation pressure, Earth tides and relativistic effects. The model is then implemented to obtain a high-fidelity Earth orbiting satellite propagator, namely the Satellite Ephemeris Determiner (SED), which is comparable to the popular High Precision Orbit Propagator (HPOP). The architecture of SED, the methodology employed, and the numerical results obtained are presented.
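As an illustration of the embedded Runge-Kutta idea mentioned in this abstract, the following is a minimal sketch of a Runge-Kutta-Fehlberg 4(5) step with adaptive step-size control, applied to an unperturbed planar two-body problem. It is not the SED implementation: the function names (`two_body`, `rkf45_step`, `integrate`), the normalised gravitational parameter `MU = 1.0`, the tolerance, and the step-size controller constants are assumptions chosen for the example, and none of the perturbing forces are modelled.

```python
import math

MU = 1.0  # assumed normalised gravitational parameter (illustrative units)


def two_body(t, s):
    """Planar two-body equations of motion; s = [x, y, vx, vy]."""
    x, y, vx, vy = s
    r3 = (x * x + y * y) ** 1.5
    return [vx, vy, -MU * x / r3, -MU * y / r3]


def _combine(s, h, terms):
    """Return s + h * sum(c * k) over the (c, k) pairs in terms."""
    return [si + h * sum(c * k[i] for c, k in terms) for i, si in enumerate(s)]


def rkf45_step(f, t, s, h):
    """One Fehlberg 4(5) step: returns (5th-order state, error estimate)."""
    k1 = f(t, s)
    k2 = f(t + h / 4, _combine(s, h, [(1 / 4, k1)]))
    k3 = f(t + 3 * h / 8, _combine(s, h, [(3 / 32, k1), (9 / 32, k2)]))
    k4 = f(t + 12 * h / 13, _combine(
        s, h, [(1932 / 2197, k1), (-7200 / 2197, k2), (7296 / 2197, k3)]))
    k5 = f(t + h, _combine(
        s, h, [(439 / 216, k1), (-8.0, k2), (3680 / 513, k3), (-845 / 4104, k4)]))
    k6 = f(t + h / 2, _combine(
        s, h, [(-8 / 27, k1), (2.0, k2), (-3544 / 2565, k3),
               (1859 / 4104, k4), (-11 / 40, k5)]))
    s4 = _combine(s, h, [(25 / 216, k1), (1408 / 2565, k3),
                         (2197 / 4104, k4), (-1 / 5, k5)])
    s5 = _combine(s, h, [(16 / 135, k1), (6656 / 12825, k3),
                         (28561 / 56430, k4), (-9 / 50, k5), (2 / 55, k6)])
    # The difference between the embedded 4th- and 5th-order results
    # serves as the local error estimate.
    err = max(abs(a - b) for a, b in zip(s4, s5))
    return s5, err


def integrate(f, t0, s0, t_end, h=0.1, tol=1e-10):
    """Integrate from t0 to t_end, adapting h to keep the estimate below tol."""
    t, s = t0, list(s0)
    while t < t_end:
        last = t + h >= t_end
        if last:
            h = t_end - t  # shorten the final step to land on t_end
        s_new, err = rkf45_step(f, t, s, h)
        if err <= tol:  # accept the step
            s = s_new
            t = t_end if last else t + h
        # Rescale h whether or not the step was accepted (assumed controller:
        # safety factor 0.9, change limited to the range [0.2, 5]).
        scale = 5.0 if err == 0.0 else min(5.0, max(0.2, 0.9 * (tol / err) ** 0.2))
        h *= scale
    return s


# One revolution of a circular orbit of radius 1 (period 2*pi when MU = 1).
final_state = integrate(two_body, 0.0, [1.0, 0.0, 0.0, 1.0], 2 * math.pi)
print(final_state)
```

After one period the printed state should be close to the initial state `[1, 0, 0, 1]`; how close depends on the assumed tolerance. A high-precision propagator would add the perturbing accelerations to the right-hand side function.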
Exact Combinatorial Optimization with Temporo-Attentional Graph Neural Networks
Pub Date : 2023-11-23 DOI: arxiv-2311.13843
Mehdi Seyfi, Amin Banitalebi-Dehkordi, Zirui Zhou, Yong Zhang
Combinatorial optimization finds an optimal solution within a discrete set of variables and constraints. The field has seen tremendous progress both in research and industry. With the success of deep learning in the past decade, a recent trend in combinatorial optimization has been to improve state-of-the-art combinatorial optimization solvers by replacing key heuristic components with machine learning (ML) models. In this paper, we investigate two essential aspects of machine learning algorithms for combinatorial optimization: temporal characteristics and attention. We argue that for the task of variable selection in the branch-and-bound (B&B) algorithm, incorporating the temporal information as well as the bipartite graph attention improves the solver's performance. We support our claims with intuitions and numerical results over several standard datasets used in the literature and competitions. Code is available at: https://developer.huaweicloud.com/develop/aigallery/notebook/detail?id=047c6cf2-8463-40d7-b92f-7b2ca998e935
An Assessment of PC-mer's Performance in Alignment-Free Phylogenetic Tree Construction
Pub Date : 2023-11-21 DOI: arxiv-2311.12898
Saeedeh Akbari Rokn Abadi, Melika Honarmand, Ali Hajialinaghi, Somayyeh Koohi
Background: Sequence comparison is essential in bioinformatics, serving various purposes such as taxonomy, functional inference, and drug discovery. The traditional method of aligning sequences for comparison is time-consuming, especially with large datasets. To overcome this, alignment-free methods have emerged as an alternative approach, prioritizing comparison scores over alignment itself. These methods directly compare sequences without the need for alignment. However, accurately representing the relationships between sequences is a significant challenge in the design of these tools. Methods: One of the alignment-free comparison approaches utilizes the frequency of fixed-length substrings, known as K-mers, which serves as the foundation for many sequence comparison methods. However, a challenge arises in these methods when increasing the length of the substring (K), as it leads to an exponential growth in the number of possible states. In this work, we explore the PC-mer method, which utilizes a more limited set of words whose size grows more slowly with K (2^K instead of 4^K). We conducted a comparison of sequences and evaluated how the reduced input vector size influenced the performance of the PC-mer method. Results: For the evaluation, we selected the Clustal Omega method as our reference approach, alongside three alignment-free methods: kmacs, FFP, and alfpy (word count). These methods also leverage the frequency of K-mers. We applied all five methods to 9 datasets for comprehensive analysis. The results were compared using phylogenetic trees and metrics such as Robinson-Foulds and normalized quartet distance (nQD). Conclusion: Our findings indicate that, unlike reducing the input features in other alignment-independent methods, the PC-mer method exhibits competitive performance when compared to the aforementioned methods, especially when input sequences are highly varied.
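To make the state-space growth mentioned in the Methods concrete, here is a minimal sketch of K-mer frequency counting over the four-letter DNA alphabet, together with a two-symbol recoding that shrinks the feature space from 4^K to 2^K. The purine/pyrimidine recoding is only an assumed stand-in used to illustrate the size reduction; it is not claimed to be the encoding PC-mer uses, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping length-k substrings of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def frequency_vector(seq, k, alphabet):
    """Normalised K-mer frequencies, one entry per possible word over alphabet."""
    counts = kmer_counts(seq, k)
    total = max(sum(counts.values()), 1)
    words = ("".join(p) for p in product(alphabet, repeat=k))
    return [counts[w] / total for w in words]


# Assumed illustrative recoding: purines (A, G) -> R, pyrimidines (C, T) -> Y.
TWO_CLASS = str.maketrans("AGCT", "RRYY")


def reduce_alphabet(seq):
    """Map a DNA string onto the two-symbol alphabet {R, Y}."""
    return seq.translate(TWO_CLASS)


seq = "ACGTACGGTC"
full = frequency_vector(seq, 3, "ACGT")                    # 4**3 entries
reduced = frequency_vector(reduce_alphabet(seq), 3, "RY")  # 2**3 entries
print(len(full), len(reduced))  # prints: 64 8
```

Frequency vectors of this kind can then be compared pairwise with any vector distance to build a distance matrix for tree construction; the shorter vector trades resolution for size.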
p-adaptive discontinuous Galerkin method for the shallow water equations on heterogeneous computing architectures
Pub Date : 2023-11-19 DOI: arxiv-2311.11348
Sara Faghih-Naini, Vadym Aizinger, Sebastian Kuckuk, Richard Angersbach, Harald Köstler
Heterogeneous computing and exploiting integrated CPU-GPU architectures have become a clear current trend since the flattening of Moore's Law. In this work, we propose a numerical and algorithmic re-design of a p-adaptive quadrature-free discontinuous Galerkin method (DG) for the shallow water equations (SWE). Our new approach separates the computations of the non-adaptive (lower-order) and adaptive (higher-order) parts of the discretization from each other. Thereby, we can overlap computations of the lower-order and the higher-order DG solution components. Furthermore, we investigate execution times of main computational kernels and use automatic code generation to optimize their distribution between the CPU and GPU. Several setups, including a prototype of a tsunami simulation in a tide-driven flow scenario, are investigated, and the results show that significant performance improvements can be achieved in suitable setups.
Deriving Algorithms for Triangular Tridiagonalization a (Skew-)Symmetric Matrix
Pub Date : 2023-11-17 DOI: arxiv-2311.10700
Robert van de Geijn, Maggie Myers, RuQing G. Xu, Devin Matthews
We apply the FLAME methodology to derive algorithms hand in hand with their proofs of correctness for the computation of the $ L T L^T $ decomposition (with and without pivoting) of a skew-symmetric matrix. The approach yields known as well as new algorithms, presented using the FLAME notation. A number of BLAS-like primitives are exposed at the core of blocked algorithms that can attain high performance. The insights can be easily extended to yield algorithms for computing the $ L T L^T $ decomposition of a symmetric matrix.
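For readers unfamiliar with the factorization, the following is a minimal unblocked, unpivoted sketch that reduces a skew-symmetric matrix $ X $ to $ X = L T L^T $, with $ L $ unit lower triangular and $ T $ tridiagonal, by applying Gauss transforms symmetrically from both sides (in the style of the classical Parlett-Reid elimination). It is not one of the FLAME-derived algorithms from the paper: the function name `ltlt_unpivoted` is hypothetical, exact `Fraction` arithmetic is used only to keep the check simple, and the code assumes every subdiagonal pivot is nonzero.

```python
from fractions import Fraction


def ltlt_unpivoted(X):
    """Return (L, T) with X = L T L^T for a skew-symmetric matrix X.

    L is unit lower triangular, T is tridiagonal. No pivoting is done,
    so a zero subdiagonal pivot raises an error.
    """
    n = len(X)
    T = [[Fraction(v) for v in row] for row in X]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n - 2):
        pivot = T[k + 1][k]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot: pivoting would be required")
        for i in range(k + 2, n):
            m = T[i][k] / pivot      # multiplier that zeroes T[i][k]
            L[i][k + 1] = m          # multipliers fill column k+1 of L
            for j in range(n):       # row operation: row_i -= m * row_{k+1}
                T[i][j] -= m * T[k + 1][j]
            for j in range(n):       # matching column operation keeps skew-symmetry
                T[j][i] -= m * T[j][k + 1]
    return L, T


X = [[0, 1, 2, 3],
     [-1, 0, 4, 5],
     [-2, -4, 0, 6],
     [-3, -5, -6, 0]]
L, T = ltlt_unpivoted(X)
for row in T:
    print([str(v) for v in row])
```

Each outer iteration eliminates the entries below the first subdiagonal in one column; the paper's blocked variants reorganise this kind of computation so that most of the work is cast in terms of BLAS-like operations.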
DisCoPy: the Hierarchy of Graphical Languages in Python
Pub Date : 2023-11-17 DOI: arxiv-2311.10608
Alexis Toumi, Richie Yeung, Boldizsár Poór, Giovanni de Felice
DisCoPy is a Python toolkit for computing with monoidal categories. It comes with two flexible data structures for string diagrams: the first one for planar monoidal categories based on lists of layers, the second one for symmetric monoidal categories based on cospans of hypergraphs. Algorithms for functor application then allow string diagrams to be translated into code for numerical computation, be it differentiable, probabilistic or quantum. This report gives an overview of the library and the new developments released in its version 1.0. In particular, we showcase the implementation of diagram equality for a large fragment of the hierarchy of graphical languages for monoidal categories, as well as a new syntax for defining string diagrams as Python functions.
Fast multiplication by two's complement addition of numbers represented as a set of polynomial radix 2 indexes, stored as an integer list for massively parallel computation
Pub Date : 2023-11-16 DOI: arxiv-2311.09922
Mark Stocks
We demonstrate a multiplication method based on numbers represented as a set of polynomial radix 2 indices stored as an integer list. The 'polynomial integer index multiplication' method is a set of algorithms implemented in Python code. We demonstrate the method to be faster than both the Number Theoretic Transform (NTT) and Karatsuba for multiplication within a certain bit range. Both are also implemented in Python code for comparison with the polynomial radix 2 integer method. We demonstrate that it is possible to express any integer or real number as a list of integer indices, representing a finite series in base two. The finite series of integer index representation of a number can then be stored and distributed across multiple CPUs / GPUs. We show that operations of addition and multiplication can be applied as two's complement additions operating on the index integer representations and can be fully distributed across a given CPU / GPU architecture. We demonstrate fully distributed arithmetic operations such that the 'polynomial integer index multiplication' method overcomes the current limitation of parallel multiplication methods, i.e., the need to share common core memory and common disk for the calculation of results and intermediate results.
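The representation described in this abstract can be sketched as follows: a non-negative integer is stored as the list of exponents of its set bits, and a product is formed by adding every pair of exponents and summing the resulting powers of two. This is only an illustration of the representation under simplifying assumptions (non-negative integers, a plain Python `int` as the accumulator that resolves carries); it is not the author's algorithm, it makes no performance claim, and the function names are hypothetical.

```python
def to_indices(n):
    """Exponents of the set bits of a non-negative integer, in ascending order."""
    return [i for i in range(n.bit_length()) if (n >> i) & 1]


def from_indices(indices):
    """Rebuild the integer as the sum of 2**i over the stored exponents."""
    return sum(1 << i for i in indices)


def multiply_indices(a, b):
    """Multiply two index lists: each pair (i, j) contributes 2**(i + j)."""
    total = 0
    for i in a:
        for j in b:
            total += 1 << (i + j)  # exponent addition replaces multiplication
    return to_indices(total)


a = to_indices(13)   # [0, 2, 3]
b = to_indices(11)   # [0, 1, 3]
product_indices = multiply_indices(a, b)
print(product_indices, from_indices(product_indices))  # prints: [0, 1, 2, 3, 7] 143
```

Every pair `(i, j)` contributes independently, which is what makes the pairwise additions natural candidates for distribution across workers; the final accumulation of the partial terms is the part this sketch leaves to Python's built-in integers.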
Citations: 0