2014 16th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing: Latest Papers

Catamorphism Generation and Fusion Using Coq
Simon Robillard
Catamorphisms are a class of higher-order functions that recursively traverse an inductive data structure to produce a value. An important result related to catamorphisms is the fusion theorem, which gives sufficient conditions to rewrite compositions of catamorphisms. We use the Coq proof assistant to automatically define a catamorphism and a fusion theorem according to an arbitrary inductive type definition. Catamorphisms are then used to define functional specifications and the fusion theorem is applied to derive efficient programs that match those specifications.
DOI: https://doi.org/10.1109/SYNASC.2014.32
Published: 2014-09-01
Citations: 0
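The fusion idea from the catamorphism abstract above can be illustrated outside Coq. The following is a minimal Python sketch for lists only; the names `cata`, `squares`, `total`, `spec`, and `fused` are hypothetical and do not come from the paper. It uses the usual statement of list fusion: if `h(f(x, y)) == g(x, h(y))` for all `x` and `y`, then `h(cata(f, e, xs)) == cata(g, h(e), xs)`.

```python
def cata(step, nil, xs):
    """List catamorphism (right fold): replace cons with `step` and [] with `nil`."""
    acc = nil
    for x in reversed(xs):
        acc = step(x, acc)
    return acc


# Specification: two traversals composed (build the list of squares, then sum it).
def squares(xs):
    return cata(lambda x, acc: [x * x] + acc, [], xs)


def total(xs):
    return cata(lambda x, acc: x + acc, 0, xs)


def spec(xs):
    return total(squares(xs))


# Fused program: a single traversal. The fusion condition holds with h = total,
# because total([x * x] + acc) == x * x + total(acc), and total([]) == 0.
def fused(xs):
    return cata(lambda x, acc: x * x + acc, 0, xs)


print(spec([1, 2, 3]), fused([1, 2, 3]))  # prints "14 14"
```

The fused version avoids building the intermediate list, which is the kind of efficiency gain the abstract refers to; the paper itself generates the catamorphism and the theorem from an arbitrary inductive type, which this sketch does not attempt.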
Load Scheduling in a Cloud Based Massive Video-Storage Environment
K. Bayyapu, Paul F. Fischer
We propose an architecture for a storage system for surveillance videos. Such systems have to handle massive numbers of incoming video streams and relatively few requests for replay. In such a system, scheduling the load (i.e., write requests) is essential to guarantee performance. A large-scale data-storage system (LSDSS) is an emerging hosting facility for video storage that receives a very high number of writes, while most of the videos are rarely or never watched. We discuss the design and implementation of the LSDSS and load scheduling in its autonomous storage environments, called datacenters. A datacenter (DC) is the basic unit of our LSDSS and has a self-management system to store data efficiently. An LSDSS consists of many DCs organized hierarchically, thereby decentralizing load-scheduling tasks. Because a DC has a simple design, load scheduling is particularly well suited to implementation in real-time video surveillance and allows scheduling decisions to be made. We also discuss experimental results that clearly show the advantage of this load scheduling over the widely known base load scheduling.
DOI: https://doi.org/10.1109/SYNASC.2014.54
Published: 2014-09-01
Citations: 4
Efficient Converting of Large Sparse Matrices to Quadtree Format
I. Šimeček, D. Langr, Jan Trdlicka
Computations with sparse matrices are widespread in scientific projects. The data format used strongly affects both performance and space efficiency. Commonly used storage formats (such as COO or CSR) are suitable neither for some numerical algebra operations (e.g., sparse matrix-vector multiplication), due to the required indirect addressing, nor for file I/O operations with sparse matrices, due to their high space complexity. In our previous papers, we showed that using a quadtree for these purposes is viable. In this paper, we present a completely new algorithm, based on a bottom-up approach, for converting matrices from common storage formats to the quadtree format. We derive the asymptotic complexity of the new algorithm, design parallel variants of the classical and the new algorithm, and discuss their performance.
DOI: https://doi.org/10.1109/SYNASC.2014.25
Published: 2014-09-01
Citations: 1
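To make the bottom-up idea in the quadtree abstract above concrete, here is a minimal Python sketch. It is not the authors' algorithm: it assumes a square matrix whose dimension `n` is a power of two, COO input given as `(row, col, value)` triples, and a hypothetical node layout in which each inner node is a dict from quadrant index (0 = NW, 1 = NE, 2 = SW, 3 = SE) to a child. Empty quadrants are simply absent.

```python
def coo_to_quadtree(n, entries):
    """Build a quadtree bottom-up from COO triples (row, col, value).

    Assumes n is a power of two. Returns None for an all-zero matrix.
    """
    # Level 0: every nonzero element is a leaf keyed by its coordinates.
    level = {(r, c): v for r, c, v in entries}
    size = 1
    while size < n:
        parents = {}
        for (r, c), child in level.items():
            quadrant = 2 * (r & 1) + (c & 1)  # position inside the parent block
            parents.setdefault((r >> 1, c >> 1), {})[quadrant] = child
        level = parents  # move one level up; coordinates are halved
        size *= 2
    return level.get((0, 0))


def lookup(tree, n, r, c):
    """Read element (r, c) by descending from the root; missing nodes are zeros."""
    node, size = tree, n
    while size > 1 and node is not None:
        size //= 2
        quadrant = 2 * (r >= size) + (c >= size)
        node = node.get(quadrant)
        r, c = r % size, c % size
    return 0.0 if node is None else node


tree = coo_to_quadtree(4, [(0, 0, 1.0), (3, 3, 2.0), (0, 1, 5.0)])
print(tree)                   # {0: {0: 1.0, 1: 5.0}, 3: {3: 2.0}}
print(lookup(tree, 4, 3, 3))  # 2.0
```

Each pass groups the nodes of one level under their parents, so the sketch performs one pass per tree level over at most as many nodes as there are nonzeros.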
Performance Evaluation of Fuzzy Automata Using VHDL Simulation
Daniel Butoianu, Doru Todinca
Fuzzy automata were proposed in the late 1960s and were among the first domains extended through the framework of fuzzy logic. Fuzzy automata (FA) are fuzzy-logic extensions of classic, or crisp, automata. Despite their long existence, fuzzy automata have few real-life applications, most of them related to fuzzy languages, while crisp automata are used in many devices that are part of everyday life, from coffee machines to ATMs. In this paper we continue our previous work on investigating the performance of fuzzy automata by VHDL simulation. The new aspect compared to our previous work is the focus on applications of FA.
DOI: https://doi.org/10.1109/SYNASC.2014.34
Published: 2014-09-01
Citations: 2
Guiding Random Test Generation for Intra-class Dataflow Coverage
Petru Florin Mihancea, Edit Mercedes Mera-Batiz, M. Minea
Automatic generation of a good test suite is difficult, especially for object-oriented software. Feedback-directed random test generation is an approach that can achieve good branch coverage and has been used as a basis to systematically construct suites for testing realistic Java programs. We augment this random test generation method to create test suites that satisfy an intra-class data-flow coverage criterion, which is highly relevant for object orientation although it is rarely addressed or achieved by tools in practice. We show that our approach can be used on real object-oriented software and that the technique for guiding test generation produces an increase in coverage.
DOI: https://doi.org/10.1109/SYNASC.2014.28
Published: 2014-09-01
Citations: 0
Solving Parametric Sparse Linear Systems by Local Blocking, II
Tateaki Sasaki, D. Inaba, F. Kako
The present author, Inaba, and Kako proposed local blocking in a recent paper [6] for solving parametric sparse linear systems appearing in industry, so that the obtained solution is suited for determining optimal parameter values. They employed a graph-theoretic treatment; the main points of their method are to select strongly connected subgraphs satisfying several restrictions and to form the so-called "characteristic system". The method of selecting subgraphs is, however, complicated and seems unsuited to big systems. In this paper, assuming that a small number of representative vertices of the characteristic system are specified by the user, we give a simple method of finding a characteristic system. Then, we present a simple and satisfactory method of decomposing the given graph into strongly connected subgraphs. The method applies the SCC (strongly connected component) decomposition algorithm. The complexity of the new method is $O(\#\text{vertices} + \#\text{edges})$. We test our method successfully on three artificially constructed graphs of 100 vertices that show different but typical features.
DOI: https://doi.org/10.1145/2733693.2733712
Published: 2014-09-01
Citations: 1
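The abstract above relies on SCC decomposition running in time linear in the number of vertices and edges. As an illustration of that building block only (not of the authors' characteristic-system method), here is a sketch of Kosaraju's two-pass algorithm in Python. It assumes the graph is a dict that maps every vertex to a list of its successors; the example graph is made up.

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm. `graph` maps every vertex to a list of successors."""
    # Pass 1: iterative DFS, recording vertices in order of completion.
    order, seen = [], set()
    for root in graph:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(graph[root]))]
        while stack:
            v, successors = stack[-1]
            for w in successors:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(graph[w])))
                    break
            else:  # all successors explored: v is finished
                order.append(v)
                stack.pop()

    # Build the reversed graph.
    reverse = {v: [] for v in graph}
    for v, succs in graph.items():
        for w in succs:
            reverse[w].append(v)

    # Pass 2: DFS on the reversed graph in reverse completion order;
    # each search tree is one strongly connected component.
    components, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component, stack = set(), [root]
        while stack:
            v = stack.pop()
            component.add(v)
            for w in reverse[v]:
                if w not in assigned:
                    assigned.add(w)
                    stack.append(w)
        components.append(component)
    return components


example = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: []}
print(strongly_connected_components(example))  # [{6}, {1, 2, 3}, {4, 5}]
```

Every vertex and edge is visited a constant number of times across the two passes, which gives the linear bound quoted in the abstract.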
New Arithmetic Algorithms for Hereditarily Binary Natural Numbers
Paul Tarau
Hereditarily binary numbers are a tree-based number representation derived from a bijection between natural numbers and iterated applications of two simple functions corresponding to bijective base 2 numbers. This paper describes several new arithmetic algorithms on hereditarily binary numbers that, while within constant factors from their traditional counterparts for their average case behavior, make tractable important computations that are impossible with traditional number representations.
DOI: https://doi.org/10.1109/SYNASC.2014.23
Published: 2014-09-01
Citations: 3
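The abstract above builds on bijective base-2 numbers, in which every natural number has exactly one representation using the digits 1 and 2 (zero is the empty digit string). The Python sketch below shows only that underlying bijection, not the tree-based hereditarily binary representation itself. The function names are hypothetical, and reading the two "simple functions" as `x -> 2x + 1` and `x -> 2x + 2` is an assumption made for illustration.

```python
def o(x):
    """Append the bijective digit 1 (assumed first 'simple function')."""
    return 2 * x + 1


def i(x):
    """Append the bijective digit 2 (assumed second 'simple function')."""
    return 2 * x + 2


def to_bijective_base2(n):
    """Digits of n in bijective base 2, most significant first; 0 gives []."""
    digits = []
    while n > 0:
        d = 2 - (n % 2)      # odd n ends in digit 1, even n ends in digit 2
        digits.append(d)
        n = (n - d) // 2
    return digits[::-1]


def from_bijective_base2(digits):
    """Rebuild n by iterating o and i, starting from 0."""
    n = 0
    for d in digits:
        n = o(n) if d == 1 else i(n)
    return n


print(to_bijective_base2(6))         # [2, 2]
print(from_bijective_base2([2, 1]))  # 5
```

Because no digit is zero, there are no leading-zero duplicates, which is what makes the correspondence with natural numbers a bijection.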
Enhanced Gradient Descent Algorithms for Complex-Valued Neural Networks
Călin-Adrian Popa
In this paper, enhanced gradient descent learning algorithms for complex-valued feedforward neural networks are proposed. The best-known such enhanced algorithms for real-valued neural networks are Quickprop, resilient backpropagation, delta-bar-delta, and SuperSAB, so it is natural to extend these learning methods to complex-valued neural networks as well. The complex variants of these four algorithms are presented and then exemplified on various function approximation problems, as well as on channel equalization and time series prediction applications. Experimental results show an important improvement in training and testing error over classical gradient descent and gradient descent with momentum.
DOI: https://doi.org/10.1109/SYNASC.2014.44
Published: 2014-09-01
Citations: 5
A Distributed-Memory Parallelization of a Shared-Memory Parallel Ensemble Kalman Filter
M. Rostami, H. M. Bucker, C. Vogt, Ralf Seidler, David Neuhauser, V. Rath
Inverse problems arise in various areas of science and engineering. These problems are not only difficult to solve numerically, but they also require a large amount of computer resources in both time and memory. It is therefore not surprising that inverse problems are often solved using techniques from high-performance computing. We consider the parallelization of an inverse problem in the field of geothermal reservoir engineering. In this particular scientific application, the underlying software package is already parallelized using the shared-memory programming paradigm OpenMP. Here, we present an extension of this parallelization to distributed memory, enabling a hybrid OpenMP/MPI parallelization. The situation is different from the standard way of hybrid parallel programming because the data structures of the OpenMP-parallelized code differ from those in the serial implementation. We exploit this transformation of the data structures in our distributed-memory strategy for parallelizing an ensemble Kalman filter, a particular method for the solution of inverse problems. We describe this novel parallelization strategy, introduce a performance model, and present timing results on a compute cluster using nodes with 2 sockets, each equipped with Intel Xeon X5675 Westmere EP processors with 6 cores. All timing results are obtained with a pure MPI parallelization without using any OpenMP threads.
DOI: https://doi.org/10.1109/SYNASC.2014.67
Published: 2014-09-01
Citations: 1
On Corank Two Edge-Bipartite Graphs and Simply Extended Euclidean Diagrams
Marcin Gąsiorek, D. Simson, Katarzyna Zając
We continue the Coxeter spectral study of finite connected loop-free edge-bipartite graphs $\Delta$ with $m+2 \ge 3$ vertices (a class of signed graphs), started in [SIAM J. Discrete Math., 27 (2013), 827-854] by means of the complex Coxeter spectrum $\mathbf{specc}_{\Delta} \subseteq \mathbb{C}$ and presented in our talks given at SYNASC12 and SYNASC13. Here, we study non-negative edge-bipartite graphs of corank two, in the sense that the symmetric Gram matrix $G_{\Delta} \in M_{m+2}(\mathbb{Z})$ of $\Delta$ is positive semi-definite of rank $m \ge 1$. Extending each of the simply laced Euclidean diagrams $\tilde{A}_m$ ($m \ge 1$), $\tilde{D}_m$ ($m \ge 4$), $\tilde{E}_6$, $\tilde{E}_7$, $\tilde{E}_8$ by one vertex, we construct a family of loop-free corank-two diagrams $\tilde{A}_m^2$, $\tilde{D}_m^2$, $\tilde{E}_6^2$, $\tilde{E}_7^2$, $\tilde{E}_8^2$ (called simply extended Euclidean diagrams) such that they classify all connected corank-two loop-free edge-bipartite graphs $\Delta$ with $m+2 \ge 3$ vertices, up to $\mathbb{Z}$-congruence $\Delta \sim_{\mathbb{Z}} \Delta'$. Here $\Delta \sim_{\mathbb{Z}} \Delta'$ means that $G_{\Delta'} = B^{tr} \cdot G_{\Delta} \cdot B$ for some $B \in M_{m+2}(\mathbb{Z})$ such that $\det B = \pm 1$. We present algorithms that generate all such edge-bipartite graphs of a given size $m+2 \ge 3$, together with their Coxeter polynomials and the reduced Coxeter numbers, using symbolic and numeric computer calculations in Python. Moreover, we prove that for any corank-two connected loop-free edge-bipartite graph $\Delta$ with $m+2 \ge 3$ vertices, there exists a simply extended Euclidean diagram $D$ such that $\Delta \sim_{\mathbb{Z}} D$.
DOI: https://doi.org/10.1109/SYNASC.2014.17
Published: 2014-09-01
Citations: 12
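The Z-congruence relation used in the abstract above, a Gram matrix transformed as B^tr · G · B with det B = ±1, can be checked with plain integer arithmetic. The Python sketch below illustrates only that definition. The helper names are hypothetical, and the 2×2 matrices are made-up examples rather than Gram matrices taken from the paper.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def det(M):
    """Integer determinant by Laplace expansion (adequate for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def z_congruent_image(G, B):
    """Return B^tr * G * B, requiring B to be an integer matrix with det = +-1."""
    if det(B) not in (1, -1):
        raise ValueError("B must have determinant +1 or -1")
    return matmul(matmul(transpose(B), G), B)


G = [[1, -1], [-1, 1]]  # illustrative positive semi-definite matrix of rank 1
B = [[1, 1], [0, 1]]    # integer matrix with det B = 1
print(z_congruent_image(G, B))  # [[1, 0], [0, 0]]
```

Requiring det B = ±1 means the inverse of B is again an integer matrix, so the relation is symmetric and preserves rank and semi-definiteness.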