Proceedings. Advances in Parallel and Distributed Computing: Latest Articles

Utilization of disk drives for RAID
Pub Date : 1997-03-19 DOI: 10.1109/APDC.1997.574031
D. Feng, Xinrong Zhou, Hai Jin, Jiangling Zhang
A stochastic Petri net (SPN) model of RAID-5 is constructed. From the model and its isomorphic Markov chain, the average utilization of the disk drives in a RAID can be calculated for small-write and large I/O requests. This provides a good method for evaluating RAID performance.
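As a rough illustration of how a utilization figure falls out of a Markov chain, the sketch below computes the stationary distribution of a small discrete-time chain by power iteration and sums the probability of the "busy" states. The two-state idle/busy chain and its transition probabilities are hypothetical and are not the SPN model or the chain from the paper.

```python
def stationary_distribution(P, iterations=200):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by repeatedly applying pi <- pi * P, starting from the uniform vector."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def utilization(pi, busy_states):
    """Long-run fraction of time spent in the given busy states."""
    return sum(pi[s] for s in busy_states)


# Hypothetical disk model: state 0 = idle, state 1 = busy.
# An idle disk becomes busy with probability 0.3 per step;
# a busy disk finishes its request with probability 0.6 per step.
P = [
    [0.7, 0.3],
    [0.6, 0.4],
]
pi = stationary_distribution(P)
disk_utilization = utilization(pi, busy_states=[1])
```

For this two-state chain the closed form is 0.3 / (0.3 + 0.6), so the iteration can be checked by hand; a realistic RAID model would have many more states.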
Citations: 1
Fast parallel algorithm for finding the kth longest path in a tree
Pub Date : 1997-03-19 DOI: 10.1109/APDC.1997.574028
Hong Shen
We present a fast parallel algorithm running in O(log^2 n) time on a CREW PRAM with O(n) processors for finding the kth longest path in a given tree of n vertices (with Θ(n^2) intervertex distances). Our algorithm is obtained by efficient parallelization of a sequential algorithm that is a variant of both N. Megiddo et al.'s algorithm and G.N. Fredrickson et al.'s algorithm, based on centroid decomposition of the tree and a succinct representation of the set of intervertex distances. With the same time and space bounds as the best known result, our sequential algorithm maintains a decomposition tree of shorter length.
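The centroid decomposition mentioned in the abstract rests on finding a centroid: a vertex whose removal leaves no component with more than half of the vertices. The sketch below is a plain sequential illustration of that single step, not the parallel algorithm or the sequential variant from the paper; the function name and the edge-list input format are assumptions made for the example.

```python
from collections import defaultdict


def find_centroid(n, edges):
    """Return a centroid of a tree on vertices 0..n-1 given as an edge list."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative DFS from vertex 0: records each vertex's parent and a
    # visiting order in which every parent precedes its children.
    parent = [-1] * n
    seen = [False] * n
    seen[0] = True
    order = []
    stack = [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v] = u
                stack.append(v)

    # Subtree sizes, accumulated from the leaves upward.
    size = [1] * n
    for u in reversed(order):
        if parent[u] != -1:
            size[parent[u]] += size[u]

    # A centroid leaves no component larger than n // 2 when removed.
    for u in range(n):
        largest = n - size[u]  # component containing u's parent
        for v in adj[u]:
            if parent[v] == u:
                largest = max(largest, size[v])
        if largest <= n // 2:
            return u
    return None  # unreachable for a valid tree


path_centroid = find_centroid(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
star_centroid = find_centroid(5, [(0, 1), (0, 2), (0, 3), (0, 4)])
```

Applying the same step recursively to each remaining component yields a decomposition tree of logarithmic depth.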
Citations: 2
Design and analysis of an efficient algorithm for coordinated checkpointing in distributed systems
Pub Date : 1997-03-19 DOI: 10.1109/APDC.1997.574042
Jiannong Cao, W. Jia, X. Jia, T. Cheung
A synchronous checkpointing algorithm coordinates a set of processes in taking checkpoints in such a way that the set of local checkpoints always forms part of a consistent global system state. Whenever a process p requests to take a checkpoint, a set of processes, called the cohorts set of p, must be checked, and some of them may also have to take checkpoints in order to preserve system consistency. Although several synchronous checkpointing algorithms have been proposed in the literature, most of them do not address performance. In this paper we propose an efficient distributed algorithm for synchronous checkpointing. A proof of correctness and an analysis of the algorithm's efficiency are presented. The algorithm is shown to have better message and time complexity than existing algorithms. The proposed method can also be applied to improve the performance of rollback operations, which always require synchronization of the interdependent processes.
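One simplified way to picture why a checkpoint request can spread to other processes is as a closure over message dependencies. The sketch below is not the algorithm proposed in the paper. It assumes, purely for illustration, that a process must checkpoint whenever a checkpointing process has received a message from it since its own last checkpoint; the names `processes_to_checkpoint` and `received_from` are hypothetical.

```python
from collections import deque


def processes_to_checkpoint(initiator, received_from):
    """Return the set of processes that must checkpoint with the initiator.

    received_from[p] is the set of processes from which p has received
    messages since p's last checkpoint (an assumed bookkeeping structure).
    """
    must_checkpoint = {initiator}
    queue = deque([initiator])
    while queue:
        p = queue.popleft()
        for q in received_from.get(p, ()):
            # If p records the receipt of a message, q must record sending it,
            # otherwise the global state would contain an orphan message.
            if q not in must_checkpoint:
                must_checkpoint.add(q)
                queue.append(q)
    return must_checkpoint


received_from = {
    "p": {"q"},
    "q": {"r"},
    "r": set(),
    "s": {"p"},
}
involved = processes_to_checkpoint("p", received_from)
```

Here process s is not forced to checkpoint: it received a message from p, but nothing it sent is recorded in any of the new checkpoints.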
Citations: 5
Implementing a software virtual shared memory on PVM
Pub Date : 1997-03-19 DOI: 10.1109/APDC.1997.574032
Y. Dou, Zhengbin Pang, Xingming Zhou
This paper introduces GKD-VSM, a software virtual shared memory on PVM. It provides a shared-memory parallel programming model in FORTRAN for distributed-memory environments. To reduce software overhead, GKD-VSM takes several approaches, including a special-purpose user-level multithreading scheme and a Prefetch&Poststore scheme at synchronization points. The latencies of basic operations are presented.
Citations: 1
An improved parallel algorithm for Delaunay triangulation on distributed memory parallel computers
Pub Date : 1997-03-19 DOI: 10.1109/APDC.1997.574023
Sangyoon Lee, Chan-Ik Park, Chan-Mo Park
Delaunay triangulation has been widely used in applications such as volume rendering, shape representation, and terrain modeling. Its main disadvantage is the large computation time required to obtain the triangulation of an input point set. This time can be reduced by using more than one processor, and several parallel algorithms for Delaunay triangulation have been proposed. In this paper, we propose an improved parallel algorithm for Delaunay triangulation, which partitions the bounding convex region of the input point set into a number of regions by using Delaunay edges and generates Delaunay triangles in each region by applying an incremental construction approach. Partitioning by Delaunay edges makes it possible to eliminate the merging step required for integrating subresults. Experiments show that the proposed algorithm has good load balance and is more efficient than Cignoni et al.'s algorithm (1993) and our previous algorithm (1996).
Citations: 0
"SEQ OF PAR" style structured parallel programming
Pub Date : 1997-03-19 DOI: 10.1109/APDC.1997.574017
Weigang Yuan, Yongqiang Sun
This paper presents a new structured parallel programming model, "SEQ OF PAR", based on the Communication Closed Layer (CCL) principle of causal composition for parallel programs and the Bird-Meertens formalism (BMF) of locality-based parallel computation. The model is intended to support more general, architecture-independent parallel programming. It provides a structured approach to integrating task (or process) parallelism and data parallelism in one framework. The well-founded algebra of CCL and BMF also makes it possible to derive, optimize, and verify parallel programs through algebraic transformations. Experimental results show that this programming model is very promising for obtaining efficient, portable parallel code.
Citations: 2
Small, scalable, and efficient, microkernels for highly parallel computers are possible: Cosy as an example
Pub Date : 1997-03-19 DOI: 10.1109/APDC.1997.574033
Roger Butenuth
Although highly parallel distributed-memory computers have existed for several years, the operating systems used on them have not fit their requirements very well. Most of these operating systems were designed for sequential, shared-memory parallel, or distributed computers. Examples are Unix on the IBM SP/2 and Mach on the Intel Paragon. This results in poor scalability, caused either by inefficient communication primitives designed for wide area networks or by wasted resources due to huge kernels (e.g., 8 MB per node is reported for Mach on the Paragon), which is especially harmful in highly parallel systems with hundreds or thousands of nodes. With Cosy (Concurrent Operating System) we have shown that a well-structured and carefully designed system can be small (70 Kb for the kernel, 372 total memory usage per node), efficient (33 μs for communication), and scalable (applications run efficiently on up to 1024 processors).
Citations: 0
Parallel design and implementation of SOM neural computing model in PVM environment of a distributed system
Pub Date : 1997-03-19 DOI: 10.1109/APDC.1997.574010
Huiwei Guan, T. Cheung, Chi-Kwong Li, Songnian Yu
A parallel design and implementation of the Self-Organizing Map (SOM) neural computing model is proposed. The parallel design is implemented in a Parallel Virtual Machine (PVM) environment on a distributed system. A practical realization of the SOM algorithm is investigated, the construction of the computing module in PVM is discussed, communication methods and an optimization of message passing between multiple processes are proposed, and the parallel programming technique and a PVM implementation of the SOM neural computing model are presented and discussed in detail.
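For readers unfamiliar with SOM, the sketch below shows one sequential online training step on a 1-D grid of units: find the best matching unit for an input vector, then pull every unit toward the input in proportion to a Gaussian neighborhood function. It is a generic textbook-style illustration and not the parallel PVM design from the paper; the function names, the 1-D grid, and the `lr` and `sigma` defaults are assumptions.

```python
import math


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    def squared_distance(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: squared_distance(weights[i]))


def som_step(weights, x, lr=0.5, sigma=1.0):
    """Apply one online SOM update; return the new weights and the BMU index."""
    bmu = best_matching_unit(weights, x)
    new_weights = []
    for i, w in enumerate(weights):
        # Gaussian neighborhood: 1.0 at the BMU, decaying with grid distance.
        h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
        new_weights.append([wi + lr * h * (xi - wi) for wi, xi in zip(w, x)])
    return new_weights, bmu


weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
updated, bmu = som_step(weights, [1.0, 1.2])
```

The search for the best matching unit and the per-unit updates are independent across units, which is the kind of work a parallel implementation can divide among processes.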
Citations: 19
Experiments on heterogeneous scheduling via Callback
Pub Date : 1997-03-19 DOI: 10.1109/APDC.1997.574044
Xinda Lu, Qianni Deng, Fei Zheng
In this paper, an approach that uses the least squares method to implement Callback is presented. Experiments showed that Callback is effective not only in high-level metasystems but also in low-level heterogeneous systems.
Citations: 0
An improvement on data dependence analysis supporting software pipelining technique
Pub Date : 1997-03-19 DOI: 10.1109/APDC.1997.574058
Chihong Zhang, Zhizhong Tang
The accuracy of the data dependence analysis of a client program determines to what extent the compiler can exploit the program's potential parallelism. Most current work on dependence analysis is based on the dependence equation and constraint inequalities on loop variable bounds (sometimes augmented with the direction vector). Unfortunately, these methods cannot detect dependences exactly, which may greatly affect the parallel optimization of the client program when software pipelining is employed. In this paper, we give a more effective constraint inequality that reflects the characteristics of software pipelining and improves the power of most current dependence analysis algorithms when they are applied to software pipelining.
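To make the notion of a dependence equation concrete, the sketch below applies the classic GCD test to two array references `A[a1*i + c1]` and `A[a2*j + c2]`. It is a standard conservative test that ignores loop bounds, and it is not the constraint inequality proposed in the paper; the function name and parameter names are assumptions.

```python
from math import gcd


def may_depend(a1, c1, a2, c2):
    """GCD test for references A[a1*i + c1] and A[a2*j + c2].

    The dependence equation a1*i - a2*j = c2 - c1 has an integer solution
    only if gcd(a1, a2) divides c2 - c1. Returning True means a dependence
    cannot be ruled out; returning False means none exists.
    """
    g = gcd(a1, a2)
    if g == 0:
        # Both subscripts are constants: they touch the same element
        # only if the constants are equal.
        return c1 == c2
    return (c2 - c1) % g == 0


# A[2*i] written and A[2*j + 1] read: even versus odd elements never overlap.
even_odd = may_depend(2, 0, 2, 1)
# A[2*i] written and A[4*j + 2] read: i = 2*j + 1 satisfies the equation.
overlapping = may_depend(2, 0, 4, 2)
```

Because the test ignores the loop bounds, a True result can be a false positive, which is why bound inequalities are added on top of the dependence equation.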
Citations: 1