[1993] Proceedings Seventh International Parallel Processing Symposium: Latest Articles

Dynamic embeddings of trees and quasi-grids into hyper-de Bruijn networks
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262823
Sabine R. Öhring, Sajal K. Das
This paper deals with optimal embeddings of various topologies into the hyper-de Bruijn network, which is a combination of the well-known hypercube and the de Bruijn graph. In particular, the authors develop modular embeddings of complete binary trees and other tree-related graphs, and dynamic task allocation embeddings of dynamically evolving arbitrary binary trees. Additionally, an optimal embedding of butterflies and a subgraph embedding of cube-connected cycles are presented. They also consider how to dynamically embed evolving grid structures (so-called quasi-grids) into hyper-de Bruijn networks. The results are important in mapping data and algorithm structures onto multiprocessor networks.
Citations: 13
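To make the target topology of the preceding abstract concrete, the following is a minimal sketch of one way to build a hyper-de Bruijn graph. The abstract only says the network is "a combination" of a hypercube and a de Bruijn graph; the sketch assumes the usual product construction, in which a node is a pair (x, y) with x an m-bit hypercube label and y an n-bit binary de Bruijn label, and an edge changes exactly one of the two parts. The function names and the undirected, self-loop-free treatment of the de Bruijn part are illustrative choices, not taken from the paper.

```python
from itertools import product


def hypercube_neighbors(x, m):
    """Neighbors of node x in the m-dimensional hypercube: flip one bit."""
    return {x ^ (1 << i) for i in range(m)}


def de_bruijn_neighbors(y, n):
    """Neighbors of y in the undirected binary de Bruijn graph on n-bit labels."""
    mask = (1 << n) - 1
    nbrs = set()
    for b in (0, 1):
        nbrs.add(((y << 1) & mask) | b)      # shift left, append bit b
        nbrs.add((y >> 1) | (b << (n - 1)))  # shift right, prepend bit b
    nbrs.discard(y)  # drop self-loops (they occur at 00...0 and 11...1)
    return nbrs


def hyper_de_bruijn(m, n):
    """Adjacency sets of the assumed product graph on pairs (x, y)."""
    adj = {}
    for x, y in product(range(1 << m), range(1 << n)):
        cube_moves = {(x2, y) for x2 in hypercube_neighbors(x, m)}
        shift_moves = {(x, y2) for y2 in de_bruijn_neighbors(y, n)}
        adj[(x, y)] = cube_moves | shift_moves
    return adj
```

Under this assumption the graph has 2^(m+n) nodes and every node has degree at most m + 4, which is why the combination can keep the degree low while still containing hypercube subgraphs.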
OCCAM prototyping of massively parallel applications from colored Petri-nets
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262772
F. Breant, Jean-François Peyre
The authors present a technique to build a massively parallel application from a formal description. They use the colored Petri-net formalism to model applications. This formalism allows them to describe parallel applications concisely. Theoretical results on this formalism contribute to proving the correctness of the description before implementation. Furthermore, they use linear invariants to decompose the model into interacting state machines, which are easy to implement. An important feature of the approach is the use of color to map state machines and to distribute data and communication onto a formal architecture description.
Citations: 4
Towards optimal parallel radix sorting
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262880
R. Vaidyanathan, C. Hartmann, P. Varshney
The authors propose a radix sorting algorithm for n m-bit numbers (where $m = \Omega(\log n)$ and m is polynomially upper-bounded in n) that runs in $O(t(n) \log m)$ time on any PRAM with $m\,p(n)/(\log n \log m)$ processors, each $O(\log n)$ bits wide. Here p(n) and t(n) are the number of processors and the time needed by any deterministic algorithm to stably sort n numbers of $\log n$ bits each (integer sorting) on the same type of PRAM as used by the radix sorting algorithm. The proposed algorithm has the same factor of inefficiency (if any) as the integer sorting algorithm it uses.
Citations: 8
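The radix sorting abstract above builds a sort of long keys out of a stable integer sort on short keys. The sketch below illustrates only that building-block idea, sequentially: it splits each m-bit key into digits of roughly log2(n) bits and applies a stable sort once per digit, least significant first. It is not the paper's PRAM algorithm and does not achieve its $O(t(n) \log m)$ bound, since it makes about m / log2(n) passes; the function names and the bucket-based stable sort are assumptions made for illustration.

```python
def stable_sort_by_digit(keys, shift, digit_bits):
    """Stable bucket sort on the digit_bits-wide digit starting at bit `shift`."""
    radix = 1 << digit_bits
    buckets = [[] for _ in range(radix)]
    for k in keys:
        # Appending preserves the input order within each bucket (stability).
        buckets[(k >> shift) & (radix - 1)].append(k)
    return [k for bucket in buckets for k in bucket]


def radix_sort(keys, m):
    """Sort non-negative m-bit integers, least significant digit first."""
    n = max(len(keys), 2)
    digit_bits = max(1, (n - 1).bit_length())  # roughly log2(n) bits per digit
    for shift in range(0, m, digit_bits):
        keys = stable_sort_by_digit(keys, shift, digit_bits)
    return keys
```

Stability of the per-digit sort is what makes the passes compose: ties on the current digit keep the order established by the less significant digits.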
Impact of multiple consumption channels on wormhole routed k-ary n-cube networks
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262874
S. Balakrishnan, D. Panda
This paper presents a performance evaluation of multiple consumption channels in wormhole-routed k-ary n-cube networks. The hotspots produced by non-uniform traffic patterns result in a consumption bottleneck. The effects of this bottleneck are examined. The interplay between the number of consumption channels, the underlying routing algorithm, and the topology is examined from the perspective of overall network performance. Two different communication patterns, all-to-one and non-uniform traffic, are used in the study. The authors show that the severity of the consumption bottleneck increases as the degree of adaptiveness in a routing algorithm increases, i.e., going from oblivious to partially adaptive to fully adaptive routing. They conclude that multiple consumption channels (up to 4 for 2D, 3D and 4D meshes and up to 8 for the 8-cube) are desirable to reduce the severity of this bottleneck and to exploit the advantages of adaptive routing schemes.
Citations: 21
Global semigroup operations in faulty SIMD hypercubes
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262794
C. Raghavendra, M. Sridhar
The authors consider the problem of computing a global semigroup operation (such as addition and multiplication) on a faulty hypercube. In particular, they study the problem of performing such an operation in an n-dimensional SIMD hypercube $Q_n$ with up to n-1 node and/or link faults. In an SIMD hypercube, during a communication step, nodes can exchange information with their neighbors only across a specific dimension. Given a set of at most n-1 faults, they develop an ordering $d_1, d_2, \ldots, d_n$ of the n dimensions that depends on where the faults are located. An important and useful property of this dimension ordering is the following: if the n-cube is partitioned into k-subcubes using the first k dimensions of this ordering, namely $d_1, d_2, \ldots, d_k$, for any $1 \le k \le n$, then each k-subcube in the partition contains at most k-1 faults. They use this result to develop algorithms for global sum. This ordering can be obtained in the presence of node as well as link faults. They also consider larger fault sizes, and show how to extend the dimension ordering theorem to handle up to $\binom{n}{2}$ faults. Using this result, it seems possible to obtain even more fault-tolerant algorithms for the semigroup operation problem.
Citations: 4
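For readers unfamiliar with the SIMD communication model in the semigroup abstract above, the sketch below simulates the fault-free baseline only: in step d every node exchanges its partial result with its neighbor across dimension d and combines the two, so after n steps all nodes hold the global result. It does not model faults or the paper's fault-dependent dimension ordering; the function name and the list-based simulation are assumptions for illustration.

```python
import operator


def hypercube_all_reduce(values, op=operator.add):
    """Simulate a global semigroup operation on a fault-free hypercube.

    values[i] is the operand held by node i; len(values) must be a power
    of two. Returns the list of per-node results after all n steps.
    """
    size = len(values)
    if size == 0 or size & (size - 1):
        raise ValueError("number of nodes must be a power of two")
    n = size.bit_length() - 1
    vals = list(values)
    for d in range(n):
        bit = 1 << d
        # One SIMD step: every node pairs with its neighbor across dimension d.
        # Combining lower-numbered partner first keeps both partners consistent
        # even when op is associative but not commutative.
        vals = [op(vals[i & ~bit], vals[i | bit]) for i in range(size)]
    return vals
```

Because each step uses a single dimension for all nodes at once, a faulty node or link on that dimension corrupts its partner's partial result, which is the difficulty the dimension-ordering property in the abstract is designed to contain.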
A high speed dataflow processing element and its performance compared to a von Neumann mainframe
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262851
J. N. Coleman
The Event Processor / 3 is a dataflow processing element designed for high performance over a range of general computing tasks. Using a multithreading technique, program parallelism is exploited by interleaving threads onto successive pipeline stages. It may also be used as an element in a multiprocessor system. This paper describes the philosophy and design of the machine, and presents the results of detailed simulations of the performance of a single processing element. The performance is broken down into three factors: clock period, cycles per instruction, and instructions per program. Each factor is compared with the measured performance of an advanced von Neumann computer running equivalent code. It is shown that the dataflow processor compares favourably, given a reasonable degree of parallelism in the program.
Citations: 4
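The three factors named in the dataflow abstract above combine in the standard processor performance equation, written out here for reference. The symbols are assumptions introduced for illustration and do not come from the paper: $T_{\text{exec}}$ is the execution time of a program, $N_{\text{instr}}$ its instruction count, $\text{CPI}$ the average cycles per instruction, and $t_{\text{clk}}$ the clock period.

```latex
T_{\text{exec}} = N_{\text{instr}} \times \text{CPI} \times t_{\text{clk}}
```

Comparing two machines factor by factor, as the abstract describes, shows where an advantage comes from: a design can afford more instructions per program if it gains enough in cycles per instruction or clock period.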
Permutation on the mesh with reconfigurable bus: algorithms and practical considerations
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262896
Yen-Wen Lu, J. Burr, A. Peterson
Permutation is a common problem in both computation and communication. The authors add buses to mesh-connected multiprocessors and introduce tokens to control the buses. They propose using the mesh with a segmented reconfigurable bus to increase the performance of data routing. A segmented reconfigurable bus not only uses the bus token more efficiently than a traditional bus, but also reduces interconnection delay. The authors choose the segment length to balance the latency and throughput of the system for better performance. In simulation, the mesh with segmented reconfigurable bus finishes an N×N permutation in 0.6065 N steps on average.
Citations: 8
Reconfiguration of binary trees in faulty hypercubes
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262916
P. Yang, C. Raghavendra
The authors present a distributed scheme for reconfiguration of embedded binary trees in hypercubes. Their scheme can reconfigure around any 3n/2 faulty nodes in O(n) time in an n-dimensional hypercube. Their technique, which is based on a key concept called degree of occupancy, can be generalized to any task graph.
Citations: 11
Experimental evidence for the power of random sampling in practical parallel algorithms
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262819
M. R. Ghouse, M. Goodrich
Recent results in parallel algorithm theory have shown random sampling to be a powerful technique for achieving efficient bounds on the expected asymptotic running time of parallel algorithms for a number of important problems. The authors show experimentally that randomization is also a powerful practical technique in the design and implementation of parallel algorithms. Random sampling can be used to design parallel algorithms with fast expected run times, which meet or beat the run times of more conventional methods on a variety of benchmark tests. The constant factors of proportionality in the run times are small, and, most importantly, the expected work (and hence running time) avoids worst cases due to input distribution. They justify the approach through experimental results obtained on a Connection Machine CM-2 for a specific problem, namely segment intersection reporting, and explore the effect of varying the parameters of the method.
Citations: 1
Mapping to reduce contention in multiprocessor architectures
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262889
L. Schwiebert, D. Jayasimha
Reducing communication overhead has been widely recognized as a requirement for achieving efficient mappings that substantially reduce the execution time of parallel algorithms. This paper presents an iterative heuristic for static mapping of parallel algorithms to architectures. Special attention is given to measuring and reducing channel contention. Experimental results are used to show the effects of channel contention for packet-switched networks and the improvement realized by the authors' heuristic. They also present preliminary results for wormhole-routed networks.
Citations: 7