Data partitioning schemes for the parallel implementation of the revised simplex algorithm for LP problems
Usha Sridhar, A. Basu
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262910
The parallel implementation of the revised simplex algorithm (RSA) using eta-factorization holds the promise of significant improvement in execution time, owing to the high degree of parallelism available within an iteration of the algorithm. However, the scheme employed to partition key data structures on a distributed memory parallel processor has a great impact on the achievable performance. The paper explores the trade-offs between block-row and block-column partitioning schemes for the matrix of constraint coefficients vis-a-vis the communication overheads and the granularity of the parallel computations. The results of an approximate analysis of the compute-communication balance are compared with measurements from a practical implementation of both partitioning schemes on C-DAC's PARAM 8000 distributed memory parallel processor.
A load balancing strategy for prioritized execution of tasks
Amitabh Sinha, L. Kalé
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262887
Load balancing is a critical factor in achieving optimal performance in parallel applications where tasks are created dynamically. In many computations, such as state-space search problems, tasks have priorities, and solutions may be found more efficiently if these priorities are adhered to during parallel execution. For such tasks, a load balancing scheme that only seeks to equalize load, without also spreading high-priority tasks over the entire system, can concentrate the high-priority tasks on a few processors (even when the load is balanced), leaving the remaining processors to execute low-priority work. In such situations a scheme is needed that balances both load and high-priority tasks over the system. The authors describe the development of a more efficient prioritized load balancing strategy.
Scheduling a computational dag on a parallel system with communication delays and replication of node execution
P. Markenscoff, Yong Yuan Li
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262865
The authors consider the problem of optimally scheduling the subtasks of a computational task modeled by a dag (directed acyclic graph) on parallel systems with identical processors. Execution of the subtasks (nodes) must satisfy precedence constraints, which are met via data exchanges among processors that introduce communication delays. The optimization criterion is minimization of the processing time; the authors assume that there is no restriction on the number of processors and that a node may be replicated. They prove that the optimal scheduling problem can be solved in polynomial time when the computational graph is a two-level dag. For a general dag (the problem is NP-complete) they develop an algorithm that significantly reduces the search space relative to exhaustive search and runs quickly in many cases.
A trip-based multicasting model for wormhole-routed networks with virtual channels
Y. Tseng, D. Panda, T. Lai
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262893
This paper considers the single-source and multi-source multicasting problem in wormhole-routed networks. A general trip-based model is proposed for any network having at least two virtual channels per physical channel. The underlying concept of this model is a node sequence called a skirt, which always exists in graphs of any topology. The strength of this model is demonstrated by its capabilities: (a) the resulting routing scheme is simple, adaptive, distributed, and deadlock-free; (b) the model is independent of the network topology, regular or irregular; (c) the minimum number of virtual channels required remains constant as the network grows in size; and (d) it tolerates faults easily. Using two virtual channels per physical channel, it is shown how to construct a single trip in faulty hypercubes and multiple trips in fault-free meshes. Simulation results indicate the potential of the model to tolerate faults with very little performance degradation and to reduce multicast latency with multiple trips.
Optimal broadcasting in binary de Bruijn networks and hyper-deBruijn networks
E. Ganesan, D. Pradhan
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262803
The order-(m, n) hyper-deBruijn graph HD(m, n) is the direct product of an order-m hypercube and an order-n deBruijn graph. The hyper-deBruijn graph offers flexibility in the number of connections per node and in the level of fault tolerance. These networks also possess logarithmic diameter, have simple routing algorithms, support many computationally important subgraphs, and admit efficient implementation. The authors present an asymptotically optimal one-to-all (OTA) broadcasting scheme for these networks, assuming packet-switched routing and concurrent communication on all ports. The product structure of the hyper-deBruijn graphs is exploited to construct an optimal number of edge-disjoint spanning trees to achieve this. As an intermediate result, they also present a technique to construct an optimal number of spanning trees, with heights bounded by the diameter, in binary deBruijn graphs. This result is used to obtain the fastest OTA broadcasting scheme for binary deBruijn networks. The recent renewed interest in binary deBruijn networks makes this result valuable.
Simulating interconnection networks in RAW
W. Ligon, U. Ramachandran
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262883
The authors investigate the relationship between application program characteristics and interconnection network (ICN) performance using an execution-driven simulation testbed: the reconfigurable architecture workbench (RAW). RAW simulates a wide variety of parallel architectures, including both fine- and coarse-grain machines; SIMD, MIMD, and hybrid machines; and a wide variety of ICNs. They present RAW's network model, the structure of RAW's network simulator, a model for k-ary n-cube networks (currently popular in the literature), and the results of experiments using the simulator. Their results show that application program characteristics can have a profound effect on network performance, an observation that points out the benefits of studying interconnection networks in the context of overall application performance.
Computing convolutions on mesh-like structures
O. Schwarzkopf
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262796
Although the computation of two-dimensional convolutions is one of the basic computational tools for the processing of digitized images, and although it is well known that convolutions can be computed efficiently in the sequential setting with the aid of Fourier transforms, previous work on the parallel computation of convolutions has not been based on Fourier transforms. This is probably because the fast Fourier transform cannot be implemented efficiently on simple structures such as the mesh, the mesh with broadcasting, the mesh of trees, or the pyramid computer. It is shown that it nevertheless makes sense to use the Fourier transform on such simpler structures, obtaining nearly optimal algorithms for the computation of convolutions on the parallel structures listed above. As an application, an algorithm is given that computes the digitized configuration space of a robot restricted to translation in the plane.
Sorting n^2 numbers on n*n meshes
M. Nigam, S. Sahni
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262858
The authors show that by folding data from an n*n mesh onto an n*(n/k) submesh, sorting on the submesh, and finally unfolding back onto the entire n*n mesh, it is possible to sort on bidirectional and strict unidirectional meshes using a number of routing steps that is very close to the distance lower bound for these architectures. The technique may also be applied to reconfigurable bus architectures to obtain faster sorting algorithms.
On simulations of linear arrays, rings and 2D meshes on Fibonacci cube networks
B. Cong, S. Zheng, S. Sharma
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262788
The Fibonacci cube was proposed recently as an interconnection network. It has been shown that this network topology possesses many interesting properties that are important in network design and applications. This paper addresses the following network simulation problem: given a linear array, a ring, or a two-dimensional mesh, how can we assign its nodes to the nodes of the Fibonacci cube so as to keep adjacent nodes near each other in the Fibonacci cube? The authors first show the simple fact that every Fibonacci cube contains a Hamiltonian path. They prove that any ring can be embedded into its corresponding optimum Fibonacci cube (the smallest Fibonacci cube with at least as many nodes as the ring) with dilation 2, which is optimal in most cases. They then describe dilation-1 embeddings of a class of meshes into their corresponding optimum Fibonacci cubes. Finally, it is shown that an arbitrary mesh can be embedded into its corresponding optimum or near-optimum Fibonacci cube with dilation 2.
A parallel Prolog execution model: theoretical approach and experimental results
J. Bodeveix, Érick Bizouarn
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262849
This paper presents a parallel all-solution extension of Prolog integrating AND parallelism and a restricted form of OR parallelism, both explicitly declared by the user. Parallel sub-goals may share variables and incrementally communicate partially instantiated terms via their common variables, thus allowing stream AND parallelism. Furthermore, the communication direction does not need to be declared by the user or deduced by a static analysis. The resolution model is detailed and its completeness proven. The authors describe a transputer network implementation.