"Experimental Results about MPI Collective Communication Operations"
M. Bernaschi, G. Iannello, S. Crea
Parallel Processing Letters, pp. 774-783. Pub Date: 1999-04-12. DOI: 10.1142/S0129626405002179

Collective communication performance is critical in a number of MPI applications, yet relatively few results are available to assess the performance of mainstream MPI implementations. In this paper we focus on two widely used primitives, broadcast and reduce, and present experimental results for the Cray T3E and the IBM SP2. We compare the performance of the existing MPI primitives with our implementation, which is based on a new algorithm. Our tests show that existing all-software implementations can be improved, and they highlight the advantages of the Cray hardware-assisted implementation.
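A hedged sketch of why software broadcast cost grows logarithmically: the binomial-tree schedule below delivers a message from the root to p ranks in ceil(log2 p) communication steps. This is a standard all-software strategy, shown purely for illustration; the abstract does not describe the paper's new algorithm, and all names here are ours.

```python
# Illustrative simulation of a binomial-tree broadcast over p ranks.
# In step k, every rank that already holds the data forwards it to the
# partner whose rank differs in bit k; the set of holders doubles per step.
import math

def binomial_broadcast(p, root=0):
    """Return (number of steps, send schedule as [(step, src, dst), ...])."""
    has_data = {root}
    schedule = []
    step = 0
    while len(has_data) < p:
        senders = list(has_data)          # snapshot: new holders wait a step
        for src in senders:
            dst = src ^ (1 << step)       # partner differs in bit `step`
            if dst < p and dst not in has_data:
                has_data.add(dst)
                schedule.append((step, src, dst))
        step += 1
    return step, schedule

steps, sched = binomial_broadcast(8)
print(steps)   # 3 steps for 8 ranks, i.e. ceil(log2 8)
```

The same schedule works for non-power-of-two rank counts: partners outside the communicator are simply skipped, and the step count stays at ceil(log2 p).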
"A Note on Communication-Efficient Deterministic Parallel Algorithms for Planar Point Location and 2D Voronoï Diagram"
Mohamadou A. Diallo, Afonso Ferreira, A. Rau-Chaplin
Parallel Processing Letters, pp. 399-409. Pub Date: 1998-02-25. DOI: 10.1142/S0129626401000622

In this note we describe deterministic parallel algorithms for planar point location and for building the Voronoi diagram of n co-planar points. These algorithms are designed for BSP/CGM-like models of computation, in which p processors, each with local memory, communicate through an arbitrary interconnection network. They are communication-efficient, requiring O(1) and O(log p) communication steps, respectively, with local computation at each step. Both algorithms require local memory.
"Wormhole Deadlock Prediction"
M. D. Ianni
Parallel Processing Letters, pp. 188-195. Pub Date: 1997-08-26. DOI: 10.1142/S0129626400000287

Deadlock prevention is usually realized by imposing strong restrictions on packet transmissions in the network, so the resulting deadlock-free routing algorithms are not optimal with respect to resource utilization. This optimality requirement can be met by forbidding a transmission only when it would bring the network into a configuration that must evolve into a deadlock. Hence, optimal deadlock avoidance is closely related to deadlock prediction. In this paper it is shown that wormhole deadlock prediction is a hard problem; the result is proved for both static and dynamic routing.
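For contrast with the prediction problem the paper proves hard, plain deadlock *detection* in an already-blocked wormhole network is easy: a deadlock is a cycle of channels each waiting for the next. The sketch below (channel names and example graphs are invented for illustration) finds such a cycle by depth-first search.

```python
# Detect a circular wait among channels: `wait_for` maps each channel to
# the set of channels whose buffers it is blocked on. A cycle in this
# wait-for graph corresponds to a set of worms that can never advance.

def has_cycle(wait_for):
    """Return True iff the channel wait-for graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2       # unvisited / on stack / done
    color = {c: WHITE for c in wait_for}

    def dfs(c):
        color[c] = GRAY
        for nxt in wait_for.get(c, ()):
            if color.get(nxt, WHITE) == GRAY:
                return True            # back edge: cycle of blocked channels
            if color.get(nxt, WHITE) == WHITE and dfs(nxt):
                return True
        color[c] = BLACK
        return False

    return any(color[c] == WHITE and dfs(c) for c in wait_for)

# Two worms each holding a channel the other needs -> circular wait.
print(has_cycle({"c0": {"c1"}, "c1": {"c0"}}))   # True
print(has_cycle({"c0": {"c1"}, "c1": set()}))    # False
```

Prediction is harder precisely because it must decide, before granting a transmission, whether the resulting configuration *necessarily* evolves into such a cycle.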
"Array Dataflow Analysis for Explicitly Parallel Programs"
J. Collard, M. Griebl
Parallel Processing Letters, pp. 406-413. Pub Date: 1996-08-26. DOI: 10.1142/S0129626497000140

This paper describes a dataflow analysis of array data structures for data-parallel and/or control- (or task-) parallel imperative languages. The analysis departs from previous work in that it (1) handles both parallel programming paradigms simultaneously, and (2) does not rely on the usual iterative solution of a set of dataflow equations but instead extends array dataflow analysis based on integer linear programming, improving the precision of the results.
"Parallel Creation of Linear Octrees from Quadtree Slices"
L. K. Swift, T. Johnson, P. Livadas
Parallel Processing Letters, pp. 519-522. Pub Date: 1994-12-01. DOI: 10.1007/978-4-431-68456-5_42
"On Self-Stabilizing Wait-Free Clock Synchronization"
M. Papatriantafilou, P. Tsigas
Parallel Processing Letters, pp. 267-277. Pub Date: 1994-07-06. DOI: 10.1142/S0129626497000334

Clock synchronization algorithms that can tolerate any number of processors failing by ceasing operation for an unbounded number of steps and then resuming (with or without knowing that they were faulty) are called wait-free. If such algorithms also work correctly when the starting state of the system is arbitrary, they are called wait-free, self-stabilizing. This work deals with wait-free, self-stabilizing clock synchronization of n processors in an "in-phase" multiprocessor system and presents a solution with quadratic synchronization time; the best previous solution has cubic synchronization time. The algorithm is based on a simple analysis of the difficulties of the problem, which showed how to "re-parametrize" the previously mentioned cubic algorithm to obtain the quadratic-time solution. Both the protocol and its analysis are intuitive and easy to understand.
"R2M: A Reconfigurable Rewrite Machine"
R. Ramesh
Parallel Processing Letters, pp. 171-180. Pub Date: 1994-06-01. DOI: 10.1142/S0129626494000181

Term rewriting is a popular computational paradigm for symbolic computations such as formula manipulation, theorem proving, and the implementation of nonprocedural programming languages. In rewriting, the most demanding operation is the repeated simplification of terms by pattern matching them against rewrite rules. We describe a parallel architecture, R2M, for accelerating this operation. R2M can operate either as a stand-alone processor using its own memory or as a backend device attached to a host using the host's main memory. R2M uses only a fixed number of processing units (independent of input size) and fixed-capacity auxiliary memory units, yet it can handle variable-size rewrite rules that change during simplification. This is made possible by a simple, reconfigurable interconnection within R2M. Finally, R2M uses a hybrid scheme that combines the ease and efficiency of parallel pattern matching on the tree representation of terms with the naturalness of the dag representation for replacements.
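The core operation R2M accelerates in hardware can be stated compactly in software. Below is a minimal sequential term rewriter; the tuple term representation, the `?`-prefixed variable convention, and the Peano-addition rules are our illustrative choices, not R2M's.

```python
# Terms are atoms (strings) or tuples ("f", arg1, ...). Strings starting
# with "?" are pattern variables. `rewrite` repeatedly matches subterms
# against rules and replaces them until no rule applies.

def match(pattern, term, env):
    """Try to bind pattern variables so that pattern equals term."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in env:
            return env[pattern] == term
        env[pattern] = term
        return True
    if isinstance(pattern, tuple) and isinstance(term, tuple) \
            and len(pattern) == len(term):
        return all(match(p, t, env) for p, t in zip(pattern, term))
    return pattern == term

def substitute(term, env):
    """Replace bound variables in term with their values."""
    if isinstance(term, str) and term in env:
        return env[term]
    if isinstance(term, tuple):
        return tuple(substitute(t, env) for t in term)
    return term

def rewrite(term, rules):
    """Simplify to normal form (assumes the rule set terminates)."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            env = {}
            if match(lhs, term, env):
                term, changed = substitute(rhs, env), True
        if not changed and isinstance(term, tuple):
            sub = tuple(rewrite(t, rules) for t in term)
            if sub != term:
                term, changed = sub, True
    return term

# Peano addition: add(0, y) -> y ; add(s(x), y) -> s(add(x, y))
rules = [(("add", "0", "?y"), "?y"),
         (("add", ("s", "?x"), "?y"), ("s", ("add", "?x", "?y")))]
print(rewrite(("add", ("s", "0"), ("s", "0")), rules))  # ('s', ('s', '0'))
```

The point of the hardware design is that the matching step inside this loop, which dominates the cost, is done in parallel across the term.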
"A Theory to Increase the Effective Redundancy in Wormhole Networks"
J. Duato
Parallel Processing Letters, pp. 277-288. Pub Date: 1993-09-13. DOI: 10.1142/S0129626494000144

Fault-tolerant systems aim at providing continuous operation in the presence of faults. Multicomputers rely on an interconnection network between processors to support the message-passing mechanism, so the reliability of the interconnection network is very important to the reliability of the whole system. This paper analyses the effective redundancy available in a wormhole network by combining connectivity and deadlock freedom. Redundancy is defined at the channel level: a sufficient condition is given for a channel to be redundant, and the set of redundant channels is computed. The redundancy level of the network is also defined, together with a theorem that supplies a lower bound for it. Finally, a fault-tolerant routing algorithm based on this theory is proposed.
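One ingredient of the channel-level condition can be illustrated in a few lines: a channel can only be redundant if every source-destination pair remains reachable without it. The toy check below deliberately ignores the other half of the paper's condition (that the surviving routing function must remain deadlock free); the ring topology and channel encoding are our example.

```python
# A channel is a directed edge (src, dst). `connected_without` tests
# whether every ordered node pair is still reachable after one channel
# fails -- a necessary (not sufficient) condition for redundancy.
from itertools import permutations

def connected_without(nodes, channels, removed):
    """True iff all ordered node pairs stay reachable without `removed`."""
    live = [c for c in channels if c != removed]
    adj = {n: [d for (s, d) in live if s == n] for n in nodes}

    def reachable(src, dst):
        seen, stack = {src}, [src]
        while stack:
            n = stack.pop()
            if n == dst:
                return True
            for m in adj[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        return False

    return all(reachable(s, d) for s, d in permutations(nodes, 2))

nodes = [0, 1, 2, 3]
# Bidirectional 4-node ring: a unidirectional channel in each direction.
ring = [(i, (i + 1) % 4) for i in range(4)] + \
       [((i + 1) % 4, i) for i in range(4)]
print(connected_without(nodes, ring, (0, 1)))   # True: a detour survives
```

In a unidirectional ring the same test fails for every channel, matching the intuition that such a network has no effective redundancy at all.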
"Bit-Level Systolic Arrays for Digital Contour Smoothing by Abel-Poisson Kernel"
J. Glasa
Parallel Processing Letters, pp. 105-120. Pub Date: 1993-03-01. DOI: 10.1142/S0129626493000071

We propose two different bit-level systolic arrays for digital contour smoothing by the Abel-Poisson kernel that minimize the execution time and the number of functional elements required. The arrays are fully pipelined at the bit level, achieving very high clock frequencies; they are implementable in VLSI and are suited to real-time applications.
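The arithmetic the systolic arrays pipeline at bit level can be sketched in ordinary floating point: Abel-Poisson smoothing damps the k-th Fourier coefficient of a closed contour by r^|k| for some 0 < r < 1 (r = 1 leaves the contour unchanged). The naive O(n^2) DFT below is our dependency-free illustration, not the paper's fixed-point hardware design; parameter names are ours.

```python
# Smooth a closed digital contour by the Abel-Poisson kernel: take the
# DFT of the contour points (as complex numbers), scale coefficient k by
# r**|k| using signed frequencies, and transform back.
import cmath

def abel_poisson_smooth(points, r):
    """points: list of complex contour points (closed curve); 0 < r <= 1."""
    n = len(points)
    coeffs = [sum(points[j] * cmath.exp(-2j * cmath.pi * k * j / n)
                  for j in range(n)) / n
              for k in range(n)]
    smoothed = []
    for j in range(n):
        acc = 0
        for k in range(n):
            freq = k if k <= n // 2 else k - n        # signed frequency
            acc += coeffs[k] * (r ** abs(freq)) \
                   * cmath.exp(2j * cmath.pi * k * j / n)
        smoothed.append(acc)
    return smoothed

# A square contour becomes rounder as r decreases; its centroid
# (coefficient k = 0) is preserved exactly.
square = [complex(1, 1), complex(-1, 1), complex(-1, -1), complex(1, -1)]
rounded = abel_poisson_smooth(square, 0.5)
```

A real-time implementation replaces this DFT by fixed-point convolution with precomputed kernel weights, which is the computation the bit-level arrays lay out in silicon.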
"Scheduling a Scattering-Gathering Sequence on Hypercubes"
Henri-Pierre Charles, Pierre Fraigniaud
Parallel Processing Letters, pp. 29-42. Pub Date: 1993-03-01. DOI: 10.1142/S012962649300006X

The scattering problem is related to the gossiping and broadcasting problems [1, 2]. It consists of distributing a set of data from a single source such that each component is sent to a distinct address; the gathering operation is the reverse of scattering. This paper studies the problem of pipelining a scattering-gathering sequence in order to overlap the two operations. We first give a general solution for distributed-memory parallel computers and then study the problem on hypercubes in particular.
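The baseline that a pipelined scattering-gathering schedule improves on is the standard recursive-halving scatter: on a d-dimensional hypercube, at each of d steps every holder forwards the half of its buffered blocks destined for the opposite subcube. The simulation below is our own sketch of that classic schedule, not the paper's pipelined variant.

```python
# Simulate recursive-halving scatter on a hypercube of p = 2**d ranks.
# `holding[r]` is the set of destination blocks currently buffered at
# rank r; after d steps every rank holds exactly its own block.

def hypercube_scatter(d, root=0):
    p = 1 << d
    holding = {root: set(range(p))}
    for step in reversed(range(d)):            # highest dimension first
        for src in list(holding):              # snapshot: one send per step
            partner = src ^ (1 << step)
            # blocks whose destination lies in the partner's subcube
            move = {b for b in holding[src]
                    if (b >> step) & 1 != (src >> step) & 1}
            if move:
                holding[src] -= move
                holding.setdefault(partner, set()).update(move)
    return holding

final = hypercube_scatter(3)
print(final)   # each of the 8 ranks ends up holding exactly its own block
```

Gathering runs the same schedule in reverse; the pipelining question the paper studies is how to interleave the two so the links freed by the scatter are reused by the gather.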