Title: Effective mapping of artificial neural network algorithms onto massively parallel hardware: the REMAP programming environment
Authors: Guang Li, B. Svensson
Pub Date: 1995-04-19  DOI: 10.1109/ICAPP.1995.472292
Abstract: The application of artificial neural networks (ANNs) in real-time embedded systems demands high-performance computers. Miniaturized massively parallel architectures are suitable computation platforms for this task. An important question is how to establish an effective mapping from ANN algorithms to hardware. In this paper, we demonstrate how an effective mapping can be achieved with our programming environment in close combination with an optimized architecture design targeted at neuro-computing.
Title: On the feasibility of a scalable opto-electronic CRCW shared memory
Authors: P. Lukowicz, W. Tichy
Pub Date: 1995-04-19  DOI: 10.1109/ICAPP.1995.472187
Abstract: We discuss the results of a feasibility study of an opto-electronic shared memory with concurrent-read, concurrent-write capability. Unlike previous such work, we consider a true hardware shared memory rather than a simulation on a tightly optically connected distributed-memory computer. We describe a design that could be implemented using compact integrated semiconductor modules and propose ways to solve two major problems faced by such a device: optical system complexity and parallel word-level write consistency. It is shown that, in principle, a memory with gigabyte capacity and a latency of less than 1 ns, accessed by up to 10^5 processors, could be feasible. Using devices currently available as laboratory prototypes, and taking energy and crosstalk considerations into account, a capacity of more than 1 MB and a latency of about 50 ns might be attained for up to 1000 processors.
Title: Asynchronous interaction in massively parallel computing
Authors: V.L. Varscavsky
Pub Date: 1995-04-19  DOI: 10.1109/ICAPP.1995.472302
Abstract: From the standpoint of hardware experts, asynchronism is connected with the concept of physical time as an independent physical variable and is determined by the variations of transient-process durations in hardware circuits, modules and blocks, which are physical objects by their nature. Software and architecture experts treat asynchronism as a partial order on events, which are logical objects; i.e., they think in terms of logical time. In these terms, asynchronism is the variation of the number of process steps without regard to the real duration of these steps in physical time. The measuring tool for time is a clock, and the attainable precision of the clock (along with the signal-delivery system) determines its area of application (the allowed value of the physical time step). The basic idea of self-timing is to detect the moments when transient processes in physical components are over and to produce the corresponding logical signals that provide the transition to logical time (delay-insensitive design), regardless of the causes of delay variation. Once all the logical signals, invariant to physical time and representing the events in the system, are formed, self-timed methodology offers a number of efficient hardware support methods to coordinate the events of the corresponding concurrent specification.
Title: A slicing-floorplan algorithm implementation for VLSI design
Authors: N. Mani, B. Srinivasan
Pub Date: 1995-04-19  DOI: 10.1109/ICAPP.1995.472278
Abstract: This paper describes a floorplan design approach that combines a heuristic graph-bipartitioning procedure with a slicing-tree representation in the physical design of VLSI systems. The description of the circuit to be floorplanned contains a set of functional modules, each having a number of possible dimensions, and a net-list containing the connectivity information. The slicing-tree representation provides efficient recursive tree-traversal operations for obtaining area-efficient floorplans. The slicing paradigm also eliminates cyclical conflicts in module placement and hence ensures better routability.
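The slicing-tree idea above — recursively cutting the chip rectangle horizontally or vertically, with modules at the leaves — can be illustrated with a minimal sketch. This is a generic illustration of slicing-tree area evaluation, not the paper's algorithm; the module dimensions and cut labels are invented:

```python
# A slicing tree: leaves are modules given as (width, height) tuples;
# internal nodes are ('V', left, right) for a vertical cut (children
# side by side) or ('H', left, right) for a horizontal cut (stacked).
def bounding_dims(node):
    if isinstance(node, tuple) and len(node) == 2:
        return node                      # leaf module: (width, height)
    cut, left, right = node
    lw, lh = bounding_dims(left)
    rw, rh = bounding_dims(right)
    if cut == 'V':                       # side by side: widths add
        return (lw + rw, max(lh, rh))
    else:                                # stacked: heights add
        return (max(lw, rw), lh + rh)

# One 2x3 module beside a stack of a 2x1 and a 1x2 module.
tree = ('V', (2, 3), ('H', (2, 1), (1, 2)))
print(bounding_dims(tree))               # (4, 3)
```

A real floorplanner would evaluate such a tree for each candidate set of module dimensions and keep the smallest bounding rectangle.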
Title: A software instrumentation technique for performance tuning of message-passing programs
Authors: S. Lei, Kang Zhang
Pub Date: 1995-04-19  DOI: 10.1109/ICAPP.1995.472245
Abstract: A major problem with collecting trace data for performance monitoring is its intrusiveness to the program being monitored. It can distort the run-time behaviour of the program so that the collected data no longer reflect the original program. We propose a new technique, called the postponing technique, to maintain the original program behaviour in order to collect accurate performance data. It preserves event orders by equalizing the instrumentation delay for each pair of communication events. This technique does not extend the execution time beyond that taken by the conventional approach and is able to estimate the original event ordering. Our technique was implemented on a Connection Machine CM-5. We find that it estimates event-ordering information more accurately than the conventional technique.
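The intuition behind equalizing instrumentation delay can be sketched as follows. The timestamps and probe costs here are invented for illustration and are not taken from the paper; the point is only that a variable per-probe cost can invert the observed order of two nearby events, while a constant (equalized) cost cannot:

```python
# Each monitored event is observed at true_time + probe_cost.
def observed_times(true_times, probe_costs):
    return [t + c for t, c in zip(true_times, probe_costs)]

true_times = [10.0, 10.4]                 # event e1 truly precedes e2

variable = observed_times(true_times, [1.0, 0.2])
print(variable)                            # [11.0, 10.6] -> order inverted

equalized = observed_times(true_times, [0.6, 0.6])
print(equalized)                           # [10.6, 11.0] -> order preserved
```

With equal probe costs, every observed timestamp is shifted by the same constant, so the relative order of events survives the instrumentation.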
Title: An L1 Voronoi diagram algorithm for a reconfigurable mesh
Authors: H. ElGindy, L. Wetherall
Pub Date: 1995-04-19  DOI: 10.1109/ICAPP.1995.472216
Abstract: In this paper we introduce an algorithm for computing the Voronoi diagram under the L1 metric for n planar points on the reconfigurable mesh model of computation. The algorithm contains a new technique for embedding a planar graph on the mesh using the reconfigurable nature of the architecture.
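As a reminder of what the L1 (Manhattan) metric means for Voronoi assignment — this shows only the standard distance definition and a brute-force nearest-site lookup, not the paper's reconfigurable-mesh algorithm:

```python
def l1(p, q):
    # Manhattan distance: sum of absolute coordinate differences
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def l1_voronoi_label(point, sites):
    # index of the site whose L1 Voronoi cell contains the point
    return min(range(len(sites)), key=lambda i: l1(point, sites[i]))

sites = [(0, 0), (4, 0), (0, 4)]
print(l1_voronoi_label((1, 1), sites))   # 0: distances are 2, 4, 4
print(l1_voronoi_label((3, 0), sites))   # 1: distances are 3, 1, 7
```

Under L1, cell boundaries are piecewise axis-parallel and diagonal segments rather than the straight bisectors of the Euclidean case, which is why L1 Voronoi diagrams get dedicated algorithms.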
Title: Associative broadcast communication in massively parallel SIMD machines: a practical approach
Authors: Ok-Hyeong Cho, R. Colomb
Pub Date: 1995-04-19  DOI: 10.1109/ICAPP.1995.472291
Abstract: In massively parallel SIMD machines, communication bottlenecks have been a major problem due to the limitations of the available topologies. In particular, these topologies are not well suited to broadcast-type communications. Some suggested approaches are not practical, even though they are asymptotically fast, because they incur a large minimum latency. In this paper, a simple and practical linear broadcast-type communication algorithm is presented; it is based on associative computing and does not use interconnection networks at all.
Title: A new design methodology for optical hypercube interconnection network
Authors: M.F. Ali, M. Guizani
Pub Date: 1995-04-19  DOI: 10.1109/ICAPP.1995.472289
Abstract: An efficient design methodology for the construction of an optical space-invariant hypercube interconnection network is presented. This network connects a two-dimensional array of input nodes to a two-dimensional array of output nodes. The basis of the design is a 2^6-node hypercube from which hypercubes of higher dimensions can be built. The requirements for the optical implementation of this scheme are also proposed. It is shown that hypercubes of dimension up to 21 can be realized using the given implementation.
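For context on the topology itself: in a binary hypercube, two nodes are adjacent exactly when their addresses differ in one bit, so a dimension-21 hypercube has 2^21 nodes, each with 21 neighbours. A small sketch of this standard addressing scheme (the addressing is textbook material, not the paper's optical design):

```python
def hypercube_neighbors(node, dim):
    # neighbours of a node in a dim-dimensional hypercube:
    # flip each of the dim address bits in turn
    return [node ^ (1 << i) for i in range(dim)]

print(sorted(hypercube_neighbors(0b000, 3)))   # [1, 2, 4] in a 2^3-node cube
print(len(hypercube_neighbors(0, 21)))         # 21 links per node at dimension 21
```

The base design's 2^6-node cube corresponds to 6-bit addresses; composing such cubes extends the address width toward the 21-bit limit reported.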
Title: A communication framework for heterogeneous distributed pattern analysis
Authors: Gernot A. Fink, N. Jungclaus, Helge Ritter, G. Sagerer
Pub Date: 1995-04-19  DOI: 10.1109/ICAPP.1995.472283
Abstract: Unlike in traditional approaches to parallel or distributed processing, where well-structured problems are normally implemented completely within a single programming environment, we are faced with the problem of integrating existing heterogeneous software systems. Furthermore, pattern analysis stresses special aspects of communication capabilities. We therefore propose a new communication framework dedicated to heterogeneous pattern analysis systems that handles typed structured data, enables completely symmetric interaction, and provides various call semantics. A first prototype evaluating some of the concepts in practical situations is presented.
Title: A global code scheduling technique using guarded PDG
Authors: A. Koseki, H. Komatsu, Y. Fukazawa
Pub Date: 1995-04-19  DOI: 10.1109/ICAPP.1995.472253
Abstract: For instruction-level parallel machines, it is essential for code scheduling to extract instructions that can be executed in parallel from a program. In this paper, we propose a new code scheduling technique using an extension of the program dependence graph (PDG). This technique parallelizes non-numerical programs, producing better machine code than that created by percolation scheduling.