The Sixth Distributed Memory Computing Conference, 1991. Proceedings: Latest Articles

Performance of Multiprocessor Structures for Fast Digital SAR Processing

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633345

R. Albrizio, G. Aloisio, A. Mazzone, P. Messina, N. Veneziani

Citations: 2

Virtual Processors Considered Harmful

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633096

Peter Christy

The design and implementation of commercial, massively parallel computers is at an interesting level of maturity where many basic design decisions are still in flux. This paper considers alternative means of programming machine-size independent computations on a parallel array computer. Two specific mechanisms are contrasted:

Citations: 10

An Evolutionary Approach to Load Balancing Parallel Computations

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633124

N. Mansour, Geoffrey Fox

We present a new approach to balancing the work load in a multicomputer when the problem is decomposed into subproblems mapped to the processors. It is based on a hybrid genetic algorithm. A number of design choices for genetic algorithms are combined in order to ameliorate the problem of premature convergence that is often encountered in the implementation of classical genetic algorithms. The algorithm is hybridized by including a hill climbing procedure which significantly improves the efficiency of the evolution. Moreover, it makes use of problem specific information to evade some computational costs and to reinforce favorable aspects of the genetic search at some appropriate points. The experimental results show that the hybrid genetic algorithm can find solutions within 3% of the optimum in a reasonable time. They also suggest that this approach is not biased towards particular problem structures.
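The hill climbing step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' procedure: the function names, the task cost list, and the greedy rule (move one task from the busiest to the lightest processor) are all assumptions made for illustration, and communication costs are ignored entirely.

```python
def loads(assignment, costs, n_procs):
    """Total cost assigned to each processor (assignment[task] = processor index)."""
    totals = [0.0] * n_procs
    for task, proc in enumerate(assignment):
        totals[proc] += costs[task]
    return totals


def hill_climb(assignment, costs, n_procs):
    """Move tasks from the busiest to the lightest processor while a move evens out the load.

    Hypothetical greedy rule: a task is moved only if the receiving processor
    ends up strictly below the current load of the busiest processor.
    """
    assignment = list(assignment)
    improved = True
    while improved:
        improved = False
        totals = loads(assignment, costs, n_procs)
        busiest = max(range(n_procs), key=totals.__getitem__)
        lightest = min(range(n_procs), key=totals.__getitem__)
        for task, proc in enumerate(assignment):
            if proc == busiest and totals[lightest] + costs[task] < totals[busiest]:
                assignment[task] = lightest
                improved = True
                break
    return assignment
```

In a hybrid scheme of the kind the abstract describes, a local step like this would be applied to individuals produced by the genetic operators; how and when the paper applies it is not stated in the abstract.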

Citations: 6

A Parallel Multi-Phase Implementation of Simulated Annealing for the Traveling Salesman Problem

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633303

D.R. Mallampati, P. Mutalik, R. L. Wainwright

This paper describes and analyses a new parallel algorithm using simulated annealing for finding a good solution to the Traveling Salesman Problem. This algorithm combines the strong points of three recent implementations [1, 25] with some new features. An initial tour is generated and partitioned among a ring of processors. Each processor receives two disconnected parts (tiers) of the tour. The algorithm is subdivided into three phases. In phase one, 2-opting is performed separately within each of the two tiers of the tour. During the second phase, remote swapping is performed between cities from the two different tiers of the tour. During phase three, synchronization of the cities is accomplished by each processor shifting a quarter of its cities in a clockwise direction to its neighboring node. This is called a quarter-spin. Results show this algorithm is superior to recent implementations. For the datasets tested, this algorithm yielded improvements ranging from 32% to 56% compared to three recent implementations. The significance of this algorithm is the manner in which cities from different parts of the tour are combined to form new tours. The multiple phases within the algorithm allow for a better mixture of cities compared to previous algorithms.
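Two of the operations named in the abstract, the 2-opt move and the quarter-spin, can be sketched as follows. This is a simplified illustration rather than the paper's implementation: each processor is modelled as a single Python list instead of two tiers, the choice of which quarter to shift is an assumption, and the annealing acceptance test is omitted.

```python
import math


def tour_length(tour, coords):
    """Length of the closed tour that visits coords in the given order."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def two_opt_move(tour, i, j):
    """Return a new tour with the segment tour[i..j] reversed (one 2-opt move)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def quarter_spin(per_processor):
    """Shift the last quarter of each processor's cities to the next processor on the ring.

    Assumption: 'a quarter of its cities' is taken to be the trailing quarter
    of each list, and the ring runs from index p to index p + 1.
    """
    n = len(per_processor)
    cuts = [len(cities) - len(cities) // 4 for cities in per_processor]
    kept = [cities[:cut] for cities, cut in zip(per_processor, cuts)]
    moved = [cities[cut:] for cities, cut in zip(per_processor, cuts)]
    # Processor p receives the quarter sent by its predecessor p - 1.
    return [moved[(p - 1) % n] + kept[p] for p in range(n)]
```

For four cities at the corners of a unit square visited in a crossing order, reversing the middle segment with `two_opt_move` removes the crossing and shortens the tour, which is the kind of improvement phase one searches for within a tier.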

Citations: 7

High-Performance Adaptive Routing in Multicomputers Using Dynamic Virtual Circuits

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633176

Y. Tamir, Yoshio Turner

A message transport mechanism which provides high-bandwidth, low-latency interprocessor communication is the key to the ability of multicomputers to achieve high performance. The system should adapt to changing conditions by routing packets around congested areas and failed links or nodes. We introduce a new message transport mechanism, called Dynamic Virtual Circuits, that combines the best features of circuit switching, packet switching, and static virtual circuits. Routing through intermediate nodes usually requires only a single lookup in a small table, packets include minimal control information, and are delivered in FIFO order. Nodes in the middle of a Dynamic Virtual Circuit can break it and later reestablish it through a different physical path, thus supporting adaptive routing while maintaining the semantics of virtual circuits. We present the basic algorithms for Dynamic Virtual Circuits and the required hardware support in the context of a VLSI communication coprocessor for multicomputers.

Citations: 10

Message Routing on Irregular 2D-Meshes and Tori

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633115

T. Stricker

Wormhole message routing is supported by the communication hardware of several distributed memory machines. This particular method of message routing has numerous advantages but creates the problem of a routing deadlock. When long messages compete for the same channels in the network, some messages will be blocked until the first message is fully consumed by the processor at the destination of the message. A deadlock occurs if a set of messages mutually blocks, and no message can progress towards its destination. Most deadlock-free routing schemes previously known are designed to work on regular binary hypercubes. Regular hypercubes and meshes are just a special case of networks. However, these routing schemes do not provide enough flexibility to deal with irregular 2-D tori and with attached auxiliary cells, which can be found on many newer parallel systems. To handle irregular topologies elegantly, a simple proof is necessary to verify the router code. The new proof given in this report is carried out directly on the network graph. It is constructive in the sense that it reveals the design options to deal with irregularities and shows how additional flexibility can be used to achieve better load balancing. Based on the modified routing model, a set of deadlock-free router functions relevant to the iWarp system configurations are described and proven to be correct.
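The deadlock condition stated in the abstract (a set of messages that mutually block) corresponds to a cycle in a wait-for graph. The sketch below checks for such a cycle with a depth-first search. It is only an illustration of the definition, not the proof technique of the paper; the dictionary representation and the function name are assumptions, and each message is assumed to wait on the specific messages listed for it.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle.

    waits_for maps each message to the messages currently holding a channel
    it needs (hypothetical representation chosen for this sketch).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {}

    def visit(message):
        color[message] = GREY
        for blocker in waits_for.get(message, ()):
            state = color.get(blocker, WHITE)
            if state == GREY:
                return True  # reached a message already on the current path
            if state == WHITE and visit(blocker):
                return True
        color[message] = BLACK
        return False

    return any(color.get(m, WHITE) == WHITE and visit(m) for m in waits_for)
```

Three messages that each wait for the next one around a ring form a cycle and are reported as deadlocked; a chain in which the last message is free to advance is not.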

Citations: 6

Parallel Solutions to the Phase Problem in X-Ray Crystallography: An Update

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633318

G. DeTitta, H. Hauptman, R. Miller, M. Pagels, T. Sabin, P. Thuman, C. Weeks

Citations: 3

The Touchstone 30 Gigaflop DELTA Prototype

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633353

S. Lillevik

In September, 1990, the Intel Corporation demonstrated the third of four major Touchstone Program prototype systems. Denoted DELTA, the prototype scales to over 500 nodes, provides aggregate peak performance in excess of 30 GFLOPS, and contains a new interconnect network based on a Caltech-designed router device. DELTA contains four heterogeneous node types for numeric, service, input/output, and network functions. The operating system supports message-passing paradigms and interfaces with a Concurrent File System. Users access DELTA across a local area network and may select either the C or FORTRAN programming languages. An interactive parallel debugger assists in application development and performance tuning.

Citations: 87

Resource Management in a Large Reconfigurable Transputer-based System

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633072

Halsur V. Sreekantaswamy, N. Goldstein, A. Wagner, S. Chanson

This paper describes two of the major components of TIPS, a Transputer-based Interactive Parallelizing System, under development at UBC. The system runs on a 74 node transputer system interconnected by crossbar switches with multiple links to the host, a SUN-4. It uses Trollius with the Logical Systems C compiler. The first component described is TMAP, a topology independent mapping facility. TMAP's objective is to automate the mapping process, and make it independent from changes in the underlying architecture. It integrates two large pieces of software, Trollius and Prep-P. We describe its design and discuss specific problems in trying to achieve a machine independent environment. The second component described is TRES, a higher level resource management facility. TRES is based on parameterized models of computation which are used to predict performance and optimize the use of machine resources. The user need only specify the model (i.e. programming paradigm) and the computational task to be performed. TRES determines the optimal topology and number of processors to use. This information is used by the TMAP system.

Citations: 0

On Embedding Permutations in Hypercubes

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633347

Arun Kumar Somani, Sangbang Choi

The interconnection network of a multiprocessor system should be able to embed an arbitrary permutation of nodes to map an arbitrary structure of a program graph and realize required communication paths. We show that distributed routing algorithms have high blocking probability to route permutations in binary-cube-based systems. We further show that there exists no recursive algorithm to embed a permutation in a binary n-cube for n ≥ 5. We then develop rearrangeable hypercube architectures and routing algorithms to realize arbitrary permutations in circuit switching. We show that if each connection between two neighboring nodes consists of 2 pairs of links, i.e., (2 full-duplex communication lines), the hypercube can embed 2 arbitrary permutations of nodes simultaneously. We also prove that a hypercube is rearrangeable if one additional pair of links is provided in any one dimension of connections.
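Why a fixed distributed routing rule can block in circuit switching can be seen with dimension-order (e-cube) routing, a standard hypercube routing rule used here purely as an example; the abstract does not say which distributed algorithms the paper analyses. The function names and the dictionary of source-to-destination pairs are assumptions of this sketch.

```python
from collections import Counter


def ecube_path(src, dst, n):
    """Nodes visited when differing address bits are corrected from dimension 0 up to n - 1."""
    path, node = [src], src
    for d in range(n):
        if (node ^ dst) >> d & 1:  # bit d still differs from the destination
            node ^= 1 << d         # cross the link in dimension d
            path.append(node)
    return path


def contended_links(mapping, n):
    """Directed links claimed by more than one circuit under e-cube routing."""
    use = Counter()
    for src, dst in mapping.items():
        path = ecube_path(src, dst, n)
        use.update(zip(path, path[1:]))
    return {link for link, count in use.items() if count > 1}
```

In a 3-cube, the circuits 0 to 2 and 1 to 6 both need the directed link from node 0 to node 2, so with a single link per direction one of them is blocked. Extra links between neighbors, as the abstract proposes, are one way to remove such conflicts.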

Citations: 6
