
[1993] Proceedings Seventh International Parallel Processing Symposium: Latest Articles

On the power of segmenting and fusing buses
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262859
R. K. Thiruchelvan, J. Trahan, R. Vaidyanathan
The authors investigate communication among synchronous parallel processors through a new model of parallel computation called the reconfigurable multiple bus machine (RMBM). Four versions of the RMBM are introduced. In these models, processors communicate over buses and possess varying abilities to segment and/or fuse buses. A hierarchy of the versions of the RMBM and the PRAM based on their relative 'powers' is established, indicating the relative contribution of segmenting and fusing buses.
Citations: 62
A performance comparison of several superscalar processor models with a VLIW processor
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262853
John Lenell, N. Bagherzadeh
This paper quantitatively compares various superscalar processor architectures with a very long instruction word architecture developed at the University of California, Irvine. The motivation for this comparison is to study the capability of a dynamically scheduled processor to obtain the same performance achieved by a statically scheduled processor, and to examine the hardware resources required by each.
Citations: 8
Parallel A* algorithms and their performance on hypercube multiprocessors
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262779
S. Dutt, N. Mahapatra
The authors develop parallel A* algorithms suitable for distributed-memory machines. In parallel A* algorithms, inefficiencies grow with the number of processors P used, causing performance to drop significantly at lower and intermediate work densities (the ratio of the problem size to P). To alleviate this effect, they propose a novel parallel startup phase and efficient dynamic work distribution strategies, and thus improve the scalability of parallel A* search. They also tackle the problem of duplicate searching by different processors by using work transfer as a means of partial duplicate pruning. The proposed parallel startup scheme requires only Θ(log P) time, compared to Θ(P) time for the sequential startup methods used in the past. Using the traveling salesman problem (TSP) as the test case, the work distribution strategies yield speedup improvements of more than 30% and 15% at lower and intermediate work densities, respectively, while requiring 20% to 45% less memory than previous approaches. Moreover, the simple duplicate pruning scheme provides an average reduction of 20% in execution time for up to 64 processors, compared to previous approaches that do not prune any duplicates.
Citations: 24
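
The Θ(log P) versus Θ(P) startup claim in the parallel A* abstract above can be made concrete with a small counting sketch. The abstract does not describe the authors' startup scheme, so this example assumes a generic recursive-doubling hand-off (every processor that already holds work passes part of it to one idle processor per round) and compares it with a single source handing out work one processor at a time. The function names are hypothetical and chosen only for illustration.

```python
def sequential_startup_steps(p: int) -> int:
    """Steps needed when one source processor hands work to the
    other p - 1 processors one at a time (assumed baseline)."""
    return max(p - 1, 0)


def doubling_startup_rounds(p: int) -> int:
    """Rounds needed when every processor that already holds work
    gives part of it to one idle processor per round
    (assumed recursive-doubling scheme, not the paper's method)."""
    active, rounds = 1, 0
    while active < p:
        active = min(2 * active, p)  # the number of busy processors doubles
        rounds += 1
    return rounds


def work_density(problem_size: float, p: int) -> float:
    """Work density as defined in the abstract: problem size divided by P."""
    return problem_size / p


if __name__ == "__main__":
    for p in (2, 8, 64):
        print(p, sequential_startup_steps(p), doubling_startup_rounds(p))
```

Under these assumptions the doubling count grows as the ceiling of log2 P while the sequential count grows linearly in P, which is the gap the abstract refers to.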
Gossiping on interval graphs (computer networks)
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262902
Suresh Singh, M. Sridhar
The authors present an algorithm that produces a schedule of transmissions to achieve gossiping in a system where nodes are equipped with radio transceivers; different nodes have transceivers with different ranges. They assume that all the nodes are placed on a line and that all communication occurs in one dimension. Finally, unlike traditional system models for gossiping, they assume that all nodes within range can hear a transmission and that simultaneous transmissions may cause information loss via collisions. The gossiping algorithm developed decomposes a system configuration into a spine tree and uses this structure to recursively produce a transmission schedule.
Citations: 0
New degree four networks: properties and performance
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262875
Gebre A. Gessesse, S. Chalasani
Two-dimensional tori, or their variants such as the midimew networks, are the most popular degree-four interconnection networks. However, the number of nodes interconnected by two-dimensional tori or the midimew networks grows as the square of their diameters. The authors discuss two different types of degree-four interconnection networks, the starcake networks and the k-ary 2-cliques. These graphs are regular, vertex-symmetric, maximally fault-tolerant and have a better diameter than the popular degree-four networks. They discuss the construction and routing of these networks and compare them with other interconnection networks. A preliminary performance comparison indicates that the proposed networks offer better throughput-delay characteristics than tori and midimew networks.
Citations: 2
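
The statement in the degree-four networks abstract above, that the node count of a two-dimensional torus grows as the square of its diameter, can be checked numerically. The sketch below is only an illustration for the plain k × k torus (it does not model midimew, starcake, or k-ary 2-clique networks); the function name `torus_diameter` is hypothetical. Because the torus is vertex-symmetric, a breadth-first search from a single node is enough to find the diameter.

```python
from collections import deque


def torus_diameter(k: int) -> int:
    """Diameter of a k x k torus (wrap-around mesh), found by BFS from node (0, 0).

    The torus is vertex-symmetric, so the largest BFS distance from any
    single node equals the diameter.
    """
    start = (0, 0)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        # Each node has four neighbours: one step along either axis, with wrap-around.
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = ((x + dx) % k, (y + dy) % k)
            if nxt not in dist:
                dist[nxt] = dist[(x, y)] + 1
                queue.append(nxt)
    return max(dist.values())


if __name__ == "__main__":
    for k in (4, 8, 16):
        print(f"k={k}: nodes={k * k}, diameter={torus_diameter(k)}")
```

For even k the diameter equals k, so the k² nodes are exactly the square of the diameter; networks that reach more nodes for the same degree and diameter are what the abstract calls a "better diameter".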
Parallel execution of real-time rule-based systems
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262782
A. Cheng
When rule-based expert systems are used to monitor and control real-time systems, the ability of these expert systems to meet stringent response time constraints is as important as their ability to produce correct results in reaction to input. This paper explores parallel execution as an approach to achieve higher execution speed in rule-based systems in domains requiring high performance and real-time response. In particular, it shows how rule-firing parallelism can be automatically extracted from a real-time rule-based system via static analysis of the system source code. To demonstrate the practicality of this approach, the proposed technique is applied to reduce the execution time of two NASA expert systems.
Citations: 23
Managing the bottlenecks of a parallel Gauss-Seidel algorithm for power flow analysis
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262781
Garng M. Huang, W. Ongsakul
The parallelization and implementations of Gauss-Seidel (G-S) algorithms for power flow analysis have been investigated on a Sequent Balance shared memory (SM) machine. In this paper, the authors generalize the idea to more general computer architectures and demonstrate how to effectively increase the speedup upper bounds of G-S algorithms by properly managing the bottlenecks.
Citations: 2
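
For readers unfamiliar with the Gauss-Seidel method named in the abstract above, a minimal sequential sketch may help. It solves a generic dense linear system Ax = b, not the nonlinear power flow equations treated in the paper, and the function name, tolerance, and iteration limit are assumptions made for illustration only.

```python
def gauss_seidel(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b by Gauss-Seidel sweeps (generic linear sketch).

    `a` is a square matrix given as a list of rows and `b` a list of
    right-hand sides. Convergence is guaranteed, for example, when
    `a` is strictly diagonally dominant.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            # Uses x[j] values already updated in this sweep for j < i.
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            new_value = (b[i] - s) / a[i][i]
            max_change = max(max_change, abs(new_value - x[i]))
            x[i] = new_value
        if max_change < tol:
            break
    return x


if __name__ == "__main__":
    print(gauss_seidel([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0]))
```

The inner loop shows why the method is awkward to parallelize: each update reads values written earlier in the same sweep, so the updates within a sweep are not independent of one another.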
Least common ancestor networks
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262825
I. Scherson, Chi-Kai Chien
Least common ancestor networks (LCANs) are introduced and shown to be a class of networks that includes fat-trees, baseline networks, SW-banyans and the router networks of the TRAC 1.1 and 2.0, and the CM-5. Some LCAN properties are stated and the permutation routing capabilities of an important subclass are analyzed. Simulation results for three permutation classes verify the accuracy of an iterative solution for a randomized routing strategy.
Citations: 33
Class and user based parallelism in Raven
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262791
D. Acton, G. Neufeld
This paper presents the concurrency features found in Raven, an object-oriented parallel programming system. Raven supports coarse-grained parallelism via class based and user based parallelism. Class based parallelism is provided by the implementor of the class, while user based parallelism is provided by the user, or client, of objects. Raven also supports object properties which are determined at object creation time, thereby eliminating the need for separate class hierarchies that support concurrency. Raven is operational on a variety of machine architectures, including a shared memory multiprocessor. Initial experience indicates that sequential code can easily be transformed into parallel code and that a substantial speedup is possible.
Citations: 0
Hierarchical interconnection cache networks
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262870
Sizheng Wei, E. Schenfeld
The hierarchical interconnection cache network (HICN) is a novel network architecture for massively parallel processing systems. The HICN's topology is a hierarchy of multiple, three-stage interconnection cache networks. The first and third stages of each network use small, fast crossbar switches. Large, slow switching (reconfigurable) crossbars are used in the middle stages. HICN exploits a special kind of communication locality, called switching locality, offering greater flexibility and lower latency compared with the classical hierarchical networks. HICN uses small switches for communication routing and large switches for setting up the network (reconfiguration) to match the expected communication pattern as closely as possible. The trade-off between the routing speed and the switch size is one major factor in achieving high speed communication in massively parallel interconnection networks. The authors present efficient embeddings of several classical network topologies, such as hypercubes, complete binary trees, and grids, into HICNs. They also show that HICNs are flexibly partitionable.
Citations: 4