
[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation: Latest Papers

Representing the scaling behavior of parallel algorithm-machine combinations
D. Rover, Xian-He Sun
The scaling of algorithms and machines is essential to achieve the goals of high-performance computing. Thus, scalability has become an important aspect of parallel algorithm and machine design. It is a desirable property that has been used to describe the demand for proportionate changes in performance with adjustments in system size. It should provide guidance toward an optimal choice of an architecture, algorithm, machine size, and problem size combination. However, as a performance metric, it is not yet well defined or understood. The paper summarizes several scalability metrics, including one that highlights the behavior of algorithm-machine combinations as sizes are varied under an isospeed condition. A scaling relation is presented to facilitate general mathematical and visual techniques for characterizing and comparing the scalability information of these metrics.
DOI: https://doi.org/10.1109/FMPC.1992.234919 (published 1992-10-19)
Citations: 1
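The isospeed condition mentioned in the abstract above can be illustrated with a short sketch. The abstract does not state a formula, so the code assumes the form in which isospeed scalability is commonly given: average speed is work divided by (processors × time), and the scalability from p to p' processors is ψ(p, p') = (p'·W) / (p·W'), where W' is the work needed on p' processors to keep the average speed unchanged. The function names and sample numbers are hypothetical.

```python
def average_speed(work, processors, time):
    """Achieved speed per processor: work / (processors * time)."""
    return work / (processors * time)


def isospeed_scalability(p, work, p_scaled, work_scaled):
    """psi(p, p') = (p' * W) / (p * W').

    Assumed form of the isospeed metric: `work_scaled` is the work that
    must be run on `p_scaled` processors to keep the average speed equal
    to the one measured with `work` on `p` processors.
    """
    return (p_scaled * work) / (p * work_scaled)


# Ideal case: doubling the machine only requires doubling the work.
ideal = isospeed_scalability(4, 100.0, 8, 200.0)

# Less scalable case: the work must quadruple to hold the speed.
poor = isospeed_scalability(4, 100.0, 8, 400.0)
```

A value of 1 corresponds to ideal scaling under this definition; smaller values mean the problem size must grow faster than the machine to hold the average speed constant.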
Network design and performance for a massively parallel SIMD system
S. Darbha, E. Davis
It is shown that a nearest-neighbor communication network can be complemented with a log-diameter multistage network to handle different communication patterns. This is especially useful when the pattern of data movement is not uniform. The designed network is evaluated for two cases: a dense case with many processing elements communicating and a sparse case. For 32-bit data, the algorithm for computing partial sums of an array improves by a factor of 2.7 with the multistage interconnection network. In a sparse random case, communicating 32 bits takes 4000 cycles (with 10% of the nodes communicating). Thus, it is concluded that a network like a multistage omega network is very useful for SIMD (single-instruction multiple-data) massively parallel machines. This is especially true if the machine is to be used for applications where long-distance and nonuniform routing patterns are needed.
DOI: https://doi.org/10.1109/FMPC.1992.234889 (published 1992-10-19)
Citations: 0
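To make the "log-diameter" property of the omega network in the abstract above concrete, here is a minimal sketch of textbook destination-tag routing through an omega network with N = 2^n lines. It is not the network evaluated in the paper, whose details the abstract does not give; the function name and path representation are assumptions made for illustration.

```python
def omega_route(src, dst, n_bits):
    """Return the line positions a message occupies after each stage.

    Each of the n_bits stages applies a perfect shuffle (rotate the
    address left by one bit) followed by a 2x2 switch whose output is
    selected by one destination bit, most significant bit first.
    """
    mask = (1 << n_bits) - 1
    pos = src
    path = [pos]
    for stage in range(n_bits):
        # Perfect shuffle: rotate the n-bit position left by one.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        # Switch setting: replace the low bit with the destination bit.
        bit = (dst >> (n_bits - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        path.append(pos)
    return path


# 8-line network: any source reaches any destination in 3 stages.
path = omega_route(5, 2, 3)
```

Every message crosses exactly log2(N) stages regardless of how far apart source and destination are, which is why such a network helps with long-distance and nonuniform patterns that a nearest-neighbor mesh handles poorly.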
Solutions to the phase problem of X-ray crystallography on the Connection Machine CM-2
C.-S. Chang, G. DeTitta, H. Hauptman, R. Miller, M. Poulin, P. Thuman, C. Weeks
The authors have developed a formulation of the phase problem of X-ray crystallography in terms of a minimal function of phases and a new minimization algorithm called shake-and-bake for solving this minimal function. The implementation details of the shake-and-bake strategy on the Connection Machine CM-2 are presented. The shake-and-bake algorithm has been used to determine the atomic structure of four test structures, ranging from 28 to 317 atoms. These results indicate that shake-and-bake is effective on structures of this size.
DOI: https://doi.org/10.1109/FMPC.1992.234868 (published 1992-10-19)
Citations: 0
Effect of hot spot on the performance of multistage interconnection networks
Mohammed Atiquzzaman, M. S. Akhtar
Hot spots in multistage interconnection networks (MSINs) result in performance degradation of the network. The authors develop an analytical model for the performance evaluation of unbuffered MSINs under a single hot spot, followed by a performance comparison with buffered MSINs. For uniform traffic, a buffered network performs better than an unbuffered network. For a nonuniform traffic pattern causing congestion (for example, tree saturation) in the network, an unbuffered network outperforms a buffered network. This leads the authors to suggest a hybrid network which will be capable of switching from the buffered mode to the unbuffered mode in the presence of network congestion.
DOI: https://doi.org/10.1109/FMPC.1992.234871 (published 1992-10-19)
Citations: 14
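As background for the unbuffered, uniform-traffic case in the abstract above, the sketch below uses the classic stage-by-stage recurrence for an unbuffered network built from k×k crossbar switches: if each switch input carries a request with probability p, each output carries one with probability 1 - (1 - p/k)^k. This is the well-known uniform-traffic baseline, not the authors' hot-spot model, which the abstract does not spell out; the function name is hypothetical.

```python
def unbuffered_acceptance(request_rate, stages, k=2):
    """Probability that a network output receives a request per cycle.

    Assumes uniform, independent traffic and an unbuffered network of
    k x k switches in which conflicting requests are dropped.
    """
    p = request_rate
    for _ in range(stages):
        # A given output is idle only if none of the k inputs picks it.
        p = 1.0 - (1.0 - p / k) ** k
    return p


# Fully loaded inputs, 2x2 switches: throughput decays with depth.
one_stage = unbuffered_acceptance(1.0, 1)
two_stages = unbuffered_acceptance(1.0, 2)
```

The steady loss per stage under uniform load is the behavior that buffering improves; the abstract's point is that the ranking reverses once a hot spot causes tree saturation.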
Combining switches for the NYU Ultracomputer
S. Dickey, R. Kenner
A pairwise combining switch has been implemented for use in the 16×16 processor/memory interconnection network of the NYU Ultracomputer prototype. The switch design may be extended for use in very large systems by providing greater combining capability. Methods for doing so are discussed.
DOI: https://doi.org/10.1109/FMPC.1992.234864 (published 1992-10-19)
Citations: 12
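The idea of pairwise combining named in the abstract above can be sketched in software. The code assumes the usual fetch-and-add formulation: two requests to the same address are merged into one carrying the summed increment, and the reply is split so that each requester sees a value consistent with some serial order. The data layout and function names are hypothetical and say nothing about the actual switch hardware.

```python
def combine(req_a, req_b):
    """Merge two fetch-and-add requests (address, increment).

    Returns the single request forwarded toward memory and the value
    the switch must remember (the first increment) to split the reply.
    """
    addr_a, inc_a = req_a
    addr_b, inc_b = req_b
    if addr_a != addr_b:
        raise ValueError("only requests to the same address combine")
    return (addr_a, inc_a + inc_b), inc_a


def decombine(reply_value, saved_increment):
    """Split the memory reply into replies for the two requesters."""
    return reply_value, reply_value + saved_increment


def memory_fetch_and_add(memory, request):
    """Memory module: return the old value, then add the increment."""
    addr, inc = request
    old = memory[addr]
    memory[addr] = old + inc
    return old


memory = {0: 10}
forwarded, saved = combine((0, 3), (0, 5))
reply = memory_fetch_and_add(memory, forwarded)
reply_a, reply_b = decombine(reply, saved)
```

The two replies are exactly what the requesters would have received had the first request been served before the second, while memory sees only one access.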
Performance studies of packet switched augmented shuffle exchange networks
V. Ramachandran, R. Raines, J.S. Park, N. Davis
This paper extends previous research efforts related to the performance modeling of the fault-tolerant Augmented Shuffle Exchange Network (ASEN). The authors examine the ASEN run-time performance characteristics in a packet-switched environment. The network performance is examined under a fault-free but congested network operating environment. Network performance parameters of time-in-system, queue lengths, and delays, as well as the effects of non-uniform loading of the network, are presented. The cost associated with implementation of an ASEN is compared with previously published metrics for the multistage cube network operating under the same environments. The authors conclude that, for the network and operating assumptions defined, the ASEN provides better performance at lower implementation costs than the multistage cube interconnection network.
DOI: https://doi.org/10.1109/FMPC.1992.234920 (published 1992-10-19)
Citations: 2
Improving massively data parallel system performance with heterogeneity
S. Noh, K. Dussa-Zieger
The authors introduce a new type of combined SIMD/MIMD (single-instruction multiple-data/multiple-instruction multiple-data) architecture called a hybrid system. The hybrid system consists of two components. The first component is massively parallel and consists of a large number of slow processors that are organized in an SIMD architecture. The second component consists of only a few fast processors (possibly only one) which are organized in an MIMD architecture. The authors contend that a hybrid system provides a means to adequately adjust to the characteristics of a parallel program, i.e., changing parallelism. They describe the machine and application model, and discuss the performance impact of such a system. Viewing the CM-2 with its front-end as a special case of a hybrid system, they substantiate the arguments and report measurements for a Gaussian elimination algorithm.
DOI: https://doi.org/10.1109/FMPC.1992.234901 (published 1992-10-19)
Citations: 5
The speedup and efficiency of 3-D FFT on distributed memory MIMD systems
D. Marinescu
The author analyzes a 3-D FFT (fast Fourier transform) algorithm for a distributed memory MIMD (multiple-instruction multiple-data) system. It is shown that the communication complexity limits the efficiency even under ideal conditions. The efficiency for the optimal speedup is $\eta_{\mathrm{opt}} = 0.5$. Actual applications which experience load imbalance, duplication of work, and blocking are even less efficient. Therefore the speedup with $P$ processing elements, $S(P) = \eta P$, is disappointingly low. Moreover, the 3-D FFT algorithm is not susceptible to massive parallelization, and the optimal number of PEs is rather low even for large problem size and fast communication. A strategy to reduce the communication complexity is presented.
DOI: https://doi.org/10.1109/FMPC.1992.234894 (published 1992-10-19)
Citations: 0
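The relation between speedup and efficiency quoted in the 3-D FFT abstract above, speedup equal to efficiency times the number of processing elements, can be checked with a few lines. The processor count used here is an illustrative assumption, not a figure from the paper.

```python
def speedup(efficiency, processors):
    """S(P) = eta * P, the relation quoted in the abstract."""
    return efficiency * processors


def efficiency(speedup_value, processors):
    """Inverse relation: eta = S(P) / P."""
    return speedup_value / processors


# With the abstract's optimal efficiency of 0.5, a hypothetical
# 64-processor run achieves at most half of the ideal speedup.
best_case = speedup(0.5, 64)
```

Under these assumptions, half of the machine's capacity is lost even in the best case, before load imbalance, duplicated work, or blocking are taken into account.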
Boolean function manipulation on massively parallel computers
G. Cabodi, S. Gai, M. Reorda
A new algorithm for implementing the basic operations on BDDs (binary decision diagrams) on a massively parallel computer is presented. Each node is associated with a processor, and nodes belonging to the same level are evaluated together. An implementation of the algorithm on a Connection Machine CM-2 has been done, and the prototype is being tested on a set of benchmark applications. Experimental results, showing the time required to perform the apply operation on BDDs of growing size, demonstrate the exactness of the complexity analysis and the effectiveness of the approach.
DOI: https://doi.org/10.1109/FMPC.1992.234869 (published 1992-10-19)
Citations: 9
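For readers unfamiliar with the apply operation mentioned in the BDD abstract above, the following is a minimal sequential sketch of the textbook recursive apply on reduced ordered BDDs. It is not the paper's parallel, level-by-level algorithm, which the abstract only outlines; the class layout, node numbering, and method names are assumptions made for illustration.

```python
import operator


class BDD:
    """Tiny reduced ordered BDD manager; ids 0 and 1 are the terminals."""

    def __init__(self):
        self.nodes = {}    # node id -> (variable index, low child, high child)
        self.unique = {}   # (variable index, low, high) -> node id
        self.next_id = 2

    def mk(self, var, low, high):
        """Return the node for (var, low, high), sharing duplicates."""
        if low == high:            # redundant test: skip the node
            return low
        key = (var, low, high)
        if key not in self.unique:
            self.unique[key] = self.next_id
            self.nodes[self.next_id] = key
            self.next_id += 1
        return self.unique[key]

    def var(self, index):
        """BDD of the single variable x_index."""
        return self.mk(index, 0, 1)

    def apply(self, op, u, v, memo=None):
        """Combine BDDs u and v with the binary Boolean operator op."""
        if memo is None:
            memo = {}
        if u < 2 and v < 2:        # both terminals: evaluate directly
            return int(op(u, v))
        if (u, v) in memo:
            return memo[(u, v)]
        u_var = self.nodes[u][0] if u >= 2 else float("inf")
        v_var = self.nodes[v][0] if v >= 2 else float("inf")
        top = min(u_var, v_var)    # split on the earliest variable
        u_low, u_high = self.nodes[u][1:] if u_var == top else (u, u)
        v_low, v_high = self.nodes[v][1:] if v_var == top else (v, v)
        result = self.mk(top,
                         self.apply(op, u_low, v_low, memo),
                         self.apply(op, u_high, v_high, memo))
        memo[(u, v)] = result
        return result

    def evaluate(self, u, assignment):
        """Follow one path for a list of variable values."""
        while u >= 2:
            var, low, high = self.nodes[u]
            u = high if assignment[var] else low
        return u


bdd = BDD()
x0, x1 = bdd.var(0), bdd.var(1)
conj = bdd.apply(operator.and_, x0, x1)
parity = bdd.apply(operator.xor, x0, x1)
```

The memo table bounds the number of recursive calls by the product of the two BDD sizes; the abstract's approach instead assigns nodes to processors and evaluates each level together.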
The MetaMP approach to parallel programming
S. Otto, M. Wolfe
The authors are researching techniques for the programming of large-scale parallel machines for scientific computation. They use an intermediate-level language, MetaMP, that sits between High Performance Fortran (HPF) and low-level message passing. They are developing an efficient set of primitives in the intermediate language and are investigating compilation methods that can semi-automatically reason about parallel programs. The focus is on distributed memory hardware. The work has many similarities with HPF efforts, although their approach is aimed at shorter-term solutions. They plan to keep the programmer centrally involved in the development and optimization of the parallel program.
DOI: https://doi.org/10.1109/FMPC.1992.234921 (published 1992-10-19)
Citations: 1