[1993] Proceedings The 2nd International Symposium on High Performance Distributed Computing: Latest Papers

p4-Linda: a portable implementation of Linda
R. Butler, Alan L. Leveton, E. Lusk
Facilities such as interprocess communication and protection of shared resources have been added to operating systems to support multiprogramming and have since been adapted to exploit explicit multiprocessing within the scope of two models: the shared-memory model and the distributed (message-passing) model. When multiprocessors (or networks of heterogeneous processors) are used for explicit parallelism, the difference between these models is exposed to the programmer. The p4 tool set was originally developed to buffer the programmer from synchronization issues while offering an added advantage in portability; however, two models are often still needed to develop parallel algorithms. The authors provide two implementations of Linda in an attempt to support a single high-level programming model on top of the existing paradigms in order to provide consistent semantics regardless of the underlying model. Linda's fundamental properties associated with generative communication eliminate the distinction between shared and distributed memory.
DOI: https://doi.org/10.1109/HPDC.1993.263858
Published: 1993-07-20
Citations: 12
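
The generative communication mentioned in this abstract can be illustrated with a toy tuple space. The sketch below is not the p4-Linda implementation: it is a minimal single-process Python model, assuming threads stand in for processes and `None` acts as the wildcard in a pattern. The class name `TupleSpace` and the method name `in_` (Python reserves `in`) are illustrative choices.

```python
import threading


class TupleSpace:
    """Toy in-process tuple space; None in a pattern matches any field."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        # Deposit a tuple and wake any callers blocked in rd/in_.
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _find(self, pattern):
        # Return the first stored tuple matching the pattern, else None.
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == f for p, f in zip(pattern, tup)
            ):
                return tup
        return None

    def rd(self, *pattern):
        # Block until a matching tuple exists; leave it in the space.
        with self._cond:
            while (tup := self._find(pattern)) is None:
                self._cond.wait()
            return tup

    def in_(self, *pattern):
        # Block until a matching tuple exists; remove and return it.
        with self._cond:
            while (tup := self._find(pattern)) is None:
                self._cond.wait()
            self._tuples.remove(tup)
            return tup
```

A producer calls `ts.out("task", 1)` and a consumer calls `ts.in_("task", None)`; neither names the other, which is why the same program text can in principle run over shared memory or message passing.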
A methodology for evaluating load balancing algorithms
B. Joshi, S. Hosseini, K. Vairavan
In general, a load balancing algorithm improves system performance. The larger the difference between the task arrival rates at various processors, the more imbalanced the system is and the greater the improvement in system performance achieved by a load balancing algorithm. Existing works that use an experimental technique to show the improvement in system performance under a load balancing algorithm have used an ad hoc procedure to select the task arrival rates for the various processors. Thus, their experimental results may not necessarily provide a complete picture of the improvement in system performance under their load balancing algorithms. The authors present a systematic scheme for selecting the task arrival rates at various processors such that experimental results reflect a complete picture of the improvement in system performance under a load balancing algorithm. The idea is motivated by the well-known Taguchi technique used in quality control.
DOI: https://doi.org/10.1109/HPDC.1993.263839
Published: 1993-07-20
Citations: 16
Trading disk capacity for performance
Robert Y. Hou, Y. Patt
Improvements in disk access time have lagged behind improvements in microprocessor and main memory speeds. This disparity has made the storage subsystem a major bottleneck for many applications. Disk arrays that can service multiple disk requests simultaneously are being used to satisfy increasing throughput requirements. Higher throughput rates can be achieved by increasing the number of disks in an array. This increases the number of actuators that are available to service separate requests. It also spreads the data among more disk drives, reducing the seek time as the number of cylinders utilized on each disk drive decreases. The result is an increase in throughput that exceeds the increase in the number of disks. This suggests a tradeoff between the space utilization of disks in an array and the throughput of the array.
DOI: https://doi.org/10.1109/HPDC.1993.263834
Published: 1993-07-20
Citations: 5
Programming a distributed system using shared objects
A. Tanenbaum, H. Bal, M. Kaashoek
Building the hardware for a high-performance distributed computer system is a lot easier than building its software. The authors describe a model for programming distributed systems based on abstract data types that can be replicated on all machines that need them. Read operations are done locally, without requiring network traffic. Writes can be done using a reliable broadcast algorithm if the hardware supports broadcasting; otherwise, a point-to-point protocol is used. The authors have built such a system based on the Amoeba microkernel, and implemented a language, Orca, on top of it. For Orca applications that have a high ratio of reads to writes, they measure good speedups on a system with 16 processors.
DOI: https://doi.org/10.1109/HPDC.1993.263863
Published: 1993-07-20
Citations: 20
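
The read-locally, write-by-broadcast idea in this abstract can be sketched as a small simulation. This is not Orca or Amoeba code: the class `ReplicatedObject`, its dictionary state, and the `messages_sent` counter are assumptions made for illustration, and the loop over replicas merely stands in for a reliable broadcast.

```python
class ReplicatedObject:
    """Simulates one shared object replicated on several machines."""

    def __init__(self, n_machines, initial):
        # Each machine holds its own private copy of the object's state.
        self.replicas = [dict(initial) for _ in range(n_machines)]
        self.messages_sent = 0  # counts simulated broadcasts

    def read(self, machine, key):
        # Reads consult only the local replica, so no message is sent.
        return self.replicas[machine][key]

    def write(self, machine, key, value):
        # Stand-in for a reliable broadcast: every replica applies the
        # same update, so all copies remain identical.
        for replica in self.replicas:
            replica[key] = value
        self.messages_sent += 1
```

With many reads per write, `messages_sent` grows slowly relative to the number of operations, which is the regime in which the abstract reports good speedups.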
Distributed computing solutions to the all-pairs shortest path problem
I. Pramanick
This paper proposes two distributed solutions to the all-pairs shortest path problem, and reports the results of experiments conducted on a network of IBM RISC System/6000s, containing up to seven such workstations. It discusses the issues that become critical in a distributed environment as opposed to a parallel environment, and the results obtained underline the importance of reducing communication between the loosely coupled subtasks in a distributed environment. The results demonstrate that properly designed distributed algorithms, which take into account the limitations (in terms of a slower communication medium and/or the non-dedicated mode of machines) of a distributed computing environment, can yield significant performance benefits.
DOI: https://doi.org/10.1109/HPDC.1993.263841
Published: 1993-07-20
Citations: 1
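
The abstract does not say which algorithms the paper distributes. As background only, here is a sequential sketch of the standard Floyd-Warshall algorithm, assuming an adjacency matrix with `math.inf` for missing edges and no negative cycles. The comment on `row_k` marks the data that a row-partitioned version would have to share at each step; that partitioning is an assumption for illustration, not the paper's method.

```python
import math


def floyd_warshall(dist):
    """Return all-pairs shortest path lengths for an adjacency matrix."""
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy of the input
    for k in range(n):
        # If rows were split across machines, row k is the only data
        # the other row owners would need during step k.
        row_k = d[k][:]
        for i in range(n):
            dik = d[i][k]
            for j in range(n):
                if dik + row_k[j] < d[i][j]:
                    d[i][j] = dik + row_k[j]
    return d
```

For example, with edges 0→1 of length 4, 1→2 of length 1, and 0→2 of length 10, the shortest distance from 0 to 2 becomes 5.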
Improving performance by use of adaptive objects: experimentation with a configurable multiprocessor thread package
B. Mukherjee, K. Schwan
Since the mechanisms of an operating system can significantly affect the performance of parallel programs, it is important to customize operating system functionality for specific application programs. The authors first present a model for adaptive objects and the associated mechanisms; they then use this model to implement adaptive locks for multiprocessors, which adapt themselves according to user-provided adaptation policies to suit changing application locking patterns. Using a parallel branch and bound program, they demonstrate the performance advantage of adaptive locks over existing locks.
DOI: https://doi.org/10.1109/HPDC.1993.263857
Published: 1993-07-20
Citations: 33
An analysis of distributed computing software and hardware for applications in computational physics
P. Coddington
The author has implemented a set of computational physics codes on a network of IBM RS/6000 workstations used as a distributed parallel computer. He compares the performance of the codes on this network, using both standard Ethernet connections and a fast prototype switch, and also on the nCUBE/2, a MIMD parallel computer. The algorithms used range from simple, local, and regular to complex, non-local, and irregular. He describes his experiences with the hardware, software and parallel languages used, and discusses ideas for making distributed parallel computing on workstation networks more easily usable for computational physicists.
DOI: https://doi.org/10.1109/HPDC.1993.263843
Published: 1993-07-20
Citations: 6
MULTIPAR: an output queue ATM modular switch with multiple phases and replicated planes
Jian Ma, K. Rahko
The authors propose a novel output queuing ATM modular switch which has a memoryless two-stage interconnection with disjoint-path topology. The goal of the modular design is to relax the limitations of VLSI implementation, to simplify interstage wiring and synchronization, and, furthermore, to reduce the complexity of the overall switch. A pure output queue is constructed by providing multiple paths in each output port and replicated switching module planes. A given cell loss requirement can be met by choosing a suitable path set of $L_1$ and $L_2$. For instance, cell loss probability in the switch can be kept below $10^{-6}$ for various $N$, under 90% load, if a set of $L_1 = 9$ and $L_2 = 4$ (or $L_1 = 8$ and $L_2 = 5$) is chosen.
DOI: https://doi.org/10.1109/HPDC.1993.263846
Published: 1993-07-20
Citations: 1
A parallel object-oriented framework for stencil algorithms
John F. Karpovich, M. Judd, W. Strayer, A. Grimshaw
The authors present an object-oriented framework for constructing parallel implementations of stencil algorithms. This framework simplifies the development process by encapsulating the common aspects of stencil algorithms in a base stencil class so that application-specific derived classes can be easily defined via inheritance and overloading. In addition, the stencil base class contains mechanisms for parallel execution. The result is a high-performance, parallel, application-specific stencil class. The authors present the design rationale for the base class and illustrate the derivation process by defining two subclasses, an image convolution class and a PDE solver. The classes have been implemented in Mentat, an object-oriented parallel programming system that is available on a variety of platforms. Performance results are given for a network of Sun SPARCstation IPCs.
DOI: https://doi.org/10.1109/HPDC.1993.263860
Published: 1993-01-27
Citations: 25
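
The base-class-plus-derived-class structure described in this abstract can be sketched as follows. This is not Mentat code and it runs sequentially: the names `Stencil`, `kernel`, `apply`, and `JacobiStep` are hypothetical, and Python method overriding stands in for the inheritance mechanism the abstract mentions. The base class owns the sweep over the grid, and a subclass supplies only the per-point computation.

```python
class Stencil:
    """Base class: owns the grid sweep; subclasses supply the kernel."""

    def kernel(self, grid, i, j):
        raise NotImplementedError

    def apply(self, grid):
        rows, cols = len(grid), len(grid[0])
        out = [row[:] for row in grid]  # boundary cells are copied unchanged
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                out[i][j] = self.kernel(grid, i, j)
        return out


class JacobiStep(Stencil):
    """One Jacobi relaxation step for Laplace's equation (4-point average)."""

    def kernel(self, grid, i, j):
        return (
            grid[i - 1][j] + grid[i + 1][j] + grid[i][j - 1] + grid[i][j + 1]
        ) / 4.0
```

Because every output cell depends only on the previous grid, the loop in `apply` could be split across workers by row blocks without changing any subclass, which is the kind of separation the abstract attributes to the base class.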