Proceedings International Conference on Parallel Processing: Latest Papers

Integrating trust into grid resource management systems
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040858
Farag Azzedin, Muthucumaru Maheswaran
Grid computing systems that have been the focus of much research in recent years provide a virtual framework for controlled sharing of resources across institutional boundaries. Security is a major concern in any system that enables remote execution. Several techniques can be used for providing security in grid systems including sandboxing, encryption, and other access control and authentication mechanisms. The additional overhead caused by these mechanisms may negate the performance advantages gained by grid computing. Hence, we contend that it is essential for the scheduler to consider the security implications while performing resource allocations. In this paper, we present a trust model for grid systems and show how the model can be used to incorporate security implications into scheduling algorithms. Three scheduling heuristics that can be used in a grid system are modified to incorporate the trust notion and simulations are performed to evaluate the performance.
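The idea of folding security cost into a scheduling decision can be sketched as follows. This is only an illustrative minimum-completion-time heuristic, not the authors' trust model or any of their three heuristics: the integer trust levels, the `overhead_per_level` factor, and all function and parameter names are assumptions made for the example.

```python
def trust_aware_mct(tasks, machines, exec_time, trust, required,
                    overhead_per_level=0.15):
    """Greedy minimum-completion-time assignment with a trust penalty.

    exec_time[t][m]  : base run time of task t on machine m
    trust[m]         : trust level offered by machine m (hypothetical scale)
    required[t]      : trust level task t requires (same hypothetical scale)
    A machine that offers less trust than required is assumed to need extra
    security mechanisms, modeled here as a proportional slowdown.
    """
    ready = {m: 0.0 for m in machines}   # time at which each machine is free
    schedule = {}
    for t in tasks:
        best_m, best_finish = None, float("inf")
        for m in machines:
            gap = max(0, required[t] - trust[m])
            cost = exec_time[t][m] * (1.0 + overhead_per_level * gap)
            finish = ready[m] + cost
            if finish < best_finish:
                best_m, best_finish = m, finish
        schedule[t] = best_m
        ready[best_m] = best_finish
    return schedule, max(ready.values())


if __name__ == "__main__":
    exec_time = {"a": {"fast": 10.0, "trusted": 11.0}}
    sched, makespan = trust_aware_mct(
        ["a"], ["fast", "trusted"], exec_time,
        trust={"fast": 1, "trusted": 5}, required={"a": 5})
    print(sched, makespan)
```

With the penalty enabled, the nominally slower but trusted machine wins; setting `overhead_per_level=0` recovers a trust-unaware choice.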
Citations: 171
Honey, I shrunk the Beowulf!
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040868
Wu-chun Feng, Michael S. Warren, E. Weigle
In this paper, we present a novel twist on the Beowulf cluster - the Bladed Beowulf. Designed by RLX Technologies and integrated and configured at Los Alamos National Laboratory, our Bladed Beowulf consists of compute nodes made from commodity off-the-shelf parts mounted on motherboard blades measuring 14.7" × 4.7" × 0.58". Each motherboard blade (node) contains a 633 MHz Transmeta TM5600™ CPU, 256 MB memory, 10 GB hard disk, and three 100-Mb/s Fast Ethernet network interfaces. Using a chassis provided by RLX, twenty-four such nodes mount side-by-side in a vertical orientation to fit in a rack-mountable 3U space, i.e., 19" in width and 5.25" in height. A Bladed Beowulf can reduce the total cost of ownership (TCO) of a traditional Beowulf by a factor of three while providing Beowulf-like performance. Accordingly, rather than use the traditional definition of price-performance ratio where price is the cost of acquisition, we introduce a new metric called ToPPeR: total price-performance ratio, where total price encompasses TCO. We also propose two related (but more concrete) metrics: performance-space ratio and performance-power ratio.
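The three metrics named in the abstract reduce to simple ratios. The sketch below assumes ToPPeR is TCO divided by delivered performance (so lower is better) and that the other two are performance per unit of floor space and per watt. The units and the function names are assumptions; the abstract does not give exact definitions or figures.

```python
def topper(tco_dollars, performance):
    """Total price-performance ratio: TCO per unit of performance (lower is better)."""
    return tco_dollars / performance


def performance_space_ratio(performance, footprint_sq_ft):
    """Performance delivered per square foot of floor space (higher is better)."""
    return performance / footprint_sq_ft


def performance_power_ratio(performance, watts):
    """Performance delivered per watt consumed (higher is better)."""
    return performance / watts


if __name__ == "__main__":
    # Made-up inputs purely to exercise the functions; not data from the paper.
    print(topper(90000.0, 3.0))
    print(performance_space_ratio(3.0, 6.0))
    print(performance_power_ratio(3.0, 600.0))
```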
Citations: 19
LDBS: a duplication based scheduling algorithm for heterogeneous computing systems
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040891
A. Doğan, F. Özgüner
Finding an optimal solution to the problem of scheduling an application modeled by a directed acyclic graph (DAG) onto a set of heterogeneous machines is known to be an NP-hard problem. In this study, we present a duplication based scheduling algorithm, namely the levelized duplication based scheduling (LDBS) algorithm, which solves this problem efficiently. The primary goal of LDBS is to minimize the schedule length of applications. LDBS can accommodate different duplication heuristics, thanks to its modular design. Specifically, we have designed two different duplication heuristics with different time complexities. The simulation studies confirm that LDBS is a very competitive scheduling algorithm in terms of minimizing the schedule length of applications.
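The abstract does not spell out how LDBS levelizes the graph or how its duplication heuristics work. As background only, one common way to split a DAG into levels (each task placed one level below its deepest predecessor) can be sketched like this; the function name and the edge-list input format are assumptions, and no duplication step is shown.

```python
from collections import defaultdict, deque


def levelize(num_tasks, edges):
    """Group tasks 0..num_tasks-1 of a DAG into levels.

    edges is a list of (u, v) pairs meaning task u must finish before v starts.
    Level 0 holds the entry tasks; every other task sits one level below its
    deepest predecessor. Raises ValueError if the graph contains a cycle.
    """
    succ = defaultdict(list)
    indeg = [0] * num_tasks
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1

    level = [0] * num_tasks
    queue = deque(t for t in range(num_tasks) if indeg[t] == 0)
    visited = 0
    while queue:                      # Kahn-style topological traversal
        u = queue.popleft()
        visited += 1
        for v in succ[u]:
            level[v] = max(level[v], level[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != num_tasks:
        raise ValueError("graph has a cycle")

    groups = defaultdict(list)
    for task, lvl in enumerate(level):
        groups[lvl].append(task)
    return [groups[lvl] for lvl in sorted(groups)]


if __name__ == "__main__":
    print(levelize(4, [(0, 1), (0, 2), (1, 3), (2, 3)]))
```

Tasks within one level have no dependences on each other, which is what makes a level a natural unit for assigning work to machines.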
Citations: 65
Tolerating network failures in system area networks
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040866
Jeffrey Tang, A. Bilas
In this paper, we investigate how system area networks can deal with transient and permanent network failures. We design and implement a firmware-level retransmission scheme to tolerate transient failures and an on-demand network mapping scheme to deal with permanent failures. Both schemes are transparent to applications and are conceptually simple and suitable for low-level implementations, e.g. in firmware. We then examine how the retransmission scheme affects system performance and how various protocol parameters impact system behavior. We analyze and evaluate system performance by using a real implementation on a state-of-the-art cluster and both micro-benchmarks and real applications from the SPLASH-2 suite.
Citations: 6
Space and time efficient parallel algorithms and software for EST clustering
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040889
A. Kalyanaraman, S. Aluru, S. Kothari
Expressed sequence tags, ESTs, are DNA molecules experimentally derived from expressed portions of genes. Clustering of ESTs is essential for gene recognition and understanding important genetic variations such as those resulting in diseases. In this paper, we present the design and development of a parallel software system for EST clustering. To our knowledge, this is the first such effort to address the problem of EST clustering in parallel. The novel features of our approach include 1) design of space efficient algorithms to keep the space requirement linear in the size of the input data set, 2) a combination of algorithmic techniques to reduce the total work without sacrificing the quality of EST clustering, and 3) use of parallel processing to reduce the run-time and facilitate the clustering of large datasets. Using a combination of these techniques, we report the clustering of 81,414 Arabidopsis ESTs in under 2.5 minutes on a 64-processor IBM SP, a problem that is estimated to take 9 hours of run-time with state-of-the-art software, provided the memory required to run the software can be made available.
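The abstract does not describe the clustering data structures. One standard way to merge sequences into clusters with memory linear in the number of sequences is a union-find (disjoint-set) structure, sketched below. It assumes some earlier step has already decided which pairs of ESTs overlap; that step, the function name, and the input format are all hypothetical and this is not presented as the authors' algorithm.

```python
def cluster(num_ests, overlapping_pairs):
    """Merge ESTs 0..num_ests-1 into clusters using union-find.

    overlapping_pairs is an iterable of (a, b) index pairs that some
    (unspecified) alignment step judged to belong together.
    Memory use is O(num_ests) regardless of how many pairs are processed.
    """
    parent = list(range(num_ests))

    def find(x):
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in overlapping_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra          # merge the two clusters

    groups = {}
    for i in range(num_ests):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    print(cluster(5, [(0, 1), (3, 4), (1, 2)]))
```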
Citations: 43
ZEN: a directive-based language for automatic experiment management of distributed and parallel programs
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040863
R. Prodan, T. Fahringer
This paper describes ZEN, a directive-based language for the specification of arbitrarily complex program executions by varying the problem, system, or machine parameters for parallel and distributed applications. ZEN introduces directives to substitute strings and to insert assignment statements inside arbitrary files, such as program, input, script, or make-files. The programmer thus can invoke experiments for arbitrary value ranges of any problem parameter, including program variables, file names, compiler options, target machines, machine sizes, scheduling strategies, data distributions, etc. The number of experiments can be controlled through ZEN constraint directives. Finally, the programmer may request a large set of performance metrics to be computed for any code region of interest. The scope of ZEN directives can be restricted to arbitrary file or code regions. We implemented a prototype tool for automatic experiment management that is based on ZEN. We report results for the performance analysis of an ocean simulation application and for the parameter study of a computational finance code.
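The effect of sweeping value ranges and pruning combinations with constraints can be illustrated in plain Python. This is not ZEN syntax, which the abstract does not show; the parameter names and the `constraint` callback are assumptions chosen for the example.

```python
from itertools import product


def experiments(params, constraint=lambda cfg: True):
    """Yield one configuration dict per experiment.

    params maps a parameter name to the list of values to try.
    constraint is a predicate over a configuration; combinations for which
    it returns False are skipped, which limits the number of experiments.
    """
    names = list(params)
    for values in product(*(params[n] for n in names)):
        cfg = dict(zip(names, values))
        if constraint(cfg):
            yield cfg


if __name__ == "__main__":
    sweep = {"nodes": [1, 2, 4], "problem_size": [100, 200]}
    # Hypothetical constraint: keep at least 50 work items per node.
    for cfg in experiments(sweep, lambda c: c["problem_size"] / c["nodes"] >= 50):
        print(cfg)
```

Without the constraint the sweep above produces six experiments; with it, the combination of 4 nodes and problem size 100 is dropped.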
Citations: 14
Iterative grid-based computing using mobile agents
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040865
Hairong Kuang, L. Bic, M. Dillencourt
We describe an environment for the distributed solution of iterative grid-based applications. The environment is built using the MESSENGERS mobile agent system. The main advantage of paradigm-oriented distributed computing is that the user only needs to specify the application-specific sequential code, while the underlying infrastructure takes care of the parallelization and distribution. The two paradigms discussed in this paper are: the finite difference method, and individual-based simulation. These paradigms present some interesting challenges, both in terms of performance (because they require frequent synchronized communication between nodes) and in terms of repeatability (because the mapping of the user space onto the network may change due to load balancing or due to changes in the underlying logical network). We describe their use, implementation, and performance within a mobile agent-based environment.
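To make "application-specific sequential code" concrete, the kind of kernel a user might supply for the finite difference paradigm could look like the Jacobi iteration below, here for a 1D Laplace problem with fixed boundary values. It is a generic textbook sketch, not code from the MESSENGERS system, and the function names and tolerance are assumptions.

```python
def jacobi_step(u):
    """One Jacobi sweep: each interior point becomes the mean of its neighbors."""
    new = u[:]                                  # boundary values stay fixed
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def solve(u, tol=1e-10, max_iters=10000):
    """Iterate until the largest per-point change drops below tol."""
    for _ in range(max_iters):
        nxt = jacobi_step(u)
        if max(abs(a - b) for a, b in zip(nxt, u)) < tol:
            return nxt
        u = nxt
    return u


if __name__ == "__main__":
    print(solve([0.0, 0.0, 0.0, 3.0]))
```

Each sweep reads only neighboring points, so once the grid is partitioned across nodes only the partition boundaries have to be exchanged per iteration. That exchange is the frequent synchronized communication the abstract mentions.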
Citations: 30
Multithreaded isosurface rendering on SMPs using span-space buckets
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040915
Peter Sulatycke, K. Ghose
We present in-core and out-of-core parallel techniques for implementing isosurface rendering based on the notion of span-space buckets. Our in-core technique makes conservative use of the RAM and is amenable to parallelization. The out-of-core variant keeps the amount of data read in the search process to a minimum, visiting only the cells that intersect the isosurface. The out-of-core technique additionally minimizes disk I/O time through in-order seeking, interleaving data records on the disk and by overlapping computational and I/O threads. The overall isosurface rendering time achieved using our out-of-core span-space buckets is comparable to that of well-optimized in-core techniques that have enough RAM at their disposal to avoid thrashing. When the RAM size is limited, our out-of-core span-space buckets technique maintains its performance level while in-core algorithms either start to thrash or must sacrifice performance for a smaller memory footprint.
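In span space, each cell is a point (min, max) of its scalar values, and a cell intersects the isosurface for isovalue v exactly when min <= v <= max. A simplified bucketing of cells by their minimum value, so that a query skips every bucket whose minima lie above v, can be sketched as below. The bucket layout, function names, and in-memory representation are assumptions; the paper's actual bucket organization and on-disk layout are not given in the abstract.

```python
def build_buckets(cells, lo, hi, n_buckets):
    """Bucket cells by their minimum scalar value.

    cells maps cell_id -> (cmin, cmax); lo and hi bound all scalar values.
    Returns the bucket list and the width of each bucket.
    """
    width = (hi - lo) / n_buckets
    buckets = [[] for _ in range(n_buckets)]
    for cell_id, (cmin, cmax) in cells.items():
        idx = min(int((cmin - lo) / width), n_buckets - 1)
        buckets[idx].append((cmin, cmax, cell_id))
    return buckets, width


def active_cells(buckets, width, lo, isovalue):
    """Return ids of cells with cmin <= isovalue <= cmax.

    Buckets whose minima all exceed the isovalue are never visited.
    """
    if isovalue < lo:
        return []
    last = min(int((isovalue - lo) / width), len(buckets) - 1)
    hits = []
    for bucket in buckets[: last + 1]:
        for cmin, cmax, cell_id in bucket:
            if cmin <= isovalue <= cmax:
                hits.append(cell_id)
    return sorted(hits)


if __name__ == "__main__":
    cells = {0: (0.0, 1.0), 1: (0.5, 2.0), 2: (3.0, 4.0), 3: (1.5, 3.5)}
    buckets, width = build_buckets(cells, lo=0.0, hi=4.0, n_buckets=4)
    print(active_cells(buckets, width, 0.0, 1.75))
```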
Citations: 8
A technique for adaptation to available resources on clusters independent of synchronization methods used
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040895
Umit Rencuzogullari, S. Dwarkadas
Clusters of workstations (COW) offer high performance relative to their cost. Generally these clusters operate as autonomous systems running independent copies of the operating system, where access to machines is not controlled and all users enjoy the same access privileges. While these features are desirable and reduce operating costs, they create adverse effects on parallel applications running on these clusters. Load imbalances are common for parallel applications on COWs due to: 1) variable amount of load on nodes caused by an inherent lack of parallelism, 2) variable resource availability on nodes, and 3) independent scheduling decisions made by the independent schedulers on each node. Our earlier study has shown that an approach combining static program analysis, dynamic load balancing, and scheduler cooperation is effective in countering the adverse effects mentioned above. In our current study, we investigate the scalability of our approach as the number of processors is increased. We further relax the requirement of global synchronization, avoiding the need to use barriers and allowing the use of any other synchronization primitives while still achieving dynamic load balancing. The use of alternative synchronization primitives avoids the inherent vulnerability of barriers to load imbalance. It also allows load balancing to take place at any point in the course of execution, rather than only at a synchronization point, potentially reducing the time the application runs imbalanced. Moreover, load readjustment decisions are made in a distributed fashion, thus preventing any need for processes to globally synchronize in order to redistribute load.
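A basic building block of dynamic load balancing is redistributing work in proportion to the rate each node is currently achieving. The sketch below shows only that arithmetic. It is not the authors' system, which also combines static program analysis, scheduler cooperation, and distributed decision making; the function name and the notion of a measured per-node rate are assumptions.

```python
def rebalance(total_units, rates):
    """Split total_units of work across nodes in proportion to their rates.

    rates[i] is the recently measured throughput of node i (any consistent
    unit). Fractional shares are rounded down, and the leftover units go to
    the nodes with the largest fractional parts so the shares sum exactly
    to total_units.
    """
    total_rate = sum(rates)
    exact = [total_units * r / total_rate for r in rates]
    shares = [int(x) for x in exact]
    leftover = total_units - sum(shares)
    by_fraction = sorted(range(len(rates)),
                         key=lambda i: exact[i] - shares[i], reverse=True)
    for i in by_fraction[:leftover]:
        shares[i] += 1
    return shares


if __name__ == "__main__":
    # A node running at twice the rate of the others gets twice the work.
    print(rebalance(100, [1.0, 1.0, 2.0]))
```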
Citations: 1
Power aware scheduling for AND/OR graphs in multiprocessor real-time systems
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040917
Dakai Zhu, Nevine AbouGhazaleh, D. Mossé, R. Melhem
Power aware computing has become popular recently and many techniques have been proposed to manage the energy consumption for traditional real-time applications. We have previously proposed (2001) two greedy slack sharing scheduling algorithms for such applications on multi-processor systems. In this paper, we are concerned mainly with real-time applications that have different execution paths consisting of different numbers of tasks. The AND/OR graph model is used to represent the application data dependence and control flow. The contribution of this paper is twofold. First, we extend our greedy slack sharing algorithm for traditional applications to deal with applications represented by AND/OR graphs. Then, using the statistical information about the applications, we propose a few variations of speculative scheduling algorithms that intend to save energy by reducing the number of speed changes (and thus the overhead) while ensuring that the applications meet the timing constraints. The performance of the algorithms is analyzed with respect to energy savings. The results obtained show that the greedy scheme is better than some speculative schemes and that the greedy scheme is good enough when a reasonable minimal speed exists in the system.
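The greedy idea of handing unused time to the next task so it can run slower can be sketched on a single processor as follows. This simplification leaves out everything specific to the paper (slack sharing across processors, AND/OR graphs, speculation, speed-change overhead). The energy model, with power proportional to the cube of the normalized speed, and all names are assumptions for illustration.

```python
def greedy_slack_speeds(wcet, actual, min_speed=0.2):
    """Choose a normalized speed (0 < s <= 1) for each task in sequence.

    wcet[i]   : worst-case execution time of task i at full speed
    actual[i] : time task i really needs at full speed (actual[i] <= wcet[i])
    Slack left by earlier tasks is given entirely to the next task, which
    stretches its worst case over the enlarged time budget.
    """
    slack = 0.0
    speeds = []
    for w, a in zip(wcet, actual):
        budget = w + slack
        speed = max(min_speed, w / budget)   # worst case still fits in budget
        speeds.append(speed)
        slack = budget - a / speed           # unused time passes to the next task
    return speeds


def energy(actual, speeds):
    """Energy under the assumed model: power ~ s**3 and time = a / s, so E ~ a * s**2."""
    return sum(a * s ** 2 for a, s in zip(actual, speeds))


if __name__ == "__main__":
    wcet, actual = [4.0, 4.0], [2.0, 4.0]
    speeds = greedy_slack_speeds(wcet, actual)
    print(speeds, energy(actual, speeds), energy(actual, [1.0, 1.0]))
```

In the example the first task finishes two time units early, so the second task can run at two thirds of full speed and still finish within the original combined worst-case budget of 8.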
Citations: 44
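As a rough illustration of the slack reclamation idea mentioned in the abstract above, the following Python sketch slows a task down when its predecessors finish early. It is not the authors' algorithm: it assumes a single processor, a fixed task order, a normalized maximum speed of 1.0, and the common model in which executing `c` units of work at speed `s` costs `c * s**2` energy. The function name `greedy_slack_schedule` and the `s_min` parameter are hypothetical names chosen for this example.

```python
def greedy_slack_schedule(tasks, s_min=0.1):
    """Simplified single-processor greedy slack reclamation (illustrative only).

    tasks: list of (wcet, actual) pairs, both measured at maximum speed 1.0,
           with actual <= wcet.
    s_min: assumed minimal speed the processor supports (normalized).
    Returns (finish_time, energy).
    """
    now = 0.0             # current time
    planned_finish = 0.0  # finish time of the current task in the worst-case schedule
    energy = 0.0
    for wcet, actual in tasks:
        planned_finish += wcet
        # Time available to this task: its own worst-case window plus any
        # slack left behind by earlier tasks that finished ahead of plan.
        available = planned_finish - now
        # Slowest speed that still finishes the worst case within the window,
        # clamped to the range the processor supports.
        speed = max(s_min, min(1.0, wcet / available))
        now += actual / speed
        energy += actual * speed ** 2
    return now, energy
```

With `tasks = [(4, 2), (4, 4), (2, 1)]`, the first task finishes two time units early, so the second task can run at speed 2/3 and still complete by its planned finish time of 8. Under the assumed energy model this uses less energy than running every task at full speed, and the final deadline of 10 (the sum of the worst-case times) is still met.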
Journal
Proceedings International Conference on Parallel Processing