
Latest publications: 2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935)

FTC-Charm++: an in-memory checkpoint-based fault tolerant runtime for Charm++ and MPI
Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392606
G. Zheng, L. Shi, L. Kalé
As high performance clusters continue to grow in size, the mean time between failures shrinks. Thus, fault tolerance and reliability are becoming challenging factors for application scalability. The traditional disk-based method of dealing with faults is to checkpoint the state of the entire application periodically to reliable storage and restart from the most recent checkpoint. Recovering the application from faults involves (often manually) restarting it on all processors and having it read the data from disks on all processors. The restart can therefore take minutes after it has been initiated. Such a strategy requires that the failed processor can be replaced, so that the number of processors at checkpoint time and at recovery time is the same. We present FTC-Charm++, a fault-tolerant runtime based on a scheme for fast and scalable in-memory checkpoint and restart. At restart, when there is no extra processor, the program can continue to run on the remaining processors while minimizing the performance penalty due to the lost processors. The method is useful for applications whose memory footprint is small at the checkpoint state, while a variation of this scheme - in-disk checkpoint/restart - can be applied to applications with a large memory footprint. The scheme does not require any individual component to be fault-free. We have implemented this scheme for Charm++ and AMPI (an adaptive implementation of MPI). This work describes the scheme and shows performance data on a cluster using 128 processors.
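A minimal sketch of the double in-memory checkpointing idea described above, written against plain MPI rather than the Charm++ runtime: each processor keeps its own checkpoint in local memory and mirrors it into a buddy processor's memory, so a single failure loses no checkpoint data. The names (Checkpoint, buddy_of, take_checkpoint) and the simple next-rank buddy mapping are illustrative assumptions, not FTC-Charm++'s API.

#include <mpi.h>
#include <vector>

struct Checkpoint {
    std::vector<char> own;   // this rank's serialized state
    std::vector<char> held;  // checkpoint stored on behalf of the previous rank
};

// Illustrative buddy mapping: each rank mirrors into the next rank (wrapping).
static int buddy_of(int rank, int nprocs) { return (rank + 1) % nprocs; }

// Coordinated in-memory checkpoint: every rank keeps its state locally and
// also sends a copy into its buddy's memory.
void take_checkpoint(const std::vector<char>& state, Checkpoint& ckpt,
                     MPI_Comm comm) {
    int rank, nprocs;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &nprocs);
    int buddy = buddy_of(rank, nprocs);        // who stores our copy
    int prev  = (rank - 1 + nprocs) % nprocs;  // whose copy we store

    ckpt.own = state;

    // Exchange checkpoint sizes, then the checkpoints themselves.
    long mine = static_cast<long>(state.size()), theirs = 0;
    MPI_Sendrecv(&mine, 1, MPI_LONG, buddy, 0,
                 &theirs, 1, MPI_LONG, prev, 0, comm, MPI_STATUS_IGNORE);
    ckpt.held.resize(static_cast<size_t>(theirs));
    MPI_Sendrecv(state.data(), static_cast<int>(mine), MPI_BYTE, buddy, 1,
                 ckpt.held.data(), static_cast<int>(theirs), MPI_BYTE, prev, 1,
                 comm, MPI_STATUS_IGNORE);
    MPI_Barrier(comm);  // the checkpoint counts only once all ranks hold copies
}

On a failure, the buddy of the lost rank supplies the held copy, so the application can restart from memory on the surviving processors instead of reading checkpoints back from disk.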
Citations: 213
Simplifying administration through dynamic reconfiguration in a cooperative cluster storage system
Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392615
Renaud Lachaize, J. Hansen
Cluster storage systems where storage devices are distributed across a large number of nodes are able to reduce the I/O bottleneck problems present in most centralized storage systems. However, such distributed storage devices are hard to manage efficiently. In this paper, we examine the use of explicit, component-based (command and data) paths between hosts and disks as a vehicle for performing nondisruptive storage system reconfiguration. We describe the mechanisms necessary to perform reconfigurations and show how they can be used to handle two management tasks: migration between network technologies and rebuilding a disk in a mirror. Our approach is validated through initial performance measurements of these two tasks using a prototype implementation. The results show that online reconfiguration is possible at a modest cost.
Citations: 5
State of InfiniBand in designing HPC clusters, storage/file systems, and data centers
Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392594
D. Panda
Summary form only given. The tutorial aims to familiarize attendees with IBA, its benefits, available IBA hardware/software solutions, and the latest trends in designing high-end computing, networking, and storage systems with IBA, and to provide a critical assessment of whether IBA is ready for prime time.
Citations: 0
Improved message logging versus improved coordinated checkpointing for fault tolerant MPI
Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392609
Pierre Lemarinier, Aurélien Bouteiller, T. Hérault, Géraud Krawezik, F. Cappello
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection and recovery for message passing systems, with different impacts on application performance and on the capacity to tolerate a high fault rate. In a recent paper, we demonstrated that the main differences between pessimistic sender-based message logging and coordinated checkpointing are: 1) the communication latency and 2) the performance penalty in case of faults. Pessimistic message logging increases the latency, due to additional blocking control messages. When faults occur at a high rate, coordinated checkpointing implies a higher performance penalty than message logging due to a higher stress on the checkpoint server. We extend this study to improved versions of the message logging and coordinated checkpoint protocols, which respectively reduce the latency overhead of pessimistic message logging and the server stress of coordinated checkpoint. We detail the protocols and their implementation in the new MPICH-V fault tolerant framework. We compare their performance against the previous versions, and we compare the novel message logging protocol against the improved coordinated checkpointing one using the NAS benchmarks on a typical high performance cluster equipped with a high-speed network. The contribution of this work is twofold: a) an original message logging protocol and an improved coordinated checkpointing protocol and b) the comparison between them.
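As a concrete illustration of the pessimistic sender-based approach compared above, here is a minimal sketch over MPI: every outgoing message is recorded in the sender's memory before it is sent, so a restarted receiver can be re-fed its messages in the original order. LoggedSend, replay_to, and the global log are illustrative assumptions, not MPICH-V's actual interfaces.

#include <mpi.h>
#include <map>
#include <vector>

struct LogEntry {
    int tag;
    std::vector<char> payload;
};

// Per-destination message log kept in the sender's memory.
static std::map<int, std::vector<LogEntry>> g_send_log;

// Pessimistic logging: the message is safely recorded before the send,
// so no delivered message can be lost to a receiver failure.
int LoggedSend(const void* buf, int count, int dest, int tag, MPI_Comm comm) {
    const char* bytes = static_cast<const char*>(buf);
    g_send_log[dest].push_back({tag, std::vector<char>(bytes, bytes + count)});
    return MPI_Send(buf, count, MPI_BYTE, dest, tag, comm);
}

// During recovery, each surviving sender walks its log in order and
// resends everything addressed to the restarted rank.
void replay_to(int dest, MPI_Comm comm) {
    for (const LogEntry& e : g_send_log[dest])
        MPI_Send(e.payload.data(), static_cast<int>(e.payload.size()),
                 MPI_BYTE, dest, e.tag, comm);
}

The extra bookkeeping on the critical send path is where the latency overhead the abstract mentions comes from; the improved protocol aims to shrink exactly that cost.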
Citations: 82
GRID-enabled bioinformatics applications for comparative genomic analysis at the CBBC
Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392652
A. Hunter, D. Schibeci, H. L. Hiew, M. Bellgard
Summary form only given. Bioinformatics is an important application area for grid computing. The grid computing issues that must be tackled to meet current bioinformatics challenges include processing power, large-scale data access and management, security, application integration, data integrity and curation, control/automation/tracking of workflows, data format consistency, and resource discovery. In this poster, we describe preliminary steps taken to develop a grid environment to advance bioinformatics research. We developed a system called Grendel with the aim of providing bioinformatics researchers transparent access to the basic computational resources used in their research. Grendel is a platform- and language-independent Web-services-based system for distributed resource management utilising Sun Grid Engine; it provides a single entry point for computational tasks while keeping the actual resources transparent to the user. Grendel is developed in Java and deployed using Tomcat. Client libraries have been developed in Perl and Java to provide access to the computational resources exported via Grendel.
Citations: 3
Management of grid jobs and data within SAMGrid
Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392634
A. Baranovski, G. Garzoglio, I. Terekhov, A. Roy, T. Tannenbaum
When designing SAMGrid, a project for distributing high-energy physics computations on a grid, we discovered that it was challenging to decide where to place users' jobs. Jobs typically need to access hundreds of files, and each site has a different subset of the files. Our data system, SAM, knows what portion of a user's data may be at each site, but does not know how to submit grid jobs. Our job submission system, Condor-G, knows how to submit grid jobs, but originally it required users to choose grid sites and gave them no assistance in choosing. This work describes how we enhanced Condor-G to interact with SAM to make good decisions about where jobs should be executed, and thereby improve the performance of grid jobs that access large amounts of data. All these enhancements are general enough to be applicable to grid computing beyond the data-intensive computing done with SAMGrid.
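A hedged sketch of the kind of data-aware placement the abstract implies: score each candidate site by the fraction of the job's input files that SAM reports as already resident there, and prefer the best-scoring site. The Site structure and rank_sites helper are hypothetical; the real system expresses such decisions through Condor-G matchmaking informed by SAM.

#include <algorithm>
#include <set>
#include <string>
#include <vector>

struct Site {
    std::string name;
    std::set<std::string> resident_files;  // files reported as cached at the site
};

// Fraction of the job's needed files already present at the site.
double data_locality(const Site& s, const std::set<std::string>& needed) {
    if (needed.empty()) return 1.0;
    std::size_t hits = 0;
    for (const std::string& f : needed)
        if (s.resident_files.count(f)) ++hits;
    return static_cast<double>(hits) / static_cast<double>(needed.size());
}

// Choose the site holding the largest share of the job's data.
const Site& rank_sites(const std::vector<Site>& sites,
                       const std::set<std::string>& needed) {
    return *std::max_element(sites.begin(), sites.end(),
        [&](const Site& a, const Site& b) {
            return data_locality(a, needed) < data_locality(b, needed);
        });
}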
Citations: 5
Scalable, high-performance NIC-based all-to-all broadcast over Myrinet/GM
Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392610
Weikuan Yu, D. Panda, Darius Buntinas
All-to-all broadcast is one of the common collective operations that involve dense communication between all processes in a parallel program. Previously, programmable network interface cards (NICs) have been leveraged to efficiently support collective operations, including barrier, broadcast, and reduce. This work explores the characteristics of all-to-all broadcast and proposes new algorithms to exploit the potential advantages of NIC programmability. Along with these algorithms, salient strategies have been used to provide scalable topology management, global buffer management, efficient communication processing, and message reliability. The algorithms have been incorporated into a NIC-based collective protocol over Myrinet/GM. The NIC-based all-to-all broadcast operations improve all-to-all broadcast bandwidth over 16 nodes by a factor of 3, compared to the host-based all-to-all broadcast operation. Furthermore, the NIC-based operations have been demonstrated to achieve better scalability to large systems and very low host CPU utilization.
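For contrast with the NIC-based scheme, here is a minimal sketch of a host-based ring all-to-all broadcast (functionally an MPI_Allgather), the style of baseline the measurements above compare against: in p-1 steps, each rank forwards the block it received in the previous step to its right neighbour. This is illustrative only; the paper's algorithms execute on the programmable Myrinet NIC itself.

#include <mpi.h>
#include <cstring>

// `all` must hold p * blksz bytes; after the call, every rank has every
// rank's block, delivered around the ring in p-1 steps.
void ring_alltoall_bcast(const char* myblock, int blksz, char* all,
                         MPI_Comm comm) {
    int rank, p;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &p);
    int right = (rank + 1) % p, left = (rank - 1 + p) % p;

    std::memcpy(all + rank * blksz, myblock, blksz);  // own contribution
    for (int step = 0; step < p - 1; ++step) {
        int send_from = (rank - step + p) % p;      // block received `step` ago
        int recv_into = (rank - step - 1 + p) % p;  // block arriving now
        MPI_Sendrecv(all + send_from * blksz, blksz, MPI_BYTE, right, step,
                     all + recv_into * blksz, blksz, MPI_BYTE, left, step,
                     comm, MPI_STATUS_IGNORE);
    }
}

Every byte here crosses the host's I/O bus and consumes host CPU cycles; moving the forwarding onto the NIC is what buys the bandwidth and CPU-utilization gains reported above.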
Citations: 13
Computation-at-risk: employing the grid for computational risk management
Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392633
S. Kleban, S. Clearwater
This work expands upon our earlier work involving the concept of computation-at-risk (CaR). In particular, CaR refers to the risk that certain computations may not get done in a timely manner. We examine a number of CaR distributions on several large clusters. The important contribution of this work is that it shows that CaR-reducing strategies exist and that, by employing such strategies, a facility can significantly reduce the risk of inefficient resource utilization. Grids are shown to be one means of employing a CaR-reducing strategy. For example, we show that a CaR-reducing strategy applied to a common queue can have a dramatic effect on the wait times for jobs on a grid of clusters. In particular, we define a CaR Sharpe rule that provides a decision rule for determining the best machine in a grid on which to place a new job.
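By analogy with financial value-at-risk, one plausible reading of CaR is a high percentile of the job turnaround-time distribution ("95% of jobs finish within T"). The sketch below shows only that percentile mechanics and is an assumption; the paper's exact definition and its Sharpe-style decision rule are not reproduced here.

#include <algorithm>
#include <vector>

// q-th quantile (0 < q < 1) of observed job turnaround times; the input
// must be non-empty. A lower value at, say, q = 0.95 means less
// computation is "at risk" of finishing late.
double computation_at_risk(std::vector<double> turnaround, double q) {
    std::sort(turnaround.begin(), turnaround.end());
    std::size_t idx = static_cast<std::size_t>(q * (turnaround.size() - 1));
    return turnaround[idx];
}

A scheduling strategy is then CaR-reducing if it lowers this tail value for the workload, which is the sense in which the common grid queue above helps.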
Citations: 7
A comparison of local and gang scheduling on a Beowulf cluster
Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392601
P. Strazdins, Johannes Uhlmann
Gang scheduling and related techniques are widely believed to be necessary for efficient job scheduling on distributed memory parallel computers, because they minimize context switching overheads and permit the parallel job currently running to progress at the fastest possible rate. However, in the case of cluster computers, and particularly those with COTS networks, these benefits can be outweighed in the multiple-job time-sharing context by the loss of the ability to utilize the CPU for other jobs while the current job is waiting for messages. Experiments on a Linux Beowulf cluster with 100 Mb Fast Ethernet switches compare SCore buddy-based gang scheduling with local scheduling (provided by the Linux 2.4 kernel, with MPI implemented over TCP/IP). Results for communication-intensive numerical applications on 16 nodes reveal that gang scheduling results in 'slowdowns' up to a factor of two greater for 8 simultaneous jobs. This phenomenon is not due to any deficiencies in SCore but to the relative costs of context switching versus message overhead, and we expect similar results hold for any gang scheduling implementation. A performance analysis of local scheduling indicates that cache pollution due to context switching is more significant than the direct context switching overhead for the applications studied. When this is taken into account, local scheduling behaviour comes close to achieving ideal slowdowns for finer-grained computations such as Linpack. The performance models also indicate that similar trends are to be expected for clusters with faster networks.
Citations: 26
MPIIMGEN - a code transformer that parallelizes image processing codes to run on a cluster of workstations
Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392596
U. V. Vinod, P. K. Baruah
An enormous body of image and video processing software has been written for conventional (sequential) desktop computers, implementing a wide range of operations such as convolution, histogram equalization, and template matching. These applications usually have a tremendous potential for parallelism; however, a significant barrier to exploiting it is the difficulty of writing parallel software. This work presents the design and implementation of MPIIMGEN, a code transformer that automatically transforms these sequential image processing codes into parallel codes capable of running on a cluster of workstations. The tool uses a pattern-driven approach to parallelize the sequential codes.
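A hedged sketch of the kind of code a pattern-driven transformer such as MPIIMGEN might emit for a pointwise image operation: scatter row blocks of the image, apply the unchanged sequential kernel to each block, and gather the results. The kernel and decomposition are illustrative assumptions, not MPIIMGEN's actual output.

#include <mpi.h>
#include <vector>

// The original sequential per-pixel operation (here: a simple threshold).
static unsigned char kernel(unsigned char px) { return px > 128 ? 255 : 0; }

// Row-block data-parallel version; assumes rows is divisible by the
// number of ranks, for brevity.
void parallel_pointwise(std::vector<unsigned char>& image,
                        int rows, int cols, MPI_Comm comm) {
    int rank, p;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &p);
    int block = (rows / p) * cols;

    std::vector<unsigned char> local(block);
    MPI_Scatter(image.data(), block, MPI_UNSIGNED_CHAR,
                local.data(), block, MPI_UNSIGNED_CHAR, 0, comm);
    for (unsigned char& px : local) px = kernel(px);  // sequential kernel, unchanged
    MPI_Gather(local.data(), block, MPI_UNSIGNED_CHAR,
               image.data(), block, MPI_UNSIGNED_CHAR, 0, comm);
}

Stencil patterns such as convolution would additionally need halo rows exchanged between neighbouring ranks, which is precisely the boilerplate a pattern-driven transformer can generate automatically.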
Citations: 3