
Proceedings. IEEE International Conference on Cluster Computing: Latest Publications

Message from the HeteroPar 2007 chair
Pub Date : 2007-09-17 DOI: 10.1109/CLUSTR.2007.4629275
Olivier Beaumont
This is the sixth edition of the International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Networks (HeteroPar 2007). The response to the call for papers was very good: we received 17 strong submissions from 9 countries (Austria, China, France, Germany, Hungary, Ireland, Japan, Spain, and Tunisia). All submitted papers were distributed to four members of the Program Committee for evaluation. The PC members either reviewed the papers themselves or solicited external reviewers. The reviewing process went quite smoothly, and each paper received at least 3 reviews. The final acceptance decisions were based upon the reviews, resulting in the selection of 9 papers. We congratulate the authors of the accepted papers, and we regret that some potentially interesting papers could not be accepted, mainly due to unsatisfactory quality of presentation of the research results. The accepted papers are organized in 3 sessions: Scheduling on Heterogeneous Platforms; Applications; and New Trends in Heterogeneous Computing. Altogether, the papers cover a broad spectrum of topics, presenting new ideas, dedicated algorithms, and tools for the efficient use of heterogeneous networks of computers.
Citations: 0
Measurement-based power profiling of data center equipment
Pub Date : 2007-09-17 DOI: 10.1109/CLUSTR.2007.4629270
T. Mukherjee, G. Varsamopoulos, S. Gupta, S. Rungta
Power-aware and thermal-aware techniques such as power throttling and workload manipulation have been developed to counter the increasing power density in current data centers. The basis for any such power-aware and/or thermal-aware technique, however, depends heavily on the equipment power-consumption model assumed. The goal of this paper is to perform power profiling of different systems (namely, the Dell PowerEdge 1855 and 1955) based on actual power measurements. The Gamut (Generic Application eMUlaTor) benchmark, double-precision matrix multiplication, and convolution of two vectors are used to vary CPU utilization and disk I/O.
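
To make the measurement setup concrete, here is a minimal sketch of a workload driver in the spirit of the abstract: it steps CPU utilization through duty-cycled double-precision matrix multiplication while an external power meter logs consumption. The duty-cycle scheme, matrix size, and phase lengths are illustrative assumptions, not the authors' actual Gamut configuration.

```python
# Sketch of a power-profiling workload driver: vary CPU utilization with a
# duty-cycled double-precision matrix multiply; an external meter records
# power, to be correlated with the printed timestamps afterwards.
import time
import numpy as np

def cpu_load_phase(target_util: float, duration_s: float, n: int = 512) -> None:
    """Alternate busy matmul work and sleep to approximate target_util."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    period = 0.1  # seconds per duty cycle (assumed)
    end = time.time() + duration_s
    while time.time() < end:
        busy_until = time.time() + target_util * period
        while time.time() < busy_until:
            a @ b  # double-precision matrix multiply keeps the FPU busy
        time.sleep((1.0 - target_util) * period)

if __name__ == "__main__":
    # Step utilization from idle to full load.
    for util in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"{time.time():.1f} phase start: target CPU utilization {util:.0%}")
        cpu_load_phase(util, duration_s=30.0)
```
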
Citations: 27
Some work in progress at IBM's Austin Research Lab
Pub Date : 2007-09-17 DOI: 10.1109/CLUSTR.2007.4629272
T. Keller
Summary form only given. Power-related research activities at IBM's Austin Research Lab (ARL) include low-power circuitry, power-efficient microprocessor designs, and server-system power measurement and management at many different levels. Researchers have the opportunity to see their contributions at all levels of product design, since IBM produces its own microprocessors for its System p, i, and z platforms, builds System x platforms with other vendors' microprocessors, and markets Tivoli systems management middleware.
Citations: 0
On developing a fast, cost-effective and non-invasive method to derive data center thermal maps
Pub Date : 2007-09-17 DOI: 10.1109/CLUSTR.2007.4629269
Michael Jonas, G. Varsamopoulos, S. Gupta
Ongoing research has demonstrated the potential benefits of thermal-aware load placement in data centers for reducing both cooling costs and component failure rates. However, thermal-aware load placement techniques have not been widely deployed in existing data centers, mainly because they rely on a thermal map or profile of the data center, whose derivation interrupts data center operation. We propose a non-invasive way of producing a thermal map: training a neural network on data observed during actual data center operation. Our results show that gathering the data and selecting a training set is a fast process, and that the neural network with no hidden layers achieves the lowest mean squared error.
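
A neural network with no hidden layers and a linear output is just an affine map, so it can be fit in closed form with least squares. The sketch below illustrates that case under assumed inputs and targets (per-server utilization in, per-sensor temperatures out); the paper's actual features and measurements are not reproduced here.

```python
# "No hidden layers" thermal-map model: an affine map fit by least squares.
# The training data below is synthetic and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observations from normal operation:
# X: (samples, servers) utilization, T: (samples, sensors) temperatures.
X = rng.uniform(0.0, 1.0, size=(200, 8))
T = 22.0 + X @ rng.uniform(0.5, 3.0, size=(8, 4)) + rng.normal(0, 0.2, (200, 4))

# Append a bias column and solve min ||Xb W - T||^2 for the weights W.
Xb = np.hstack([X, np.ones((X.shape[0], 1))])
W, *_ = np.linalg.lstsq(Xb, T, rcond=None)

# Predict a thermal map for a candidate load placement.
x_new = np.array([[0.9, 0.1, 0.5, 0.5, 0.0, 1.0, 0.3, 0.7]])
t_pred = np.hstack([x_new, [[1.0]]]) @ W
print("predicted sensor temperatures:", np.round(t_pred, 2))

# Mean squared error, the metric the abstract quotes.
mse = np.mean((Xb @ W - T) ** 2)
print("training MSE:", round(float(mse), 4))
```
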
Citations: 9
Motivating co-ordination of power management solutions in data centers
Pub Date : 2007-09-17 DOI: 10.1109/CLUSTR.2007.4629268
R. Raghavendra, Parthasarathy Ranganathan, V. Talwar, Xiaoyun Zhu, Zhikui Wang
Power and cooling are emerging as key challenges in data center environments. A recent IDC report estimated worldwide spending on enterprise power and cooling at more than $30 billion, likely to surpass even spending on new server hardware. Server rated power consumption has increased by nearly 10X over the past ten years, driving increased spending on cooling and power-delivery equipment. A 30,000-square-foot, 10 MW data center can need up to five million dollars of cooling infrastructure; similarly, power delivery beyond 60 Amps per rack can pose fundamental issues. The increased power also has implications for electricity costs, with many data centers reporting millions of dollars in annual usage. From an environmental point of view, the Department of Energy's 2007 estimate of 59 billion kWh consumed by U.S. servers and data centers translates to several million tons of coal consumption and greenhouse gas emissions per year. The U.S. Congress recently passed Public Law 109-431, directing the Environmental Protection Agency (EPA) to study enterprise energy use, and several industry consortiums such as the GreenGrid have been formed to address these issues. In addition, power and cooling can also impact compaction and reliability.
Citations: 3
Experience Of OS Optimization For Linpack On Dawning4000A
Pub Date : 2005-01-01 DOI: 10.1109/CLUSTR.2005.347095
Ying-chao Zhou, Dan Meng, Xiao-cheng Zhou, Yao Chen
{"title":"Experience Of OS Optimization For Linpack On Dawning4000A","authors":"Ying-chao Zhou, Dan Meng, Xiao-cheng Zhou, Yao Chen","doi":"10.1109/CLUSTR.2005.347095","DOIUrl":"https://doi.org/10.1109/CLUSTR.2005.347095","url":null,"abstract":"","PeriodicalId":92128,"journal":{"name":"Proceedings. IEEE International Conference on Cluster Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2005-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76375405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Disaster tolerant Wolfpack geo-clusters
Pub Date : 2002-09-23 DOI: 10.1109/CLUSTR.2002.1137750
Richard S. Wilkins, Xing Du, Robert A. Cochran, M. Popp
Clustering of computer systems to increase application availability has become a common industry practice. While it does increase the availability of applications and their data to users, it does not solve the problem of a disaster (flood, tornado, earthquake, terrorism, civil unrest, etc.) making the entire cluster, and the applications and data it is serving, unavailable. Distance mirroring of an application's data store allows for recovery from disaster but may still result in long periods of unacceptable downtime. This paper describes a method for stretching a standard Wolfpack (Microsoft Cluster Service, MSCS) cluster of Intel-architecture servers geographically for disaster tolerance. Server nodes and their storage may be placed at two (or more) distant sites to prevent a single disaster from taking down the entire cluster. Standard cluster semantics and ease of use are maintained using the remote mirroring capabilities of Hewlett-Packard's high-end storage arrays. We describe the design of additional software to control data-mirroring behavior when moving or failing over applications between server nodes, software that "stretches" the cluster quorum disk between sites in a manner transparent to the cluster software, and software for an external arbitrator node that provides rapid recovery from a total loss of inter-site communications. The flexibility provided by the array's firmware mirroring options (i.e., synchronous or asynchronous I/O mirroring) allows optimal use of inter-site link bandwidth based on the data-safety requirements of individual applications.
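
The synchronous versus asynchronous mirroring options mentioned at the end of the abstract trade write latency against exposure to data loss. The toy model below illustrates only the write-acknowledgement semantics, under an assumed inter-site link delay; it is a sketch, not the HP array firmware or MSCS behaviour.

```python
# Toy model of the synchronous-vs-asynchronous mirroring trade-off.
import queue
import threading
import time

REMOTE_LINK_DELAY_S = 0.05  # assumed inter-site round trip

class MirroredVolume:
    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.local = {}
        self.remote = {}
        self._q = queue.Queue()
        threading.Thread(target=self._replicator, daemon=True).start()

    def _remote_write(self, key, value):
        time.sleep(REMOTE_LINK_DELAY_S)  # WAN latency to the far site
        self.remote[key] = value

    def _replicator(self):
        while True:  # background drain of queued async writes
            key, value = self._q.get()
            self._remote_write(key, value)

    def write(self, key, value):
        self.local[key] = value
        if self.synchronous:
            # Ack only after the remote copy is safe: no loss on disaster,
            # but every write pays the inter-site round trip.
            self._remote_write(key, value)
        else:
            # Ack immediately, replicate in the background: fast writes,
            # but a disaster can lose whatever is still queued.
            self._q.put((key, value))

for sync in (True, False):
    vol = MirroredVolume(synchronous=sync)
    t0 = time.time()
    for i in range(20):
        vol.write(i, b"block")
    label = "synchronous" if sync else "asynchronous"
    print(f"{label}: 20 writes acked in {time.time() - t0:.2f}s")
```
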
Citations: 2
An architecture for integrated resource management of MPI jobs
Pub Date : 2002-09-23 DOI: 10.1109/CLUSTR.2002.1137769
S. Sistare, Jack A. Test, D. Plauger
We present a new architecture for the integration of distributed resource management systems and parallel run-time environments such as MPI. The architecture solves the long-standing problem of achieving a tight integration between the two in a clean and robust manner that fully enables the functionality of both systems, including resource limit enforcement and accounting. We also present a more uniform command interface to the user, which simplifies the task of running parallel jobs and tools under a resource manager. The architecture is extensible and allows new systems to be incorporated. We describe the properties that a resource management system must have to work in this architecture, and find that these are ubiquitous in the resource management world. Using the Sun Cluster Runtime Environment, we show the generality of the approach by implementing tight integrations with PBS, LSF and Sun Grid Engine software, and we demonstrate the advantages of a tight integration. No modifications or enhancements to these resource management systems were required, which is in marked contrast to ad-hoc approaches which typically require such changes.
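
The following sketch illustrates the general shape of such an integration layer: a uniform launch interface with pluggable resource-manager backends. The class and method names are hypothetical, not the paper's API or the Sun Cluster Runtime Environment's; the only backend detail taken from a real system is that PBS exposes its node allocation via the PBS_NODEFILE environment variable.

```python
# Hypothetical sketch of a uniform resource-manager integration layer.
import os
from abc import ABC, abstractmethod

class ResourceManager(ABC):
    """What the architecture needs from any backend: the allocated node list,
    and a way to start tasks under the RM's control, so that resource-limit
    enforcement and accounting apply to every spawned MPI process."""

    @abstractmethod
    def allocated_nodes(self) -> list[str]: ...

    @abstractmethod
    def spawn_task(self, node: str, argv: list[str]) -> None: ...

class PBSBackend(ResourceManager):
    def allocated_nodes(self) -> list[str]:
        # PBS publishes the allocation via the PBS_NODEFILE env variable.
        with open(os.environ["PBS_NODEFILE"]) as f:
            return [line.strip() for line in f if line.strip()]

    def spawn_task(self, node: str, argv: list[str]) -> None:
        # A real backend would launch through the RM's own task-spawn
        # facility rather than plain rsh/ssh; that is what keeps limit
        # enforcement and accounting intact.
        raise NotImplementedError("invoke the RM's task launcher here")

def mpirun(rm: ResourceManager, program: str, nprocs: int) -> None:
    """Uniform user-facing launcher, independent of which RM is underneath."""
    nodes = rm.allocated_nodes()
    for rank in range(nprocs):
        rm.spawn_task(nodes[rank % len(nodes)], [program, f"--rank={rank}"])
```
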
Citations: 2
User-level remote data access in overlay metacomputers
Pub Date : 2002-09-23 DOI: 10.1109/CLUSTR.2002.1137787
Jeff Siegel, P. Lu
A practical problem faced by users of metacomputers and computational grids is: If my computation can move from one system to another, how can I ensure that my data will still be available to my computation? Depending on the level of software, technical, and administrative support available, a data grid or a distributed file system would be reasonable solutions. However, it is not always possible (or practical) to have a diverse group of systems administrators agree to adopt a common infrastructure to support remote data access. Yet, having transparent access to any remote data is an important, practical capability. We have developed the Trellis File System (Trellis FS) to allow programs to access data files on any file system and on any host on a network that can be named by a Secure Copy Locator (SCL) or a Uniform Resource Locator (URL). Without requiring any new protocols or infrastructure, Trellis can be used on practically any POSIX-based system on the Internet. Read access, write access, sparse access, local caching of data, prefetching, and authentication are supported. Trellis is implemented as a user-level C library, which mimics the standard stream I/O functions, and is highly portable. Trellis is not a replacement for traditional file systems or data grids; it provides new capabilities by overlaying on top of other file systems, including grid-based file systems. And, by building upon an already-existing infrastructure (i.e., Secure Shell and Secure Copy), Trellis can be used in situations where a suitable data grid or distributed file system does not yet exist.
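
Trellis itself is a user-level C library mimicking the standard stream I/O functions, but the core idea (one open call that accepts a local path, a URL, or an scp-style locator and caches remote data locally) can be sketched in a few lines of Python. The function name, dispatch rules, and cache policy below are illustrative assumptions, not the Trellis API.

```python
# Sketch of Trellis-style transparent file access: one open call that
# dispatches on the name and caches remote copies locally.
import hashlib
import os
import subprocess
import urllib.request

CACHE_DIR = os.path.expanduser("~/.trellis_cache")  # assumed cache location

def trellis_open(name: str, mode: str = "rb"):
    """Open `name` wherever it lives, caching remote copies locally."""
    # Heuristic: no scheme and no host:path colon means a plain local file.
    if "://" not in name and ":" not in name.split("/", 1)[0]:
        return open(name, mode)

    os.makedirs(CACHE_DIR, exist_ok=True)
    cached = os.path.join(CACHE_DIR, hashlib.sha1(name.encode()).hexdigest())
    if not os.path.exists(cached):
        if name.startswith(("http://", "https://")):
            urllib.request.urlretrieve(name, cached)          # URL fetch
        else:
            subprocess.run(["scp", name, cached], check=True)  # SCL via scp
    return open(cached, mode)

# Usage: the caller never cares where the data actually lives.
# with trellis_open("https://example.org/data/input.dat") as f:
#     header = f.read(16)
```
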
Citations: 15
A distributed data implementation of parallel full CI program
Pub Date : 2002-09-23 DOI: 10.1109/CLUSTR.2002.1137786
Zhengting Gan, Y. Alexeev, R. Kendall, M. Gordon
A distributed-data parallel full CI program is described. The implementation of the FCI algorithm is organized in a combined CI-driven approach. With extra computation we were able to avoid redundant communication and convert collective communication into more efficient point-to-point communication. Network performance is further optimized by an improved DDI library. Examples show very good speedup on 16-node PC clusters. The application of the code is also demonstrated.
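
The transformation the abstract alludes to, replacing a collective with point-to-point messages plus some cheap local recomputation, can be illustrated generically with mpi4py. The neighbour-exchange pattern below is an assumption chosen for brevity, not the paper's actual FCI communication structure.

```python
# Generic illustration: replace a collective with point-to-point exchange
# when each rank only needs data from a few peers.
# Requires mpi4py; run e.g. `mpiexec -n 4 python p2p.py`.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

local = np.full(4, rank, dtype="d")  # this rank's block of distributed data

# Naive version: every rank gathers every block (all-to-all traffic).
# everything = comm.allgather(local)

# Point-to-point version: suppose the algorithm only ever combines a block
# with its left neighbour's block. Exchange with neighbours directly and
# skip all the blocks this rank never uses.
left, right = (rank - 1) % size, (rank + 1) % size
neighbour = np.empty_like(local)
comm.Sendrecv(local, dest=right, recvbuf=neighbour, source=left)

result = local + neighbour  # combine, instead of scanning a full allgather
print(f"rank {rank}: combined with block from rank {left}: {result}")
```
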
Citations: 0