
2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing: Latest Papers

Efficient Runtime Environment for Coupled Multi-physics Simulations: Dynamic Resource Allocation and Load-Balancing
S. Ko, Nayong Kim, Joohyun Kim, A. Thota, S. Jha
Coupled Multi-Physics simulations, such as hybrid CFD-MD simulations, represent an increasingly important class of scientific applications. Often the physical problems of interest demand the use of high-end computers, such as TeraGrid resources, which are often accessible only via batch queues. Batch-queue systems are not designed to natively support the coordinated scheduling of jobs, which in turn is needed for the concurrent execution required by coupled multi-physics simulations. In this paper we develop and demonstrate a novel approach to overcome the lack of native support for the coordinated job submission required by coupled runs. We establish the performance advantages arising from our solution, which is a generalization of the Pilot-Job concept; the concept in and of itself is not new, but it is being applied to coupled simulations for the first time. Our solution not only overcomes the initial co-scheduling problem, but also provides a dynamic resource allocation mechanism. Support for such dynamic resources is critical for a load balancing mechanism, which we develop and demonstrate to be effective at reducing the total time-to-solution of the problem. We establish that the performance advantage of using Big Jobs is invariant with the size of the machine as well as the size of the physical model under investigation. The Pilot-Job abstraction is developed using SAGA, which provides an infrastructure-agnostic implementation, and which can seamlessly execute and utilize distributed resources.
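The load-balancing idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rebalance`, the ideal-scaling model (step time = work / cores), and the proportional-share rule are all assumptions made for illustration. The sketch shows how cores held by a single pilot allocation might be redistributed between two coupled tasks so that both finish a coupling step at about the same time.

```python
def rebalance(total_cores, runtimes, cores):
    """Redistribute a pilot's cores among coupled tasks (hypothetical helper).

    Assumes ideal scaling, i.e. runtime = work / cores, so the work of each
    task is estimated as its measured runtime times its current core count.
    """
    work = [t * c for t, c in zip(runtimes, cores)]
    total_work = sum(work)
    # Share cores in proportion to estimated work, with at least one core each.
    new = [max(1, round(total_cores * w / total_work)) for w in work]
    # Absorb any rounding drift in the task with the most work.
    new[work.index(max(work))] += total_cores - sum(new)
    return new


def step_time(work, cores):
    """A coupled step ends only when the slowest task has finished."""
    return max(w / c for w, c in zip(work, cores))
```

With 64 cores split evenly and measured step times of 10 s and 30 s, `rebalance(64, [10.0, 30.0], [32, 32])` returns `[16, 48]`; under the ideal-scaling assumption the coupled step then takes 20 s instead of 30 s.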
DOI: https://doi.org/10.1109/CCGRID.2010.107 (published 2010-05-17)
Citations: 33
The Lightweight Approach to Use Grid Services with Grid Widgets on Grid WebOS
Yi-Lun Pan, Chang-Hsing Wu, Chia-Yen Liu, Hsi-En Yu, Weicheng Huang
To bridge the gap between computing grid environment and users, various Grid Widgets are developed by the Grid development team in the National Center for High-performance Computing (NCHC). These widgets are implemented to provide users with seamless and scalable access to Grid resources. Currently, this effort integrates the de facto Grid middleware, Web-based Operating System (WebOS), and automatic resource allocation mechanism to form a virtual computer in distributed computing environment. With the capability of automatic resource allocation and the feature of dynamic load prediction, the Resource Broker (RB) improves the performance of the dynamic scheduling over conventional scheduling policies. With this extremely lightweight and flexible approach to acquire Grid services, the barrier for users to access geographically distributed heterogeneous Grid resources is largely reduced. The Grid Widgets can also be customized and configured to meet the demands of the users.
DOI: https://doi.org/10.1109/CCGRID.2010.25 (published 2010-05-17)
Citations: 0
Experiments with Memory-to-Memory Coupling for End-to-End Fusion Simulation Workflows
C. Docan, Fan Zhang, M. Parashar, J. Cummings, N. Podhorszki, S. Klasky
Scientific applications are striving to accurately simulate multiple interacting physical processes that comprise complex phenomena being modeled. Efficient and scalable parallel implementations of these coupled simulations present challenging interaction and coordination requirements, especially when the coupled physical processes are computationally heterogeneous and progress at different speeds. In this paper, we present the design, implementation and evaluation of a memory-to-memory coupling framework for coupled scientific simulations on high-performance parallel computing platforms. The framework is driven by the coupling requirements of the Center for Plasma Edge Simulation, and it provides simple coupling abstractions as well as efficient asynchronous (RDMA-based) memory-to-memory data transport mechanisms that complement existing parallel programming systems and data sharing frameworks. The framework enables flexible coupling behaviors that are asynchronous in time and space, and it supports dynamic coupling between heterogeneous simulation processes without enforcing any synchronization constraints. We evaluate the performance and scalability of the coupling framework using a specific coupling scenario, on the Jaguar Cray XT5 system at Oak Ridge National Laboratory.
DOI: https://doi.org/10.1109/CCGRID.2010.101 (published 2010-05-17)
Citations: 13
Elastic Site: Using Clouds to Elastically Extend Site Resources
Paul Marshall, K. Keahey, Timothy Freeman
Infrastructure-as-a-Service (IaaS) cloud computing offers new possibilities to scientific communities. One of the most significant is the ability to elastically provision and relinquish new resources in response to changes in demand. In our work, we develop a model of an “elastic site” that efficiently adapts services provided within a site, such as batch schedulers, storage archives, or Web services to take advantage of elastically provisioned resources. We describe the system architecture along with the issues involved with elastic provisioning, such as security, privacy, and various logistical considerations. To avoid over- or under-provisioning the resources we propose three different policies to efficiently schedule resource deployment based on demand. We have implemented a resource manager, built on the Nimbus toolkit to dynamically and securely extend existing physical clusters into the cloud. Our elastic site manager interfaces directly with local resource managers, such as Torque. We have developed and evaluated policies for resource provisioning on a Nimbus-based cloud at the University of Chicago, another at Indiana University, and Amazon EC2. We demonstrate a dynamic and responsive elastic cluster, capable of responding effectively to a variety of job submission patterns. We also demonstrate that we can process 10 times faster by expanding our cluster up to 150 EC2 nodes.
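The abstract mentions three provisioning policies without describing them, so the following is only a hypothetical demand-driven policy sketched to make the idea concrete. The function name `nodes_to_change`, its parameters, and the decision rule are assumptions; the default cap of 150 simply echoes the cluster size quoted above. The sketch launches cloud nodes when queued work exceeds what idle nodes can absorb and releases idle nodes when the queue is empty.

```python
def nodes_to_change(queued_cores, idle_cloud_nodes, total_cloud_nodes,
                    cores_per_node=1, max_cloud_nodes=150):
    """Hypothetical on-demand policy.

    Returns a positive number of cloud nodes to launch, a negative number
    of idle cloud nodes to terminate, or 0 to leave the cluster unchanged.
    """
    if queued_cores > 0:
        needed = -(-queued_cores // cores_per_node)  # ceiling division
        needed -= idle_cloud_nodes                   # idle nodes absorb work first
        headroom = max_cloud_nodes - total_cloud_nodes
        return max(0, min(needed, headroom))
    # No queued demand: release every idle cloud node.
    return -idle_cloud_nodes
```

A real elastic site manager would also have to account for boot latency, billing granularity, and the security concerns the abstract lists; none of that is modelled here.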
DOI: https://doi.org/10.1109/CCGRID.2010.80 (published 2010-05-17)
Citations: 283
Performance Analysis of Diffusion Tensor Imaging in an Academic Production Grid
D. Krefting, R. Lützkendorf, Kathrin Peter, J. Bernarding
Analysis of diffusion-weighted magnetic resonance images serves increasingly for non-invasive tracking of nerve fibers in the human brain, both in clinical diagnosis and basic research. Diffusion-tensor imaging (DTI) enables in-vivo research on the internal structure of the central nervous system, an estimation of the interconnection of functional areas, and diagnosis of brain tumors and de-myelinating diseases. But modeling the local diffusion parameters is computationally expensive, and on standard desktop computers runtimes of up to days are common. A workflow-based grid implementation of the algorithm with slice-based parallelization has shown significant speedup. However, in production use, the implementation was frequently delayed and even failed, discouraging the medical collaborators from taking up the management of the data processing themselves. Therefore a comprehensive analysis of possible sources of errors and delays, as well as their real impact in the respective infrastructure, is vital to enable clinical researchers to fully exploit the benefits of the Healthgrid application. In this manuscript, we tested different implementations of the DTI analysis with respect to robustness and runtime. Based on the results, concrete application improvements as well as general suggestions for the layout and maintenance of Healthgrids are derived.
DOI: https://doi.org/10.1109/CCGRID.2010.21 (published 2010-05-17)
Citations: 7
An Effective Architecture for Automated Appliance Management System Applying Ontology-Based Cloud Discovery
A. V. Dastjerdi, Sayed Gholam Hassan Tabatabaei, R. Buyya
Cloud computing is a computing paradigm which allows on-demand access to computing elements and storage over the Internet. Virtual Appliances, which are pre-configured, ready-to-run applications, are emerging as a breakthrough technology to solve the complexities of service deployment on Cloud infrastructure. However, an automated approach to deploying required appliances on the most suitable Cloud infrastructure has been neglected by previous work and is the focus of this paper. We propose an effective architecture using ontology-based discovery to provide QoS-aware deployment of appliances on Cloud service providers. In addition, we test our approach on a case study, and the results show the efficiency and effectiveness of the proposed work.
DOI: https://doi.org/10.1109/CCGRID.2010.87 (published 2010-05-17)
Citations: 112
Selective Recovery from Failures in a Task Parallel Programming Model
James Dinan, Arjun Singri, P. Sadayappan, S. Krishnamoorthy
We present a fault tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tracking mechanism. Compared with conventional checkpoint/restart techniques, this system offers a recovery penalty that is proportional to the degree of failure rather than the system size. We evaluate this system using the Self Consistent Field (SCF) kernel which forms an important component in ab initio methods for computational chemistry. Experimental results indicate that fault tolerant task pools are robust in the presence of an arbitrary number of failures and that they offer low overhead in the absence of faults.
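The selective-restart idea can be sketched in a few lines. This is a simplified, centralized illustration and not the paper's distributed tracking mechanism: the class `TaskPool`, its methods, and the notion of a "rank" holding each result are assumptions introduced here. The point it shows is that recovery touches only the tasks whose results lived on the failed process, so the penalty grows with the amount lost rather than with the system size.

```python
from collections import deque


class TaskPool:
    """Hypothetical task pool that records which rank completed each task."""

    def __init__(self, task_ids):
        self.pending = deque(task_ids)
        self.completed_by = {}  # task id -> rank holding that task's result

    def run_one(self, rank):
        """Let `rank` take and complete the next pending task, if any."""
        if not self.pending:
            return None
        task = self.pending.popleft()
        self.completed_by[task] = rank
        return task

    def recover(self, failed_rank):
        """Re-enqueue only the tasks whose results were on the failed rank."""
        lost = sorted(t for t, r in self.completed_by.items() if r == failed_rank)
        for task in lost:
            del self.completed_by[task]
            self.pending.append(task)
        return lost
```

If six tasks are completed round-robin by three ranks and rank 1 then fails, `recover(1)` re-enqueues only tasks 1 and 4; the other four results are kept, whereas a global checkpoint/restart would roll every rank back.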
DOI: https://doi.org/10.1109/CCGRID.2010.34 (published 2010-05-17)
Citations: 10
Dynamic Load-Balanced Multicast for Data-Intensive Applications on Clouds
Tatsuhiro Chiba, M. Burger, T. Kielmann, S. Matsuoka
Data-intensive parallel applications on clouds need to deploy large data sets from the cloud's storage facility to all compute nodes as fast as possible. Many multicast algorithms have been proposed for clusters and grid environments. The most common approach is to construct one or more spanning trees based on the network topology and network monitoring data in order to maximize available bandwidth and avoid bottleneck links. However, delivering optimal performance becomes difficult once the available bandwidth changes dynamically. In this paper, we focus on Amazon EC2/S3 (the most commonly used cloud platform today) and propose two high performance multicast algorithms. These algorithms make it possible to efficiently transfer large amounts of data stored in Amazon S3 to multiple Amazon EC2 nodes. The three salient features of our algorithms are (1) to construct an overlay network on clouds without network topology information, (2) to optimize the total throughput dynamically, and (3) to increase the download throughput by letting nodes cooperate with each other. The two algorithms differ in the way nodes cooperate: the first 'non-steal' algorithm lets each node download an equal share of all data, while the second 'steal' algorithm uses work stealing to counter the effect of heterogeneous download bandwidth. As a result, all nodes can download files from S3 quickly, even when the network performance changes while the algorithm is running. We evaluate our algorithms on EC2/S3, and show that they are scalable and consistently achieve high throughput. Both algorithms perform much better than having each node downloading all data directly from S3.
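The difference between the 'non-steal' and 'steal' strategies can be illustrated with a small event-driven simulation. This is a toy model under stated assumptions, not the authors' algorithm: the function `download_time`, the fixed per-node bandwidths (in chunks per time unit), and the rule of stealing one chunk from the node with the longest queue are all hypothetical. It models only how the download from storage is divided among nodes and ignores the node-to-node exchange that completes the multicast.

```python
import heapq


def download_time(bandwidths, chunks, steal):
    """Time until every chunk has been fetched by some node (toy model)."""
    n = len(bandwidths)
    # Equal initial shares; the first chunks % n nodes get one extra chunk.
    queues = [chunks // n + (1 if i < chunks % n else 0) for i in range(n)]
    events = [(0.0, i) for i in range(n)]  # (time the node becomes free, node)
    heapq.heapify(events)
    finish = 0.0
    while events:
        now, i = heapq.heappop(events)
        finish = max(finish, now)
        if queues[i] == 0 and steal:
            # Out of work: take one not-yet-started chunk from the fullest queue.
            victim = max(range(n), key=lambda j: queues[j])
            if queues[victim] > 0:
                queues[victim] -= 1
                queues[i] += 1
        if queues[i] > 0:
            queues[i] -= 1  # start the next chunk; it can no longer be stolen
            heapq.heappush(events, (now + 1.0 / bandwidths[i], i))
    return finish
```

For two nodes with bandwidths 1 and 4 chunks per time unit and 8 chunks in total, the equal-share variant takes 4.0 time units because the slow node is the bottleneck, while the stealing variant takes 2.0 in this model.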
DOI: https://doi.org/10.1109/CCGRID.2010.63 (published 2010-05-17)
Citations: 31
Cluster Computing as an Assembly Process: Coordination with S-Net
C. Grelck, Jukka Julku, F. Penczek, A. Shafarenko
This poster will present a coordination language for distributed computing and will discuss its application to cluster computing. It will introduce a programming technique of cluster computing whereby application components are completely dissociated from the communication/coordination infrastructure (unlike MPI-style message passing), and there is no shared memory either, whether virtual or physical (unlike OpenMP). Cluster computing is thus presented as something that happens as late as the assembly stage: components are integrated into an application using a new form of network glue, namely Single-Input, Single-Output (SISO) asynchronous, non-deterministic coordination.
这张海报将介绍分布式计算的协调语言,并讨论其在集群计算中的应用。它将引入一种集群计算的编程技术,其中应用程序组件与通信/协调基础设施完全分离(与mpi风格的消息传递不同),并且也没有共享内存,无论是虚拟的还是物理的(与Open-MP不同)。因此,集群计算在组装阶段才会出现:组件使用一种新的网络粘合形式集成到应用程序中:单输入、单输出(SISO)异步、无确定性协调。
{"title":"Cluster Computing as an Assembly Process: Coordination with S-Net","authors":"C. Grelck, Jukka Julku, F. Penczek, A. Shafarenko","doi":"10.1109/CCGRID.2010.103","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.103","url":null,"abstract":"This poster will present a coordination language for distributed computing and will discuss its application to cluster computing. It will introduce a programming technique of cluster computing whereby application components are completely dissociated from the communication/coordination infrastructure (unlike MPI-style message passing), and there is no shared memory either, whether virtual or physical (unlike Open-MP). Cluster computing is thus presented as something that happens as late as the assembly stage: components are integrated into an application using a new form of network glue: Single-Input, Single-Output (SISO) asynchronous, no deterministic coordination.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121800839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-FFT Vectorization for the Cell Multicore Processor
J. Barhen, T. Humble, P. Mitra, M. Traweek
The emergence of streaming multicore processors with multi-SIMD architectures and ultra-low power operation, combined with real-time compute and I/O reconfigurability, opens unprecedented opportunities for executing sophisticated signal processing algorithms faster and within a much lower energy budget. Here, we present an unconventional FFT implementation scheme for the IBM Cell, named transverse vectorization. It is shown to outperform, in terms of both timing and GFLOP throughput, the fastest FFT results reported to date in the open literature.
DOI: https://doi.org/10.1109/CCGRID.2010.78 · Published: 2010-05-17
Citations: 3