2008 Workshop on Many-Task Computing on Grids and Supercomputers: Latest Publications

Embarrassingly parallel jobs are not embarrassingly easy to schedule on the grid
Pub Date : 2008-11-01 DOI: 10.1109/MTAGS.2008.4777910
E. Afgan, P. Bangalore
Embarrassingly parallel applications represent an important workload in today's grid environments. Scheduling and execution of this class of applications is considered a mostly trivial and well-understood process on homogeneous clusters. However, while grid environments provide the necessary computational resources, the associated resource heterogeneity presents a new challenge for efficient task execution across multiple resources. This paper presents a set of examples illustrating how the execution characteristics of individual tasks, and consequently of a job, are affected by the choice of task execution resources, task invocation parameters, and task input data attributes. The aim of this work is to highlight this relationship between an application and an execution resource in order to promote the development of better metascheduling techniques for the grid. By exploiting this relationship, application throughput can be maximized, which also results in higher resource utilization. To achieve these benefits, a set of job scheduling and execution concerns is derived, leading toward a computational pipeline for scheduling embarrassingly parallel applications in grid environments.
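The core observation, that the mapping of even fully independent tasks onto heterogeneous resources changes a job's overall runtime, can be illustrated with a toy scheduler. The greedy policy, task costs, and resource speeds below are hypothetical illustrations, not the scheduling technique proposed in the paper.

```python
# Toy illustration (not the paper's algorithm): the same independent tasks
# finish at very different times depending on how they are mapped onto
# heterogeneous resources. Task costs and resource speeds are made up.

def greedy_makespan(task_costs, resource_speeds):
    """Assign each task to the resource that would finish it earliest;
    return the makespan (time at which the last resource goes idle)."""
    finish = [0.0] * len(resource_speeds)
    for cost in sorted(task_costs, reverse=True):  # place longest tasks first
        candidates = [finish[i] + cost / s for i, s in enumerate(resource_speeds)]
        best = candidates.index(min(candidates))
        finish[best] = candidates[best]
    return max(finish)

def round_robin_makespan(task_costs, resource_speeds):
    """Round-robin assignment that ignores resource heterogeneity."""
    finish = [0.0] * len(resource_speeds)
    for j, cost in enumerate(task_costs):
        i = j % len(resource_speeds)
        finish[i] += cost / resource_speeds[i]
    return max(finish)

tasks = [8, 8, 4, 4, 2, 2, 1, 1]   # abstract work units per task
speeds = [4.0, 1.0]                # one fast node, one slow node

print(greedy_makespan(tasks, speeds), round_robin_makespan(tasks, speeds))
# greedy: 6.0 vs. heterogeneity-blind round robin: 15.0
```

The heterogeneity-aware assignment finishes 2.5x sooner on the same resources, which is the kind of gap the paper argues grid metaschedulers should exploit.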
Citations: 17
Design and evaluation of a collective IO model for loosely coupled petascale programming
Pub Date : 2008-11-01 DOI: 10.1109/MTAGS.2008.4777908
Zhao Zhang, Allan Espinosa, K. Iskra, I. Raicu, Ian T Foster, M. Wilde
Loosely coupled programming is a powerful paradigm for rapidly creating higher-level applications from scientific programs on petascale systems, typically using scripting languages. This paradigm is a form of many-task computing (MTC) which focuses on the passing of data between programs as ordinary files rather than messages. While it has the significant benefits of decoupling producer and consumer and allowing existing application programs to be executed in parallel with no recoding, its typical implementation using shared file systems places a high performance burden on the overall system and on the user who will analyze and consume the downstream data. Previous efforts have achieved great speedups with loosely coupled programs, but have done so with careful manual tuning of all shared file system access. In this work, we evaluate a prototype collective IO model for file-based MTC. The model enables efficient and easy distribution of input data files to computing nodes and gathering of output results from them. It eliminates the need for such manual tuning and makes the programming of large-scale clusters using a loosely coupled model easier. Our approach, inspired by in-memory approaches to collective operations for parallel programming, builds on fast local file systems to provide high-speed local file caches for parallel scripts, uses a broadcast approach to handle distribution of common input data, and uses efficient scatter/gather and caching techniques for input and output. We describe the design of the prototype model, its implementation on the Blue Gene/P supercomputer, and present preliminary measurements of its performance on synthetic benchmarks and on a large-scale molecular dynamics application.
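The broadcast-plus-scatter/gather pattern the authors describe can be sketched in plain Python, with dicts standing in for node-local file caches. All function names and data here are invented for illustration; they are not the paper's API.

```python
# Sketch of the collective-IO idea: broadcast common input once into every
# node-local cache, scatter per-task inputs, compute locally, then gather.
# The "nodes" are plain dicts standing in for fast local file systems.

def broadcast(nodes, data):
    """Send one shared input to every node-local cache exactly once."""
    for cache in nodes:
        cache["common"] = data

def scatter(nodes, items):
    """Distribute per-task inputs round-robin across node-local caches."""
    for j, item in enumerate(items):
        nodes[j % len(nodes)].setdefault("inputs", []).append(item)

def run_and_gather(nodes, func):
    """Each node applies func to its locally cached inputs; gather results."""
    results = []
    for cache in nodes:
        results.extend(func(x, cache["common"]) for x in cache.get("inputs", []))
    return sorted(results)

nodes = [{} for _ in range(3)]          # three compute nodes with local caches
broadcast(nodes, 10)                    # common input data, sent once
scatter(nodes, [1, 2, 3, 4, 5])         # per-task input files
print(run_and_gather(nodes, lambda x, common: x * common))  # [10, 20, 30, 40, 50]
```

The point of the pattern is that the shared input crosses the network once per node rather than once per task, which is what relieves the shared file system in the paper's setting.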
Citations: 46
System support for many task computing
Pub Date : 2008-11-01 DOI: 10.1109/MTAGS.2008.4777907
E. V. Hensbergen, R. Minnich
The popularity of large scale systems such as Blue Gene has extended their reach beyond HPC into the realm of commercial computing. There is a desire in both communities to broaden the scope of these machines from tightly-coupled scientific applications running on MPI frameworks to more general-purpose workloads. Our approach deals with issues of scale by leveraging the huge number of nodes to distribute operating systems services and components across the machine, tightly coupling the operating system and the interconnects to take maximum advantage of the unique capabilities of the HPC system. We plan on provisioning nodes to provide workload execution, aggregation, and system services, and dynamically re-provisioning nodes as necessary to accommodate changes, failure, and redundancy. By incorporating aggregation as a first-class system construct, we will provide dynamic hierarchical organization and management of all system resources. In this paper, we will go into the design principles of our approach using file systems, workload distribution and system monitoring as illustrative examples. Our end goal is to provide a cohesive distributed system which can broaden the class of applications for large scale systems and also make them more approachable for a larger class of developers and end users.
Citations: 5
A lightweight execution framework for massive independent tasks
Pub Date : 2008-11-01 DOI: 10.1109/MTAGS.2008.4777911
Hui Li, Huashan Yu, Xiaoming Li
This paper presents a lightweight framework for executing many independent tasks efficiently on grids of heterogeneous computational nodes. It dynamically groups tasks of different granularities and dispatches the groups onto distributed computational resources concurrently. Three strategies have been devised to improve the efficiency of computation and resource utilization. One strategy is to pack up to thousands of tasks into one request. Another is to share the effort in resource discovery and allocation among requests by separating resource allocations from request submissions. The third strategy is to pack variable numbers of tasks into different requests, where the task number is a function of the destination resource's computability. This framework has been implemented in Gracie, a computational grid software platform developed by Peking University, and used for executing bioinformatics tasks. We describe its architecture, evaluate its strategies, and compare its performance with GRAM. Analyzing the experiment results, we found that Gracie outperforms GRAM significantly for execution of sets of small tasks, which is aligned with the intuitive advantage of our approaches built in Gracie.
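The third strategy, making the number of tasks per request a function of the destination resource's computability, can be sketched with a simple proportional rule. The rule and the speed figures are assumptions for illustration, not Gracie's actual policy.

```python
# Sketch of variable-size task packing: faster resources receive larger
# requests so that each request represents roughly equal wall-clock work.
# The proportional rule and the speed values are illustrative assumptions.

def pack_requests(num_tasks, speeds):
    """Split num_tasks into one request per resource, sized in proportion
    to each resource's relative speed; any remainder goes to the fastest."""
    total_speed = sum(speeds)
    counts = [int(num_tasks * s / total_speed) for s in speeds]
    counts[speeds.index(max(speeds))] += num_tasks - sum(counts)
    return counts

print(pack_requests(100, [4.0, 2.0, 1.0]))  # [58, 28, 14]
```

Packing many tasks into each request also amortizes the per-request submission and resource-allocation overhead, which is the motivation behind the first two strategies.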
Citations: 20
ViGs: A grid simulation and monitoring tool for ATLAS workflows
Pub Date : 2008-11-01 DOI: 10.1109/MTAGS.2008.4777909
A. T. Thor, G. Záruba, David Levine, K. De, T. Wenaus
With the recent success in transmitting the first beam through the Large Hadron Collider (LHC), the generation of vast amounts of experimental data will soon follow. The data to be processed will be enormous, averaging 15 petabytes per year, analyzed and processed by one to two hundred thousand jobs per day. These jobs must be scheduled, processed, and managed on computers distributed over many countries worldwide. The ability to construct computer clusters on such a virtually unbounded scale will result in increased throughput, removing the barrier of a single computing architecture and operating system, while adding the ability to process jobs across different administrative boundaries and encouraging collaboration. To date, setting up large-scale grids has mostly been accomplished by building experimental medium-sized clusters and testing them by trial and error. However, this is not only an arduous task but also economically inefficient. Moreover, as the performance of a grid computing architecture is closely tied to its networking infrastructure across the entire virtual organization, such trial-and-error approaches will not provide representative data. A simulation environment, on the other hand, may be ideal for this evaluation purpose, as virtually all factors within a simulated VO (virtual organization) can easily be modified. Thus we introduce the "virtual grid simulator" (ViGs), developed as a large-scale grid environment simulator, with the goal of studying the performance, behavior, and scalability of a working grid environment while catering to the needs of an underlying networking infrastructure.
Citations: 3
Many-task computing for grids and supercomputers
Pub Date : 2008-11-01 DOI: 10.1109/MTAGS.2008.4777912
I. Raicu, Ian T Foster, Yong Zhao
Many-task computing aims to bridge the gap between two computing paradigms, high-throughput computing and high-performance computing. Many-task computing differs from high-throughput computing in its emphasis on using large numbers of computing resources over short periods of time to accomplish many computational tasks (both dependent and independent), where primary metrics are measured in seconds (e.g. FLOPS, tasks/sec, MB/s I/O rates) rather than in operations (e.g. jobs) per month. Many-task computing denotes high-performance computations comprising multiple distinct activities coupled via file system operations. Tasks may be small or large, uniprocessor or multiprocessor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly coupled. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. Many-task computing includes loosely coupled applications that are generally communication-intensive but not naturally expressed using the standard message-passing interfaces common in high-performance computing, drawing attention to the many computations that are heterogeneous but not "happily" parallel.
Citations: 327
Exploring data parallelism and locality in wide area networks
Pub Date : 2008-11-01 DOI: 10.1109/MTAGS.2008.4777906
Yunhong Gu, R. Grossman
Cloud computing has demonstrated that processing very large datasets over commodity clusters can be done simply given the right programming structure. Work to date, for example MapReduce and Hadoop, has focused on systems within a data center. In this paper, we present Sphere, a cloud computing system that targets distributed data-intensive applications over wide area networks. Sphere uses a data-parallel computing model that views the processing of distributed datasets as applying a group of operators to each element in the datasets. As a cloud computing system, application developers can use the Sphere API to write very simple code to process distributed datasets in parallel, while the details, including but not limited to, data locations, server heterogeneity, load balancing, and fault tolerance, are transparent to developers. Unlike MapReduce or Hadoop, Sphere supports distributed data processing on a global scale by exploiting data parallelism and locality in systems over wide area networks.
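Sphere's computing model, applying a group of operators to each element of a distributed dataset, can be sketched as a chained map over data segments. The sketch below is generic Python for illustration, not the Sphere API.

```python
# Sketch of the Sphere-style data-parallel model: the dataset is split into
# segments (plain lists standing in for files on different servers) and the
# same group of operators is applied to every element of every segment.
# In a real deployment each segment would be processed where it resides.

def apply_operators(segments, operators):
    """Run each element of each segment through the operator chain."""
    def pipeline(x):
        for op in operators:
            x = op(x)
        return x
    return [[pipeline(x) for x in segment] for segment in segments]

segments = [[1, 2], [3, 4], [5]]             # dataset partitioned across servers
operators = [lambda x: x * x, lambda x: x + 1]  # the operator group
print(apply_operators(segments, operators))  # [[2, 5], [10, 17], [26]]
```

Because each segment is processed independently, the runtime is free to place computation next to the data, which is the locality property the abstract emphasizes for wide area networks.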
Citations: 20