
Proceedings of the 9th Workshop on Scientific Cloud Computing: Latest Publications

Batch and online anomaly detection for scientific applications in a Kubernetes environment
Pub Date : 2018-06-11 DOI: 10.1145/3217880.3217883
S. Hariri, M. C. Kind
We present a cloud-based anomaly detection service framework that uses a containerized Spark cluster and ancillary user interfaces, all managed by Kubernetes. Together, this stack of technology allows for a fast, reliable, resilient, and easily scalable service for either batch or streaming data. At the heart of the service, we utilize an improved version of the Isolation Forest algorithm, called Extended Isolation Forest, for robust and efficient anomaly detection. We showcase the design and a normal workflow of our infrastructure, which is ready to deploy on any Kubernetes cluster without extra technical knowledge. With exposed APIs and simple graphical interfaces, users can load any data and detect anomalies on the loaded set or on newly presented data points using a batch or a streaming mode. With the latter, users can subscribe and get notifications on the desired output. Our aim is to develop and apply these techniques for use with scientific data. In particular, we are interested in finding anomalous objects within the overwhelming set of images and catalogs produced by current and future astronomical surveys, but the approach can be easily adapted to other fields.
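The batch workflow this service exposes (fit on a loaded set, then score new points) can be sketched with scikit-learn's standard IsolationForest standing in for the paper's Extended Isolation Forest; the data and parameters below are illustrative, not from the paper.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Illustrative stand-in: the service uses Extended Isolation Forest; the
# standard IsolationForest from scikit-learn shows the same fit/score flow.
rng = np.random.RandomState(42)
loaded_set = rng.normal(loc=0.0, scale=1.0, size=(500, 2))   # batch-loaded data
new_points = np.array([[8.0, 8.0], [-9.0, 7.5]])             # newly presented points

model = IsolationForest(n_estimators=100, random_state=42)
model.fit(loaded_set)                  # batch mode: train on the loaded set

labels = model.predict(new_points)     # -1 flags an anomaly, 1 an inlier
print(labels)
```

In a streaming mode like the one described above, each arriving point would be scored the same way, with a notification pushed to subscribers whenever a label comes back as -1.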
Citations: 8
Early Experience Using Amazon Batch for Scientific Workflows
Pub Date : 2018-06-11 DOI: 10.1145/3217880.3217885
Kyle M. D. Sweeney, D. Thain
Recent technological trends have pushed many products and technologies into the cloud, relying less on local computational services and instead purchasing computation a la carte from cloud service providers. These providers focus more on delivering technologies that are service based rather than throughput based. With the advent of Amazon Batch, a new high-throughput service, we wished to see how capable it was of running scientific workflows compared to existing cloud services. To that end, we developed a testing suite that created workflows focusing on increasing shared file sizes, increasing unique file sizes, and increasing numbers of tasks, and ran the workflows on Amazon Batch plus two other similar configurations for comparison: raw EC2 workers and Work Queue on EC2. We found that while there is a significant delay in sending jobs to Amazon Batch and to raw EC2 workers, there is little overhead in the actual running of the task, and performance is similar to using Work Queue on EC2 when the workflow does not require large input files. Additionally, when performing a real workflow, Batch achieved a speedup of 1.18x over Work Queue workers on EC2 instances.
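The job-submission side of such an experiment might look like the following boto3 sketch; the queue and job-definition names are hypothetical placeholders, and the actual SubmitJob call (which incurs the submission delay measured above) is left commented out because it requires AWS credentials.

```python
# Sketch of submitting one workflow task to AWS Batch. The queue and job
# definition names are assumptions for illustration, not from the paper.
def build_batch_job(task_id, input_url):
    """Build the keyword arguments for a Batch SubmitJob request."""
    return {
        "jobName": f"workflow-task-{task_id}",
        "jobQueue": "science-workflow-queue",      # assumed queue name
        "jobDefinition": "workflow-task:1",        # assumed job definition
        "containerOverrides": {
            "command": ["process.sh", input_url],  # per-task command override
        },
    }

request = build_batch_job(7, "s3://bucket/inputs/chunk7.dat")

# Actual submission needs credentials, so it is commented out here:
# import boto3
# boto3.client("batch").submit_job(**request)
print(request["jobName"])
```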
Citations: 4
Libra
Pub Date : 2018-06-11 DOI: 10.1145/3217880.3217882
Illyoung Choi, A. Ponsero, K. Youens-Clark, Matthew Bomhoff, B. Hurwitz, J. Hartman
Big-data analytics platforms, such as Hadoop, are appealing for scientific computation because they are ubiquitous, well-supported, and well-understood. Unfortunately, load-balancing is a common challenge of implementing large-scale scientific computing applications on these platforms. In this paper we present the design and implementation of Libra, a Hadoop-based tool for comparative metagenomics (comparing samples of genetic material collected from the environment). We describe the computation that Libra performs and how that computation is implemented using Hadoop tasks, including the techniques used by Libra to ensure that the task workloads are balanced despite nonuniform sample sizes and skewed distributions of genetic material in the samples. On a 10-machine Hadoop cluster Libra can analyze the entire Tara Ocean Viromes of ~4.2 billion reads in fewer than 20 hours.
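The comparison Libra performs can be illustrated with a toy k-mer profile and a cosine similarity in plain Python; this sketches the idea only, not Libra's Hadoop implementation, and the sequences are invented.

```python
from collections import Counter
from math import sqrt

def kmer_profile(seq, k=4):
    """Count k-mers in one sample's sequence data (a k-mer frequency vector)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def cosine_similarity(p, q):
    """Cosine similarity between two k-mer profiles, as used to compare samples."""
    dot = sum(p[key] * q[key] for key in p)
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

a = kmer_profile("ACGTACGTACGT")   # toy sample 1
b = kmer_profile("ACGTACGTTTTT")   # toy sample 2

print(cosine_similarity(a, a))     # identical samples: similarity 1
print(cosine_similarity(a, b))     # related but different samples
```

Libra's contribution is doing this comparison at the scale of billions of reads, with Hadoop tasks balanced despite nonuniform sample sizes and skewed k-mer distributions.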
Citations: 0
Efficient Integration of Containers into Scientific Workflows
Pub Date : 2018-06-11 DOI: 10.1145/3217880.3217887
Kyle M. D. Sweeney, D. Thain
Containers offer a powerful way to create portability for scientific applications. However, incorporating them into workflows requires careful consideration, as straightforward approaches can increase network usage and runtime. We identified three issues in this process: container composition, containerizing workers or jobs, and container image translation. To tackle composition, we divide data into three types: OS data, Read-Only data, and Working data, and define dynamic and static composition. Static composition (creating a single container for each job) leads to massive waste from sending duplicate data over the network. Dynamic composition (sending the data types separately) enables caching on worker nodes. To decide between running workers or jobs inside a container, we examined the costs of running inside a container. Finally, when using different types of container technologies simultaneously, we found it is better to convert to the target image types before sending the container images, rather than repeating the same conversion at the job nodes, which wastes time.
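A back-of-envelope model makes the static-versus-dynamic trade-off concrete; the data sizes below are invented for illustration, not measurements from the paper.

```python
# Toy model of the composition trade-off: static composition resends OS and
# read-only data with every job; dynamic composition caches them per worker.
# Sizes (MB) are illustrative assumptions.
OS_DATA, READ_ONLY, WORKING = 400, 100, 5   # the three data types

def static_transfer(n_jobs):
    """Bytes moved when each job gets one self-contained container image."""
    return n_jobs * (OS_DATA + READ_ONLY + WORKING)

def dynamic_transfer(n_jobs, n_workers):
    """OS and read-only layers sent once per worker; only working data per job."""
    return n_workers * (OS_DATA + READ_ONLY) + n_jobs * WORKING

print(static_transfer(1000))        # MB moved, static composition
print(dynamic_transfer(1000, 10))   # MB moved, dynamic with 10 caching workers
```

Under these assumed numbers, dynamic composition moves roughly fifty times less data for a thousand jobs, which is the caching effect the paper exploits.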
Citations: 12
Faodel
Pub Date : 2018-06-11 DOI: 10.1145/3217880.3217888
C. Ulmer, Shyamali Mukherjee, G. Templet, Scott Levy, J. Lofstead, Patrick M. Widener, T. Kordenbrock, Margaret Lawson
Composition of computational science applications, whether into ad hoc pipelines for analysis of simulation data or into well-defined and repeatable workflows, is becoming commonplace. In order to scale well as projected system and data sizes increase, developers will have to address a number of looming challenges. Increased contention for parallel filesystem bandwidth, accommodating in situ and ex situ processing, and the advent of decentralized programming models will all complicate application composition for next-generation systems. In this paper, we introduce a set of data services, Faodel, which provide scalable data management for workflows and composed applications. Faodel allows workflow components to directly and efficiently exchange data in semantically appropriate forms, rather than those dictated by the storage hierarchy or programming model in use. We describe the architecture of Faodel and present preliminary performance results demonstrating its potential for scalability in workflow scenarios.
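The exchange pattern Faodel enables can be sketched as a shared pool of semantically named objects; the class and method names below are illustrative only, not Faodel's actual API.

```python
# Minimal sketch of the pattern Faodel supports: workflow stages exchanging
# named objects directly instead of round-tripping through the parallel
# filesystem. Names here are hypothetical, not Faodel's API.
class DataPool:
    def __init__(self):
        self._objects = {}

    def publish(self, key, obj):
        """Producer stage makes a semantically named object available."""
        self._objects[key] = obj

    def retrieve(self, key):
        """Consumer stage fetches it in its native in-memory form."""
        return self._objects[key]

pool = DataPool()
pool.publish("sim/step42/temperature", [300.1, 301.7, 299.8])  # simulation output
field = pool.retrieve("sim/step42/temperature")                # analysis input
print(len(field))
```

The point of the pattern is that the consumer receives the object in a form chosen for the computation, not one dictated by the storage hierarchy.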
Citations: 1
High Availability on Jetstream: Practices and Lessons Learned
Pub Date : 2018-06-11 DOI: 10.1145/3217880.3217884
John Michael Lowe, Jeremy Fischer, Sanjana Sudarshan, George W. Turner, C. Stewart, David Y. Hancock
Research computing has traditionally used high performance computing (HPC) clusters and has not been a service amenable to high availability without a doubling of computational and storage capacity. System maintenance such as security patching, firmware updates, and other system upgrades generally meant that the system would be unavailable for the duration of the work unless one had redundant HPC systems and storage. While efforts were often made to limit downtime, when maintenance became necessary, windows might be one to two hours or as much as an entire day. As the National Science Foundation (NSF) began funding non-traditional research systems, finding ways to provide higher availability for researchers became one focus for service providers. One of the design elements of Jetstream was geographic dispersion to maximize availability. This was the first of a number of design elements intended to make Jetstream exceed the NSF's availability requirements. We will examine the design steps employed, the components of the system and how the availability of each was considered in deployment, how maintenance is handled, and the lessons learned from the design and implementation of the Jetstream cloud.
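The motivation for geographic dispersion shows up in simple availability arithmetic: with n independently failing sites, each available a fraction a of the time, combined availability is 1 - (1 - a)^n. The 99% figure below is illustrative, not a Jetstream measurement.

```python
# Back-of-envelope availability math behind geographic dispersion.
# The per-site figure is an illustrative assumption, not from the paper.
def combined_availability(site_availability, n_sites):
    """Probability at least one of n independent sites is up."""
    return 1 - (1 - site_availability) ** n_sites

single = 0.99
print(combined_availability(single, 1))            # one site
print(round(combined_availability(single, 2), 6))  # two dispersed sites
```

Under the independence assumption, adding a second 99%-available site cuts expected downtime by two orders of magnitude, which is why dispersion was the first design step.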
Citations: 2
Predicting Amazon Spot Prices with LSTM Networks
Pub Date : 2018-06-11 DOI: 10.1145/3217880.3217881
Matt Baughman, C. Haas, R. Wolski, Ian T Foster, K. Chard
Amazon spot instances provide preemptable computing capacity at a cost that is often significantly lower than comparable on-demand or reserved instances. Spot instances are charged at the current spot price: a fluctuating market price based on supply and demand for spot instance capacity. However, spot instances are inherently volatile, the spot price changes over time, and instances can be revoked by Amazon with as little as two minutes' warning. Given the potential discount---up to 90% in some cases---there has been significant interest in the scientific cloud computing community in leveraging spot instances for workloads that are either fault-tolerant or not time-sensitive. However, cost-effective use of spot instances requires accurate prediction of future spot prices. We explore here the use of long short-term memory (LSTM) recurrent neural networks for spot price prediction. We describe our model and compare it against a baseline ARIMA model using historical spot pricing data. Our results show that our LSTM approach can reduce training error by as much as 95%.
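The recurrent unit at the core of such a model can be sketched as a single LSTM cell step in NumPy; the weights below are random and untrained, so this illustrates only the gate structure, not the paper's trained predictor.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

# One LSTM cell step in plain NumPy: the recurrent unit applied to each
# price in a spot-price sequence. Weights here are random, untrained.
def lstm_step(x, h_prev, c_prev, W, U, b):
    """x: input vector; h_prev, c_prev: previous hidden and cell state.
    W, U, b hold the stacked weights for the i, f, o, g gates."""
    z = W @ x + U @ h_prev + b
    n = len(h_prev)
    i = sigmoid(z[:n])         # input gate
    f = sigmoid(z[n:2 * n])    # forget gate
    o = sigmoid(z[2 * n:3 * n])  # output gate
    g = np.tanh(z[3 * n:])     # candidate cell state
    c = f * c_prev + i * g     # new cell state
    h = o * np.tanh(c)         # new hidden state
    return h, c

rng = np.random.default_rng(0)
hidden, n_inputs = 8, 1        # one feature: the spot price
W = rng.normal(size=(4 * hidden, n_inputs))
U = rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

h = c = np.zeros(hidden)
for price in [0.31, 0.33, 0.30, 0.45]:   # a toy spot-price history
    h, c = lstm_step(np.array([price]), h, c, W, U, b)
print(h.shape)
```

In a real model like the paper's, the final hidden state would feed a dense output layer predicting the next spot price, and W, U, b would be learned by backpropagation through time.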
Citations: 39
Partitioning SKA Dataflows for Optimal Graph Execution
Pub Date : 2018-05-19 DOI: 10.1145/3217880.3217886
Chen Wu, A. Wicenec, R. Tobar
Optimizing data-intensive workflow execution is essential to many modern scientific projects such as the Square Kilometre Array (SKA), which will be the largest radio telescope in the world, collecting terabytes of data per second for the next few decades. At the core of the SKA Science Data Processor is the graph execution engine, scheduling tens of thousands of algorithmic components to ingest and transform millions of parallel data chunks in order to solve a series of large-scale inverse problems within the power budget. To tackle this challenge, we have developed the Data Activated Liu Graph Engine (DALiuGE) to manage data processing pipelines for several SKA pathfinder projects. In this paper, we discuss the DALiuGE graph scheduling subsystem. By extending previous studies on graph scheduling and partitioning, we lay the foundation on which we can develop polynomial time optimization methods that minimize both workflow execution time and resource footprint while satisfying resource constraints imposed by individual algorithms. We show preliminary results obtained from three radio astronomy data pipelines.
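The flavor of the scheduling problem can be shown with a toy greedy list scheduler over a task DAG; this illustrates the kind of decision DALiuGE's scheduling subsystem optimizes, not its actual polynomial-time algorithm.

```python
from collections import defaultdict, deque

# Toy greedy list scheduler: assign each ready task of a DAG to the
# earliest-free worker, respecting precedence. Purely illustrative.
def schedule(tasks, deps, n_workers):
    """tasks: {name: duration}; deps: {name: [prerequisites]}.
    Returns the makespan of the greedy schedule."""
    indeg = {t: len(deps.get(t, ())) for t in tasks}
    children = defaultdict(list)
    for t, parents in deps.items():
        for p in parents:
            children[p].append(t)
    ready = deque(t for t in tasks if indeg[t] == 0)
    workers = [0.0] * n_workers          # time each worker becomes free
    finish = {}
    while ready:
        t = ready.popleft()
        w = workers.index(min(workers))  # earliest-free worker
        start = max(workers[w],
                    max((finish[p] for p in deps.get(t, ())), default=0.0))
        finish[t] = start + tasks[t]
        workers[w] = finish[t]
        for child in children[t]:
            indeg[child] -= 1
            if indeg[child] == 0:
                ready.append(child)
    return max(finish.values())

# A diamond-shaped pipeline: ingest feeds two parallel stages, then a combine.
tasks = {"ingest": 2, "a": 3, "b": 3, "combine": 1}
deps = {"a": ["ingest"], "b": ["ingest"], "combine": ["a", "b"]}
print(schedule(tasks, deps, 2))   # two workers exploit the parallel stages
print(schedule(tasks, deps, 1))   # one worker serializes everything
```

Real SKA-scale graphs add what this toy omits: data movement costs, memory footprints, and per-algorithm resource constraints, which is what makes the partitioning problem hard.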
Citations: 1
Proceedings of the 9th Workshop on Scientific Cloud Computing
Pub Date : 2012-06-18 DOI: 10.1145/3217880
Yogesh L. Simmhan, Gabriel Antoniu, C. Goble, L. Ramakrishnan
It is our pleasure to welcome you to the 6th Workshop on Scientific Cloud Computing (ScienceCloud). ScienceCloud continues to provide the scientific community with the premier forum for discussing new research, development, and deployment efforts in hosting scientific computing workloads on cloud computing infrastructures. The focus of the workshop is on the use of cloud-based technologies to meet new compute-intensive and data-intensive scientific challenges that are not well served by the current supercomputers, grids and HPC clusters. ScienceCloud provides a unique opportunity for interaction and cross-pollination between researchers and practitioners developing applications, algorithms, software, hardware and networking, emphasizing scientific computing for such cloud platforms. The call for papers attracted submissions from across the world. The program committee reviewed and accepted three of six full paper submissions (50%) and three of four short paper submissions (75%). We are delighted to include a keynote and panel involving leading scientific cloud computing researchers. We encourage attendees to attend these presentations: Challenges of Running Scientific Workflows in Cloud Environments, Ewa Deelman (Information Sciences Institute, University of Southern California) Real-time Scientific Data Stream Processing, Manish Parashar (Rutgers, the State University of New Jersey), Doug Thain (University of Notre Dame), Ioan Raicu (Illinois Institute of Technology), Rui Zhang (IBM Research)
Citations: 0