Enabling Seamless Execution of Computational and Data Science Workflows on HPC and Cloud with the Popper Container-native Automation Engine

2020 2nd International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC (CANOPIE-HPC). Pub Date: 2020-11-01. DOI: 10.1109/CANOPIEHPC51917.2020.00007

Jayjeet Chakraborty, C. Maltzahn, I. Jimenez


Citations: 4

Abstract

Reproducibility and replication remain persistent problems in scientific research. Researchers in computational science often find it difficult to reproduce experiments from the artifacts (code, data, diagrams, and results) left behind by previous researchers. Code developed on one machine often fails to run on another because of differences in hardware architecture, operating system, and software dependencies, among other factors. A further difficulty lies in understanding how the artifacts are organized and in which order they should be used. Software containers (also known as Linux containers) can address some of these problems, and researchers and developers have therefore built scientific workflow engines that execute the steps of a workflow in separate containers. Existing container-native workflow engines assume the availability of infrastructure deployed in the cloud or in HPC centers. In this paper, we present Popper, a container-native workflow engine that does not assume the presence of a Kubernetes cluster or of any service other than a container engine such as Docker or Podman. We introduce the design and architecture of Popper and describe how it abstracts away the complexity of multiple container engines and resource managers, letting users focus solely on writing workflow logic. With Popper, researchers can build and validate workflows in almost any environment of their choice, including local machines, Slurm-based HPC clusters, CI services, and Kubernetes-based cloud computing environments. To demonstrate the suitability of this workflow engine, we present a case study in which we implement a machine learning example as a Popper workflow and execute it seamlessly in multiple environments.
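
The idea of keeping workflow logic independent of the container engine and the resource manager can be illustrated with a minimal sketch. This is not Popper's code or API: the names `Step`, `container_command`, `RESOURCE_MANAGERS`, and `plan`, the image tag, and the script names are all hypothetical and chosen only for illustration. The sketch only builds command lines and does not execute them; it assumes that `docker run` and `podman run` accept the same basic flags and that a Slurm site would launch a step through `srun`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    """One workflow step: a container image plus the command to run in it."""
    name: str
    image: str
    args: List[str] = field(default_factory=list)


def container_command(engine: str, step: Step, workspace: str) -> List[str]:
    """Build the container invocation for a step (hypothetical helper)."""
    if engine not in ("docker", "podman"):
        raise ValueError(f"unsupported container engine: {engine}")
    return [
        engine, "run", "--rm",
        "-v", f"{workspace}:/workspace",  # share the project directory
        "-w", "/workspace",               # run the step from that directory
        step.image, *step.args,
    ]


def run_locally(cmd: List[str], step: Step) -> List[str]:
    """Local machine: the container command is used unchanged."""
    return cmd


def run_with_slurm(cmd: List[str], step: Step) -> List[str]:
    """Slurm cluster: prefix the same container command with srun."""
    return ["srun", "--nodes=1", "--job-name", step.name, *cmd]


# Assumed registry of resource managers; a real engine would support more.
RESOURCE_MANAGERS: Dict[str, Callable[[List[str], Step], List[str]]] = {
    "local": run_locally,
    "slurm": run_with_slurm,
}


def plan(steps: List[Step], engine: str, resource_manager: str,
         workspace: str) -> List[List[str]]:
    """Turn engine-agnostic steps into concrete command lines, in order."""
    wrap = RESOURCE_MANAGERS[resource_manager]
    return [wrap(container_command(engine, s, workspace), s) for s in steps]


if __name__ == "__main__":
    # Hypothetical two-step machine learning workflow.
    workflow = [
        Step("download", "python:3.11-slim", ["python", "download.py"]),
        Step("train", "python:3.11-slim", ["python", "train.py"]),
    ]
    for cmd in plan(workflow, "podman", "slurm", "/tmp/project"):
        print(" ".join(cmd))
```

In this sketch the `workflow` list never changes; only the `engine` and `resource_manager` arguments to `plan` differ between a laptop and a cluster, which is the separation the abstract describes.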


