2018 IEEE 14th International Conference on e-Science (e-Science): Latest Papers

Remote Cloud-Based Automated Stroke Rehabilitation Assessment Using Wearables
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00063
Shane Halloran, J. Shi, Yu Guan, Xi Chen, Michael Dunne-Willows, J. Eyre
We outline a system enabling accurate remote assessment of stroke rehabilitation levels using wrist-worn accelerometer time series data. The system is built on features generated from clustering models across sliding windows in the data and makes use of computation in the cloud. Predictive models are built using advanced machine learning techniques.
Pages: 302
Citations: 1
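The idea of generating features from clustering models across sliding windows, as described in the abstract above, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the one-dimensional k-means, the window width and step, and every function name here are hypothetical choices made for the example.

```python
import random
from collections import Counter

def sliding_windows(series, width, step):
    """Yield consecutive windows of `width` samples, advancing by `step`."""
    for start in range(0, len(series) - width + 1, step):
        yield series[start:start + width]

def nearest(value, centroids):
    """Index of the centroid closest to `value`."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))

def kmeans_1d(values, k, iterations=20, seed=0):
    """Tiny 1-D k-means (Lloyd's algorithm); returns the sorted centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centroids)].append(v)
        # Keep the old centroid when a cluster received no samples.
        centroids = [sum(g) / len(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return sorted(centroids)

def window_features(window, centroids):
    """Fraction of the window's samples assigned to each cluster."""
    counts = Counter(nearest(v, centroids) for v in window)
    return [counts.get(i, 0) / len(window) for i in range(len(centroids))]

# Hypothetical signal: a low-activity stretch followed by a high-activity one.
signal = [0.0, 0.1] * 20 + [1.0, 1.1] * 20
centroids = kmeans_1d(signal, k=2)
features = [window_features(w, centroids)
            for w in sliding_windows(signal, width=20, step=10)]
```

Each window is summarised by how its samples distribute over the learned clusters, and these per-window vectors could then be passed to a predictive model.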
Utilizing a Transparency-Driven Environment Toward Trusted Automatic Genre Classification: A Case Study in Journalism History
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00137
A. Bilgin, L. Hollink, J. V. Ossenbruggen, E. T. K. Sang, Kim Smeenk, Frank Harbers, M. Broersma
With the growing abundance of unlabeled data in real-world tasks, researchers have to rely on the predictions given by black-boxed computational models. However, it is an often neglected fact that these models may be scoring high on accuracy for the wrong reasons. In this paper, we present a practical impact analysis of enabling model transparency by various presentation forms. For this purpose, we developed an environment that empowers non-computer scientists to become practicing data scientists in their own research field. We demonstrate the gradually increasing understanding of journalism historians through a real-world use case study on automatic genre classification of newspaper articles. This study is a first step towards trusted usage of machine learning pipelines in a responsible way.
Pages: 486-496
Citations: 2
Power Asymmetries of eHumanities Infrastructures
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00103
Max Kemman
Digital research infrastructures simultaneously enable and confine the research practices of scholars, constituting a power relation. This power relation can be characterised as a power asymmetry, with scholars dependent on the developers of infrastructures. In order to reduce this power asymmetry, infrastructures are developed in collaboration between scholars and computational researchers. Through an analysis of over twenty interviews, I will investigate the role of knowledge asymmetry, the ignorance of how a collaborator performs their tasks, and how this relates to power asymmetry in eScience collaborations in digital history. I will moreover consider how these asymmetries pose a challenge in the development and adoption of research infrastructures in the humanities.
Pages: 370-371
Citations: 0
Preserving Reproducibility: Provenance and Executable Containers in DataONE Data Packages
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00019
Bryce D. Mecum, Matthew B. Jones, D. Vieglais, C. Willis
Many data packaging standards are available to researchers and data repository operators, and the choice to use an existing standard or create a new one is challenging. We introduce the DataONE Data Package standard, which is based on the existing OAI-ORE Resource Map standard. We describe the functionality Data Package provides and its implementation considerations, compare it to existing standards, and discuss future extensions to the standard, including the ability to describe execution environments via WholeTale "Tales" and alternate serialization formats.
Pages: 45-49
Citations: 6
Development of the OMUSE/AMUSE Modeling System
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00105
F. I. Pelupessy, B. V. Werkhoven, G. Oord, S. Zwart, A. V. Elteren, H. Dijkstra
The Oceanographic Multipurpose Software Environment (OMUSE, [1]) is an open source framework developed for oceanographic and other earth system modelling applications. OMUSE provides a homogeneous environment to interface with numerical simulation codes. It was developed at the IMAU (Utrecht) using coupling technology developed for astrophysical applications in the AMUSE project at Leiden Observatory [2,3]. OMUSE simplifies the use and deployment of numerical simulation codes. Furthermore, the design of the OMUSE interfaces (figure 1) allows codes that represent different physics or span different ranges of physical scales to be easily combined in novel numerical experiments. The use cases for OMUSE range from running simple numerical experiments with single codes and adding data analysis tools to model runs, to setting up fairly complicated and strongly coupled solvers for problems that are intrinsically multi-scale and/or require different physics. Here, we present the design of OMUSE and give examples of the types of couplings that can be implemented using OMUSE. The example provided by AMUSE and OMUSE suggests that the same interfacing philosophy can be applied to a more extensive set of disciplines. To facilitate this, a better separation of the core framework and domain-specific code is necessary. We will present ongoing work to support meteorological and hydrological applications and the use of the framework as the computational core in the eWatercycle project [4]. For this, adaptations are made to improve the interoperability with existing interface efforts (such as the BMI), and we discuss developments regarding the encapsulation of OMUSE/AMUSE and its component models in containers. This will facilitate installation for first-time users, removing a barrier in this respect. We anticipate that it will also offer more flexible deployment options for the framework.
Pages: 374
Citations: 2
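The abstract above describes combining codes through a homogeneous interface. A toy sketch of that idea follows, assuming two components that expose the same stepping method and exchange one value per coupling step. The class `RelaxationModel`, the function `couple`, and all parameters are hypothetical and are not taken from OMUSE or AMUSE.

```python
class RelaxationModel:
    """Toy component: one temperature relaxing toward an external forcing."""

    def __init__(self, temperature, rate):
        self.temperature = temperature
        self.rate = rate
        self.forcing = temperature
        self.time = 0.0

    def evolve_model(self, t_end, dt=0.1):
        """Advance the component to `t_end` with explicit Euler steps."""
        steps = round((t_end - self.time) / dt)
        for _ in range(steps):
            self.temperature += self.rate * (self.forcing - self.temperature) * dt
        self.time = t_end

def couple(ocean, atmosphere, t_end, coupling_step):
    """Alternate between exchanging boundary values and evolving both codes."""
    n_steps = round(t_end / coupling_step)
    for k in range(1, n_steps + 1):
        # Exchange: each component sees the other's current state as forcing.
        ocean.forcing, atmosphere.forcing = atmosphere.temperature, ocean.temperature
        ocean.evolve_model(k * coupling_step)
        atmosphere.evolve_model(k * coupling_step)

ocean = RelaxationModel(temperature=10.0, rate=0.1)
atmosphere = RelaxationModel(temperature=20.0, rate=1.0)
couple(ocean, atmosphere, t_end=5.0, coupling_step=1.0)
```

Because both components share one interface, the coupling loop does not need to know which physics each component implements. That is the property the abstract attributes to the framework, shown here only in miniature.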
Understanding Evolving Communities in Transnational Board Interlock Networks
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00069
D. V. Kuppevelt, Frank W. Takes, E. Heemskerk
n/a
Pages: 312-313
Citations: 1
Research Software Discovery: An Overview
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00016
A. Struck
Research software is an integral part of scientific investigations. The paper identifies challenges, risks and new opportunities in research software publication and discovery. The diverse code discovery landscape is mapped, and agents are identified along with their business models. Examples of discovery tools and strategies are given to support the classification. Reproducibility of research and reuse of code might improve if software discovery were easier. Researchers conducting a search for existing software in the context of a state-of-the-art report or a software management plan could use this paper as a guideline for their information retrieval strategy.
Pages: 33-37
Citations: 5
Big Provenance Stream Processing for Data Intensive Computations
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00039
Isuru Suriarachchi, S. Withana, Beth Plale
In today's business and research landscape, data analysis consumes public and proprietary data from numerous sources and utilizes one or more popular data-parallel frameworks such as Hadoop, Spark and Flink. In the Data Lake setting, these frameworks co-exist. Our earlier work has shown that data provenance in Data Lakes can aid with both traceability and management. The sheer volume of fine-grained provenance generated in a multi-framework application motivates the need for on-the-fly provenance processing. We introduce a new parallel stream processing algorithm that reduces fine-grained provenance while preserving backward and forward provenance. The algorithm is resilient to provenance events arriving out-of-order. It is evaluated using several strategies for partitioning a provenance stream. The evaluation shows that the parallel algorithm performs well in processing out-of-order provenance streams, with good scalability and accuracy.
Pages: 245-255
Citations: 10
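To make the notions of backward and forward provenance and out-of-order arrival in the abstract above more concrete, here is a small sketch. It is not the paper's algorithm and does not reproduce its reduction or partitioning strategies. It only shows why building the provenance graph from set insertions gives the same answers whatever order the events arrive in. The event shape `(inputs, outputs)` and the class name are assumptions made for this example.

```python
from collections import defaultdict

class ProvenanceGraph:
    """Data-to-data derivation graph built incrementally from a stream of events."""

    def __init__(self):
        self.parents = defaultdict(set)   # item -> items it was derived from
        self.children = defaultdict(set)  # item -> items derived from it

    def add_event(self, inputs, outputs):
        """Record that every output was derived from every input."""
        for out in outputs:
            for inp in inputs:
                self.parents[out].add(inp)
                self.children[inp].add(out)

    @staticmethod
    def _reach(start, edges):
        """All nodes reachable from `start` by following `edges`."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def backward(self, item):
        """Everything `item` was ultimately derived from."""
        return self._reach(item, self.parents)

    def forward(self, item):
        """Everything ultimately derived from `item`."""
        return self._reach(item, self.children)

# Hypothetical events; the last pipeline stage arrives first.
events = [
    ({"features", "labels"}, {"model"}),
    ({"raw"}, {"clean"}),
    ({"clean"}, {"features"}),
]
graph = ProvenanceGraph()
for inputs, outputs in events:
    graph.add_event(inputs, outputs)
```

Since adding an edge to a set is commutative and idempotent, replaying the same events in any order yields the same `backward` and `forward` results.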
A First Look at the JX Workflow Language
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00094
Tim Shaffer, Kyle M. D. Sweeney, Nathaniel Kremer-Herman, D. Thain
Scientific workflows are typically expressed as a graph of logical tasks, each one representing a single program along with its input and output files. This poster introduces JX (JSON eXtended), a declarative language that can express complex workloads as an assembly of sub-graphs that can be partitioned in flexible ways. We present a case study of using JX to represent complex workflows for the Lifemapper biodiversity project. We evaluate partitioning approaches across several computing environments, including ND-Condor, IU-Jetstream, and SDSC-Comet, and show that a coarse partitioning results in faster turnaround times, reduced data transfer, and lower master utilization across all three systems.
Pages: 352-353
Citations: 0
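The abstract above describes workflows as graphs of tasks with input and output files that can be partitioned into sub-graphs. The sketch below generates such a task list as plain JSON-style data in Python and splits it into chunks of a chosen size. The key names (`rules`, `command`, `inputs`, `outputs`), the `./simulate` program, and both helper functions are assumptions for illustration and should be checked against the JX documentation.

```python
import json

def make_rules(n):
    """Build n independent task rules plus one rule that merges their outputs."""
    rules = [
        {
            "command": f"./simulate {i} > out.{i}",
            "inputs": ["simulate"],
            "outputs": [f"out.{i}"],
        }
        for i in range(n)
    ]
    merged = [f"out.{i}" for i in range(n)]
    rules.append({
        "command": "cat " + " ".join(merged) + " > result",
        "inputs": merged,
        "outputs": ["result"],
    })
    return rules

def partition(rules, size):
    """Split a rule list into sub-workflows of at most `size` rules each."""
    return [{"rules": rules[i:i + size]} for i in range(0, len(rules), size)]

rules = make_rules(10)
fine = partition(rules, size=1)    # one sub-workflow per task
coarse = partition(rules, size=4)  # fewer, larger sub-workflows
print(json.dumps(coarse[0], indent=2))
```

Chunking by list position ignores dependencies between rules, so a real partitioner would cut along the task graph instead. The sketch only shows how the partition size trades many small sub-workflows for a few coarse ones.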
Boosting Atmospheric Dust Forecast with PyCOMPSs
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00135
Javier Conejero, Cristian Ramon-Cortes, K. Serradell, Rosa M. Badia
Task-based programming is becoming a tool of great interest for boosting High-Performance Computing (HPC) and Big Data applications. In particular, COMP Superscalar (COMPSs) is proving to be an effective task-based programming model for distributed computing of Big Data applications within HPC environments. Applications like NMMB-MONARCH, a dust forecast application composed of a set of steps (some of which are binaries, with or without MPI), are perfect candidates for PyCOMPSs, the Python binding of COMPSs. This paper describes the successful adaptation of the NMMB-MONARCH online multi-scale atmospheric dust model to PyCOMPSs in order to exploit its inherent parallelism with minimal developer effort. The paper also includes an evaluation of this implementation on the Nord3 supercomputer, a scalability analysis and an in-depth behaviour study. The main results presented in this paper are: (1) PyCOMPSs is able to extract the parallelism from the NMMB-MONARCH application; (2) it improves the performance of dust forecasting compared with previous versions; and (3) PyCOMPSs is able to interact and share resources with MPI applications when they are included in the workflow as tasks. Finally, we present the keys for exporting the knowledge gained from this experience to other applications so that they can benefit from using PyCOMPSs.
Pages: 464-474
Citations: 4
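The task-based model described in the abstract above runs an application's independent steps concurrently while keeping the code close to its sequential form. The sketch below illustrates that idea with Python's standard `concurrent.futures` module as a stand-in. It does not use PyCOMPSs, and the step functions `preprocess`, `simulate` and `postprocess` are hypothetical placeholders rather than NMMB-MONARCH stages.

```python
from concurrent.futures import ThreadPoolExecutor

def preprocess(region):
    """Placeholder step: derive three input values for a region."""
    return [region * 10 + i for i in range(3)]

def simulate(inputs):
    """Placeholder step: reduce the prepared inputs to one number."""
    return sum(inputs)

def postprocess(results):
    """Placeholder step: combine the per-region results."""
    return max(results)

def run_forecast(regions):
    """Run the per-region steps as concurrent tasks, then combine them."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # The per-region tasks are independent, so they can run in parallel.
        prepared = pool.map(preprocess, regions)
        # Each simulation is submitted once its prepared input is available.
        simulated = pool.map(simulate, prepared)
        return postprocess(list(simulated))

result = run_forecast([0, 1, 2])
```

The workflow still reads like a sequential script, and the executor decides what runs in parallel. A task-based runtime goes further by inferring dependencies between tasks and distributing them across nodes, which this local sketch does not attempt.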