A workflow-enabled big data analytics software stack for escience

Cosimo Palazzo, Andrea Mariello, S. Fiore, Alessandro D'Anca, D. Elia, Dean N. Williams, G. Aloisio
Published in: 2015 International Conference on High Performance Computing & Simulation (HPCS), 2015-07-20
DOI: 10.1109/HPCSim.2015.7237088 (https://doi.org/10.1109/HPCSim.2015.7237088)
Citations: 12

Abstract

The availability of systems able to process and analyse large amounts of data has boosted scientific advances in several fields. Workflows provide an effective tool for defining and managing large sets of processing tasks. In the big data analytics area, the Ophidia project provides a cross-domain framework for the analysis of scientific, multi-dimensional datasets. The framework exploits a server-side, declarative, parallel approach to data analysis and mining. It also features a complete workflow management system that supports the execution of complex scientific data analyses, schedules task submission, manages operator dependencies, and monitors job execution. The workflow management engine lets users coordinate the execution of multiple data analytics operators, both single and massive (parameter sweep). A JSON schema has been designed and implemented for defining big data analytics workflows. To aid workflow definition, a visual design language consisting of several symbols, named Data Analytics Workflow Modelling Language (DAWML), has also been defined.
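As a rough illustration of what a JSON-defined workflow with operator dependencies and a parameter sweep might look like, the following Python sketch parses a small workflow document, derives an execution order that respects the dependencies, and expands one task template over a list of inputs. The field names (`tasks`, `operator`, `depends_on`, `arguments`) and the helper functions are hypothetical and are not taken from the Ophidia JSON schema; the sketch assumes Python 3.9 or later for `graphlib`.

```python
import json
from graphlib import TopologicalSorter

# Hypothetical workflow document: the field names are illustrative only
# and do not reproduce the Ophidia JSON schema.
WORKFLOW_JSON = """
{
  "name": "example_workflow",
  "tasks": [
    {"name": "import_data", "operator": "import", "depends_on": []},
    {"name": "subset", "operator": "subset", "depends_on": ["import_data"]},
    {"name": "reduce", "operator": "reduce", "depends_on": ["subset"]},
    {"name": "export", "operator": "export", "depends_on": ["reduce"]}
  ]
}
"""


def execution_order(workflow: dict) -> list[str]:
    """Return task names in an order that respects every dependency."""
    # Map each task to the set of tasks that must finish before it starts.
    graph = {task["name"]: set(task["depends_on"]) for task in workflow["tasks"]}
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


def expand_sweep(task: dict, values: list[str]) -> list[dict]:
    """Expand one task template into one task per swept input value."""
    return [
        {**task, "name": f"{task['name']}_{i}", "arguments": {"input": value}}
        for i, value in enumerate(values)
    ]


workflow = json.loads(WORKFLOW_JSON)
order = execution_order(workflow)
print(order)

# A "massive" task: the same operator applied to several inputs.
sweep = expand_sweep(workflow["tasks"][0], ["a.nc", "b.nc", "c.nc"])
print([task["name"] for task in sweep])
```

A real engine would also need to handle failures, scheduling, and job monitoring; the sketch only covers dependency ordering and sweep expansion.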