SIM-PIPE DryRunner: An approach for testing container-based big data pipelines and generating simulation data

Aleena Thomas, Nikolay Nikolov, Antoine Pultier, D. Roman, B. Elvesæter, A. Soylu
Published in: 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), 2022-06-01
DOI: 10.1109/COMPSAC54236.2022.00182 (https://doi.org/10.1109/COMPSAC54236.2022.00182)
Citations: 1

Abstract

Big data pipelines are becoming increasingly vital for efficient data processing in a wide range of data-intensive application domains, such as digital healthcare, telecommunications, and manufacturing. Data pipelines in such domains are complex and dynamic, and involve a number of data processing steps deployed on heterogeneous computing resources under the Edge-Cloud paradigm. Testing and simulating big data pipelines on heterogeneous resources must accurately represent this complexity. However, because big data processing is heavily resource-intensive, testing and simulation based on historical execution data are impractical. In this paper, we introduce the SIM-PIPE DryRunner approach, a dry-run approach that deploys a big data pipeline step by step in an isolated environment and executes it with sample data; this approach could be used for testing big data pipelines and realising practical simulations using existing simulators.
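
The dry-run idea described in the abstract can be illustrated with a minimal sketch. This is not the SIM-PIPE implementation: it assumes, purely for illustration, that each pipeline step is a plain Python callable rather than a container, and that wall-clock time and peak Python memory are the metrics of interest. The names `dry_run`, `steps`, and the three example steps are hypothetical.

```python
import time
import tracemalloc
from typing import Any, Callable, Dict, List, Tuple

# A step is a (name, function) pair. In a real setting each step would be a
# container run in an isolated environment; plain callables are an assumption
# made here so the sketch stays self-contained.
Step = Tuple[str, Callable[[Any], Any]]


def dry_run(steps: List[Step], sample: Any) -> Tuple[Any, List[Dict[str, Any]]]:
    """Run the steps one at a time on sample data and record per-step metrics."""
    metrics: List[Dict[str, Any]] = []
    data = sample
    for name, func in steps:
        tracemalloc.start()                      # track Python allocations for this step
        start = time.perf_counter()
        data = func(data)                        # output of one step feeds the next
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        metrics.append({"step": name, "seconds": elapsed, "peak_bytes": peak})
    return data, metrics


# Hypothetical three-step pipeline executed on a small data sample.
steps: List[Step] = [
    ("parse", lambda rows: [int(r) for r in rows]),
    ("filter", lambda xs: [x for x in xs if x % 2 == 0]),
    ("aggregate", lambda xs: sum(xs)),
]
result, metrics = dry_run(steps, ["1", "2", "3", "4"])
for m in metrics:
    print(m["step"], m["seconds"], m["peak_bytes"])
```

The per-step records collected this way are the kind of data that could, in principle, be passed to an existing simulator as input. How SIM-PIPE itself collects and formats such data is not specified in the abstract.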