{"title":"DFMCloudsim: cloudsim的扩展,用于在分布式数据中心上对数据片段迁移进行建模和仿真","authors":"Laila Bouhouch, Mostapha Zbakh, Claude Tadonki","doi":"10.1080/1206212x.2023.2277554","DOIUrl":null,"url":null,"abstract":"AbstractDue to the increasing volume of data for applications running on geographically distributed Cloud systems, the need for efficient data management has emerged as a crucial performance factor. Alongside basic task scheduling, the management of input data on distributed Cloud systems has become a genuine challenge, particularly with data-intensive applications. Ideally, each dataset should be stored in the same data center as its consumer tasks so as to lead to local data accesses only. However, when a given task does not need all items within one of its input datasets, sending that dataset entirely might lead to a severe time overhead. To address this concern, a data fragmentation strategy can be considered in order to partition the datasets and process them in that form. Such a strategy should be flexible enough to support any user-defined partitioning, and suitable enough to minimize the overhead of transferring the data in their fragmented form. To simulate and estimate the basic statistics of both fragmentation and migration mechanisms prior to an implementation in a real Cloud, we chose Cloudsim, with the goal of enhancing it with the corresponding extensions. Cloudsim is a popular simulator for Cloud Computing investigations. Our proposed extension is named DFMCloudsim, its goal is to provide an efficient module for implementing fragmentation and data migration strategies. We validate our extension using various simulated scenarios. The results indicate that our extension effectively achieves its main objectives and can reduce data transfer overhead by 74.75% compared to our previous work.Keywords: Cloud computingbig datacloudsimdata fragmentationdata migration AcknowledgmentsL. B.: prepared the manuscript, and performed analysis and experiments. M. Z., C. T.: helped in the initial solution design. All authors reviewed the paper and approved the final version of the manuscript.Availability of data and materialsAll of the material is owned by the authors and can be accessed by email request.Disclosure statementNo potential conflict of interest was reported by the author(s).Additional informationNotes on contributorsLaila BouhouchLaila Bouhouch received her engineer degree in Computer Science at ENSA (National School of Applied Sciences) at Ibn Zohr University, Agadir, Morocco, in 2017. She is currently a Ph.D. student in the Department of Computer Science, Laboratory CEDOC ST2I, ENSIAS, Rabat, Morocco. Her research interests include big data management in workflow systems, cloud computing and distributed systems.Mostapha ZbakhMostapha Zbakh received his Ph.D. in computer sciences from Polytechnic Faculty of Mons, Belgium, in 2001. He is currently a Professor at ENSIAS (National School of Computer Science and System Analysis) at Mohammed V University, Rabat, Morocco, since 2002. His research interests include load balancing, parallel and distributed systems, HPC, Big data and Cloud computing.Claude TadonkiClaude Tadonki currently holds a research position at Mines ParisTech/CRI, working on HPC topics and automatic code transformations. His background is a combination of mathematics and computer science. From his Ph.D. 
and during his different positions afterwards, he has been involved in cutting-edge researches related to high-performance computing and operation research, following the sequence model, method, and implementation. He is still interested in fundamental questions about difficult genuine problems, while striving to understand how the advances in optimization, algorithmic, programming, and supercomputers can be efficiently combined to provide the best answer.","PeriodicalId":39673,"journal":{"name":"International Journal of Computers and Applications","volume":"13 4","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DFMCloudsim: an extension of cloudsim for modeling and simulation of data fragments migration over distributed data centers\",\"authors\":\"Laila Bouhouch, Mostapha Zbakh, Claude Tadonki\",\"doi\":\"10.1080/1206212x.2023.2277554\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"AbstractDue to the increasing volume of data for applications running on geographically distributed Cloud systems, the need for efficient data management has emerged as a crucial performance factor. Alongside basic task scheduling, the management of input data on distributed Cloud systems has become a genuine challenge, particularly with data-intensive applications. Ideally, each dataset should be stored in the same data center as its consumer tasks so as to lead to local data accesses only. However, when a given task does not need all items within one of its input datasets, sending that dataset entirely might lead to a severe time overhead. To address this concern, a data fragmentation strategy can be considered in order to partition the datasets and process them in that form. Such a strategy should be flexible enough to support any user-defined partitioning, and suitable enough to minimize the overhead of transferring the data in their fragmented form. To simulate and estimate the basic statistics of both fragmentation and migration mechanisms prior to an implementation in a real Cloud, we chose Cloudsim, with the goal of enhancing it with the corresponding extensions. Cloudsim is a popular simulator for Cloud Computing investigations. Our proposed extension is named DFMCloudsim, its goal is to provide an efficient module for implementing fragmentation and data migration strategies. We validate our extension using various simulated scenarios. The results indicate that our extension effectively achieves its main objectives and can reduce data transfer overhead by 74.75% compared to our previous work.Keywords: Cloud computingbig datacloudsimdata fragmentationdata migration AcknowledgmentsL. B.: prepared the manuscript, and performed analysis and experiments. M. Z., C. T.: helped in the initial solution design. All authors reviewed the paper and approved the final version of the manuscript.Availability of data and materialsAll of the material is owned by the authors and can be accessed by email request.Disclosure statementNo potential conflict of interest was reported by the author(s).Additional informationNotes on contributorsLaila BouhouchLaila Bouhouch received her engineer degree in Computer Science at ENSA (National School of Applied Sciences) at Ibn Zohr University, Agadir, Morocco, in 2017. She is currently a Ph.D. student in the Department of Computer Science, Laboratory CEDOC ST2I, ENSIAS, Rabat, Morocco. 
Her research interests include big data management in workflow systems, cloud computing and distributed systems.Mostapha ZbakhMostapha Zbakh received his Ph.D. in computer sciences from Polytechnic Faculty of Mons, Belgium, in 2001. He is currently a Professor at ENSIAS (National School of Computer Science and System Analysis) at Mohammed V University, Rabat, Morocco, since 2002. His research interests include load balancing, parallel and distributed systems, HPC, Big data and Cloud computing.Claude TadonkiClaude Tadonki currently holds a research position at Mines ParisTech/CRI, working on HPC topics and automatic code transformations. His background is a combination of mathematics and computer science. From his Ph.D. and during his different positions afterwards, he has been involved in cutting-edge researches related to high-performance computing and operation research, following the sequence model, method, and implementation. He is still interested in fundamental questions about difficult genuine problems, while striving to understand how the advances in optimization, algorithmic, programming, and supercomputers can be efficiently combined to provide the best answer.\",\"PeriodicalId\":39673,\"journal\":{\"name\":\"International Journal of Computers and Applications\",\"volume\":\"13 4\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-11-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Computers and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1080/1206212x.2023.2277554\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computers and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1080/1206212x.2023.2277554","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Computer Science","Score":null,"Total":0}
DFMCloudsim: an extension of cloudsim for modeling and simulation of data fragments migration over distributed data centers
Abstract

Due to the increasing volume of data for applications running on geographically distributed Cloud systems, the need for efficient data management has emerged as a crucial performance factor. Alongside basic task scheduling, the management of input data on distributed Cloud systems has become a genuine challenge, particularly for data-intensive applications. Ideally, each dataset should be stored in the same data center as its consumer tasks, so that all data accesses are local. However, when a given task does not need all items within one of its input datasets, sending that dataset in its entirety might incur a severe time overhead. To address this concern, a data fragmentation strategy can be considered in order to partition the datasets and process them in that form. Such a strategy should be flexible enough to support any user-defined partitioning, and suitable enough to minimize the overhead of transferring the data in their fragmented form. To simulate and estimate the basic statistics of both the fragmentation and migration mechanisms prior to an implementation in a real Cloud, we chose Cloudsim, a popular simulator for Cloud Computing investigations, with the goal of enhancing it with the corresponding extensions. Our proposed extension, named DFMCloudsim, aims to provide an efficient module for implementing fragmentation and data migration strategies. We validate our extension using various simulated scenarios. The results indicate that our extension effectively achieves its main objectives and can reduce data transfer overhead by 74.75% compared to our previous work.

Keywords: Cloud computing; big data; cloudsim; data fragmentation; data migration
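To make the transfer-overhead argument above concrete, the following self-contained Java sketch (CloudSim itself is written in Java) models a dataset as a list of fragments and compares the time to migrate the whole dataset with the time to migrate only the fragments a task actually needs, using a simple size-over-bandwidth model. It is not taken from DFMCloudsim; every class, method, and number in it is a hypothetical illustration of the idea described in the abstract.

// Hypothetical sketch, not DFMCloudsim code: it only illustrates why migrating
// the needed fragments of a dataset can cost less than migrating the whole
// dataset, using a size / bandwidth transfer-time model.
import java.util.List;

public class FragmentMigrationSketch {

    /** A named fragment of a dataset, with its size in megabytes. */
    record Fragment(String id, double sizeMB) {}

    /** Transfer time (seconds) of a payload over an inter-data-center link. */
    static double transferTimeSec(double sizeMB, double bandwidthMBps) {
        return sizeMB / bandwidthMBps;
    }

    /** Time to ship every fragment, i.e. the whole dataset. */
    static double migrateWholeDataset(List<Fragment> dataset, double bandwidthMBps) {
        double totalMB = dataset.stream().mapToDouble(Fragment::sizeMB).sum();
        return transferTimeSec(totalMB, bandwidthMBps);
    }

    /** Time to ship only the fragments that the consumer task actually reads. */
    static double migrateNeededFragments(List<Fragment> dataset,
                                         List<String> neededIds,
                                         double bandwidthMBps) {
        double neededMB = dataset.stream()
                .filter(f -> neededIds.contains(f.id()))
                .mapToDouble(Fragment::sizeMB)
                .sum();
        return transferTimeSec(neededMB, bandwidthMBps);
    }

    public static void main(String[] args) {
        // Illustrative numbers only: four 10 GB fragments and a 100 MB/s link.
        List<Fragment> dataset = List.of(
                new Fragment("f1", 10_000), new Fragment("f2", 10_000),
                new Fragment("f3", 10_000), new Fragment("f4", 10_000));
        double bandwidthMBps = 100.0;

        double wholeSec = migrateWholeDataset(dataset, bandwidthMBps);
        double partSec  = migrateNeededFragments(dataset, List.of("f1"), bandwidthMBps);

        // A task that reads one fragment waits 100 s instead of 400 s: the kind of
        // transfer-overhead reduction that fragmentation-aware migration targets.
        System.out.printf("whole dataset: %.0f s, needed fragments: %.0f s%n", wholeSec, partSec);
    }
}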
Acknowledgments
L. B.: prepared the manuscript and performed the analysis and experiments. M. Z., C. T.: helped in the initial solution design. All authors reviewed the paper and approved the final version of the manuscript.

Availability of data and materials
All of the material is owned by the authors and can be accessed by email request.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Notes on contributors
Laila Bouhouch received her engineering degree in Computer Science from ENSA (National School of Applied Sciences), Ibn Zohr University, Agadir, Morocco, in 2017. She is currently a Ph.D. student in the Department of Computer Science, Laboratory CEDOC ST2I, ENSIAS, Rabat, Morocco. Her research interests include big data management in workflow systems, cloud computing, and distributed systems.

Mostapha Zbakh received his Ph.D. in computer science from the Polytechnic Faculty of Mons, Belgium, in 2001. He has been a Professor at ENSIAS (National School of Computer Science and System Analysis), Mohammed V University, Rabat, Morocco, since 2002. His research interests include load balancing, parallel and distributed systems, HPC, big data, and cloud computing.

Claude Tadonki currently holds a research position at Mines ParisTech/CRI, working on HPC topics and automatic code transformations. His background combines mathematics and computer science. From his Ph.D. onward, and across his subsequent positions, he has been involved in cutting-edge research in high-performance computing and operations research, following the sequence model, method, and implementation. He remains interested in fundamental questions about genuinely difficult problems, while striving to understand how advances in optimization, algorithmics, programming, and supercomputing can be efficiently combined to provide the best answer.
Journal description:
The International Journal of Computers and Applications (IJCA) is a unique platform for publishing novel ideas, research outcomes, and fundamental advances in all aspects of Computer Science, Computer Engineering, and Computer Applications. It is a peer-reviewed international journal that aims to provide the academic and industrial community with a platform for presenting original research ideas and applications. In addition to regular research papers within its scope, IJCA welcomes four special types of papers:
(a) Papers whose results can be easily reproduced. For such papers, the authors will be asked to upload "instructions for reproduction", possibly with the source code or stable URLs from which the code can be downloaded.
(b) Papers with negative results. For such papers, the experimental setting and the negative results must be presented in detail, and the importance of the negative results to the research community must be explained clearly. The rationale is that such papers help researchers choose correct approaches to solve problems and avoid approaches that have already been shown to fail.
(c) Detailed reports, case studies, and literature review articles about innovative software or hardware, new technology, high-impact computer applications, and future developments, with sufficient background and subject coverage.
(d) Special issue papers focusing on a particular theme of significant importance, or papers selected from a relevant conference with sufficient improvement and new material to differentiate them from the versions published in the conference proceedings.