{"title":"DFMCloudsim: cloudsim的扩展,用于在分布式数据中心上对数据片段迁移进行建模和仿真","authors":"Laila Bouhouch, Mostapha Zbakh, Claude Tadonki","doi":"10.1080/1206212x.2023.2277554","DOIUrl":null,"url":null,"abstract":"AbstractDue to the increasing volume of data for applications running on geographically distributed Cloud systems, the need for efficient data management has emerged as a crucial performance factor. Alongside basic task scheduling, the management of input data on distributed Cloud systems has become a genuine challenge, particularly with data-intensive applications. Ideally, each dataset should be stored in the same data center as its consumer tasks so as to lead to local data accesses only. However, when a given task does not need all items within one of its input datasets, sending that dataset entirely might lead to a severe time overhead. To address this concern, a data fragmentation strategy can be considered in order to partition the datasets and process them in that form. Such a strategy should be flexible enough to support any user-defined partitioning, and suitable enough to minimize the overhead of transferring the data in their fragmented form. To simulate and estimate the basic statistics of both fragmentation and migration mechanisms prior to an implementation in a real Cloud, we chose Cloudsim, with the goal of enhancing it with the corresponding extensions. Cloudsim is a popular simulator for Cloud Computing investigations. Our proposed extension is named DFMCloudsim, its goal is to provide an efficient module for implementing fragmentation and data migration strategies. We validate our extension using various simulated scenarios. The results indicate that our extension effectively achieves its main objectives and can reduce data transfer overhead by 74.75% compared to our previous work.Keywords: Cloud computingbig datacloudsimdata fragmentationdata migration AcknowledgmentsL. B.: prepared the manuscript, and performed analysis and experiments. M. Z., C. T.: helped in the initial solution design. All authors reviewed the paper and approved the final version of the manuscript.Availability of data and materialsAll of the material is owned by the authors and can be accessed by email request.Disclosure statementNo potential conflict of interest was reported by the author(s).Additional informationNotes on contributorsLaila BouhouchLaila Bouhouch received her engineer degree in Computer Science at ENSA (National School of Applied Sciences) at Ibn Zohr University, Agadir, Morocco, in 2017. She is currently a Ph.D. student in the Department of Computer Science, Laboratory CEDOC ST2I, ENSIAS, Rabat, Morocco. Her research interests include big data management in workflow systems, cloud computing and distributed systems.Mostapha ZbakhMostapha Zbakh received his Ph.D. in computer sciences from Polytechnic Faculty of Mons, Belgium, in 2001. He is currently a Professor at ENSIAS (National School of Computer Science and System Analysis) at Mohammed V University, Rabat, Morocco, since 2002. His research interests include load balancing, parallel and distributed systems, HPC, Big data and Cloud computing.Claude TadonkiClaude Tadonki currently holds a research position at Mines ParisTech/CRI, working on HPC topics and automatic code transformations. His background is a combination of mathematics and computer science. From his Ph.D. 
and during his different positions afterwards, he has been involved in cutting-edge researches related to high-performance computing and operation research, following the sequence model, method, and implementation. He is still interested in fundamental questions about difficult genuine problems, while striving to understand how the advances in optimization, algorithmic, programming, and supercomputers can be efficiently combined to provide the best answer.","PeriodicalId":39673,"journal":{"name":"International Journal of Computers and Applications","volume":"13 4","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DFMCloudsim: an extension of cloudsim for modeling and simulation of data fragments migration over distributed data centers\",\"authors\":\"Laila Bouhouch, Mostapha Zbakh, Claude Tadonki\",\"doi\":\"10.1080/1206212x.2023.2277554\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"AbstractDue to the increasing volume of data for applications running on geographically distributed Cloud systems, the need for efficient data management has emerged as a crucial performance factor. Alongside basic task scheduling, the management of input data on distributed Cloud systems has become a genuine challenge, particularly with data-intensive applications. Ideally, each dataset should be stored in the same data center as its consumer tasks so as to lead to local data accesses only. However, when a given task does not need all items within one of its input datasets, sending that dataset entirely might lead to a severe time overhead. To address this concern, a data fragmentation strategy can be considered in order to partition the datasets and process them in that form. Such a strategy should be flexible enough to support any user-defined partitioning, and suitable enough to minimize the overhead of transferring the data in their fragmented form. To simulate and estimate the basic statistics of both fragmentation and migration mechanisms prior to an implementation in a real Cloud, we chose Cloudsim, with the goal of enhancing it with the corresponding extensions. Cloudsim is a popular simulator for Cloud Computing investigations. Our proposed extension is named DFMCloudsim, its goal is to provide an efficient module for implementing fragmentation and data migration strategies. We validate our extension using various simulated scenarios. The results indicate that our extension effectively achieves its main objectives and can reduce data transfer overhead by 74.75% compared to our previous work.Keywords: Cloud computingbig datacloudsimdata fragmentationdata migration AcknowledgmentsL. B.: prepared the manuscript, and performed analysis and experiments. M. Z., C. T.: helped in the initial solution design. All authors reviewed the paper and approved the final version of the manuscript.Availability of data and materialsAll of the material is owned by the authors and can be accessed by email request.Disclosure statementNo potential conflict of interest was reported by the author(s).Additional informationNotes on contributorsLaila BouhouchLaila Bouhouch received her engineer degree in Computer Science at ENSA (National School of Applied Sciences) at Ibn Zohr University, Agadir, Morocco, in 2017. She is currently a Ph.D. student in the Department of Computer Science, Laboratory CEDOC ST2I, ENSIAS, Rabat, Morocco. 
Her research interests include big data management in workflow systems, cloud computing and distributed systems.Mostapha ZbakhMostapha Zbakh received his Ph.D. in computer sciences from Polytechnic Faculty of Mons, Belgium, in 2001. He is currently a Professor at ENSIAS (National School of Computer Science and System Analysis) at Mohammed V University, Rabat, Morocco, since 2002. His research interests include load balancing, parallel and distributed systems, HPC, Big data and Cloud computing.Claude TadonkiClaude Tadonki currently holds a research position at Mines ParisTech/CRI, working on HPC topics and automatic code transformations. His background is a combination of mathematics and computer science. From his Ph.D. and during his different positions afterwards, he has been involved in cutting-edge researches related to high-performance computing and operation research, following the sequence model, method, and implementation. He is still interested in fundamental questions about difficult genuine problems, while striving to understand how the advances in optimization, algorithmic, programming, and supercomputers can be efficiently combined to provide the best answer.\",\"PeriodicalId\":39673,\"journal\":{\"name\":\"International Journal of Computers and Applications\",\"volume\":\"13 4\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-11-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Computers and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1080/1206212x.2023.2277554\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computers and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1080/1206212x.2023.2277554","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Computer Science","Score":null,"Total":0}
DFMCloudsim: an extension of cloudsim for modeling and simulation of data fragments migration over distributed data centers
Abstract

Due to the increasing volume of data for applications running on geographically distributed Cloud systems, the need for efficient data management has emerged as a crucial performance factor. Alongside basic task scheduling, the management of input data on distributed Cloud systems has become a genuine challenge, particularly for data-intensive applications. Ideally, each dataset should be stored in the same data center as its consumer tasks, so that all data accesses are local. However, when a given task does not need all items within one of its input datasets, sending that dataset in its entirety might incur a severe time overhead. To address this concern, a data fragmentation strategy can be considered in order to partition the datasets and process them in that form. Such a strategy should be flexible enough to support any user-defined partitioning, and suitable enough to minimize the overhead of transferring the data in their fragmented form. To simulate and estimate the basic statistics of both the fragmentation and migration mechanisms prior to an implementation in a real Cloud, we chose Cloudsim, a popular simulator for Cloud Computing investigations, with the goal of enhancing it with the corresponding extensions. Our proposed extension, named DFMCloudsim, aims to provide an efficient module for implementing fragmentation and data migration strategies. We validate our extension using various simulated scenarios. The results indicate that our extension effectively achieves its main objectives and can reduce data transfer overhead by 74.75% compared to our previous work.

Keywords: Cloud computing; big data; cloudsim; data fragmentation; data migration
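To make the transfer-overhead argument above concrete, the following self-contained Java sketch (CloudSim itself is written in Java) models a dataset as a list of fragments and compares the time to migrate the whole dataset with the time to migrate only the fragments a task actually needs, using a simple size-over-bandwidth model. It is not taken from DFMCloudsim; every class, method, and number in it is a hypothetical illustration of the idea described in the abstract.

// Hypothetical sketch, not DFMCloudsim code: it only illustrates why migrating
// the needed fragments of a dataset can cost less than migrating the whole
// dataset, using a size / bandwidth transfer-time model.
import java.util.List;

public class FragmentMigrationSketch {

    /** A named fragment of a dataset, with its size in megabytes. */
    record Fragment(String id, double sizeMB) {}

    /** Transfer time (seconds) of a payload over an inter-data-center link. */
    static double transferTimeSec(double sizeMB, double bandwidthMBps) {
        return sizeMB / bandwidthMBps;
    }

    /** Time to ship every fragment, i.e. the whole dataset. */
    static double migrateWholeDataset(List<Fragment> dataset, double bandwidthMBps) {
        double totalMB = dataset.stream().mapToDouble(Fragment::sizeMB).sum();
        return transferTimeSec(totalMB, bandwidthMBps);
    }

    /** Time to ship only the fragments that the consumer task actually reads. */
    static double migrateNeededFragments(List<Fragment> dataset,
                                         List<String> neededIds,
                                         double bandwidthMBps) {
        double neededMB = dataset.stream()
                .filter(f -> neededIds.contains(f.id()))
                .mapToDouble(Fragment::sizeMB)
                .sum();
        return transferTimeSec(neededMB, bandwidthMBps);
    }

    public static void main(String[] args) {
        // Illustrative numbers only: four 10 GB fragments and a 100 MB/s link.
        List<Fragment> dataset = List.of(
                new Fragment("f1", 10_000), new Fragment("f2", 10_000),
                new Fragment("f3", 10_000), new Fragment("f4", 10_000));
        double bandwidthMBps = 100.0;

        double wholeSec = migrateWholeDataset(dataset, bandwidthMBps);
        double partSec  = migrateNeededFragments(dataset, List.of("f1"), bandwidthMBps);

        // A task that reads one fragment waits 100 s instead of 400 s: the kind of
        // transfer-overhead reduction that fragmentation-aware migration targets.
        System.out.printf("whole dataset: %.0f s, needed fragments: %.0f s%n", wholeSec, partSec);
    }
}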
Acknowledgments
L. B.: prepared the manuscript and performed the analysis and experiments. M. Z., C. T.: helped in the initial solution design. All authors reviewed the paper and approved the final version of the manuscript.

Availability of data and materials
All of the material is owned by the authors and can be accessed by email request.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Notes on contributors
Laila Bouhouch received her engineering degree in Computer Science from ENSA (National School of Applied Sciences), Ibn Zohr University, Agadir, Morocco, in 2017. She is currently a Ph.D. student in the Department of Computer Science, Laboratory CEDOC ST2I, ENSIAS, Rabat, Morocco. Her research interests include big data management in workflow systems, cloud computing, and distributed systems.

Mostapha Zbakh received his Ph.D. in computer science from the Polytechnic Faculty of Mons, Belgium, in 2001. He has been a Professor at ENSIAS (National School of Computer Science and System Analysis), Mohammed V University, Rabat, Morocco, since 2002. His research interests include load balancing, parallel and distributed systems, HPC, big data, and cloud computing.

Claude Tadonki currently holds a research position at Mines ParisTech/CRI, working on HPC topics and automatic code transformations. His background combines mathematics and computer science. From his Ph.D. onward, and across his subsequent positions, he has been involved in cutting-edge research in high-performance computing and operations research, following the sequence model, method, and implementation. He remains interested in fundamental questions about genuinely difficult problems, while striving to understand how advances in optimization, algorithmics, programming, and supercomputing can be efficiently combined to provide the best answer.
Journal description:
The International Journal of Computers and Applications (IJCA) is a unique platform for publishing novel ideas, research outcomes, and fundamental advances in all aspects of Computer Science, Computer Engineering, and Computer Applications. It is a peer-reviewed international journal that aims to provide the academic and industrial community with a platform for presenting original research ideas and applications. In addition to regular research papers within its scope, IJCA welcomes four special types of papers:
(a) Papers whose results can be easily reproduced. For such papers, the authors will be asked to upload "instructions for reproduction", possibly with the source code or stable URLs from which the code can be downloaded.
(b) Papers with negative results. For such papers, the experimental setting and the negative results must be presented in detail, and the importance of the negative results to the research community must be explained clearly. The rationale is that such papers help researchers choose correct approaches to solve problems and avoid approaches that have already been shown to fail.
(c) Detailed reports, case studies, and literature review articles about innovative software or hardware, new technology, high-impact computer applications, and future developments, with sufficient background and subject coverage.
(d) Special issue papers focusing on a particular theme of significant importance, or papers selected from a relevant conference with sufficient improvement and new material to differentiate them from the versions published in the conference proceedings.