In current multicore systems, cache memory is shared between multiple concurrent threads. Allocating the proper amount of cache to each thread is crucial to achieving high performance. Cache management in many existing systems is based on the least recently used (LRU) replacement policy, which can lead to adverse contention between threads for shared cache space. Cache partitioning is a technique that reserves a certain amount of cache for each thread, and has been shown to work well in practice. We introduce the problem of determining the optimal cache partitioning to minimize the makespan for completing a set of tasks. We analyze the problem using a model that generalizes a widely used empirical model for cache miss rates. Our first contribution is a mathematical characterization of the properties satisfied by an optimal partitioning. Second, we present an algorithm that finds a (1 + ε)-approximation to the optimal partitioning in O(n log(n/ε) log(n/(εp))) time, where n is the number of tasks and p is a value that depends on the optimal solution. We compare our algorithm with several partitioning schemes used in practice or proposed in the literature. Simulations show that our algorithm achieves a 22-59% better makespan than these algorithms.
Pan Lai and Rui Fan, "Makespan-Optimal Cache Partitioning," 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2013). doi:10.1109/MASCOTS.2013.28
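The abstract does not spell out the miss-rate model or the approximation algorithm; as a rough illustration only, the sketch below assumes a power-law miss-rate curve (a common empirical model) and uses a simple greedy heuristic, not the authors' method: repeatedly hand one spare cache unit to the task that currently dominates the makespan. All parameter names and values are hypothetical.

```python
def task_time(base, misses, alpha, cache, penalty=100):
    # Hypothetical power-law miss-rate model: miss cost scales as cache^(-alpha).
    return base + penalty * misses * cache ** (-alpha)

def partition_cache(tasks, total_cache):
    """Greedy heuristic (not the paper's algorithm): repeatedly give one
    cache unit to the task that currently dominates the makespan.
    Each task is a (base, misses, alpha) tuple."""
    alloc = [1] * len(tasks)  # start each task with one cache unit
    for _ in range(total_cache - len(tasks)):
        # index of the task with the largest current completion time
        worst = max(range(len(tasks)),
                    key=lambda i: task_time(*tasks[i], alloc[i]))
        alloc[worst] += 1
    makespan = max(task_time(*tasks[i], alloc[i]) for i in range(len(tasks)))
    return alloc, makespan
```

With concave miss-rate curves, this "reduce the current maximum" rule is a natural baseline against which an optimal partitioning can be compared.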
Like Zhou, Song Wu, Huahua Sun, Hai Jin, Xuanhua Shi
With the prevalence of multicore processors in computer systems, many soft real-time applications, such as media-based ones, use parallel programming models to better utilize hardware resources and possibly shorten response time. Meanwhile, virtualization technology is widely used in cloud data centers, and more and more cloud services, including such parallel soft real-time applications, run in virtualized environments. However, current hypervisors do not provide adequate support for them because of soft real-time constraints and synchronization problems, which result in frequent deadline misses and serious performance degradation. CPU schedulers in the underlying hypervisors are central to these issues. In this paper, we identify and analyze CPU scheduling problems in hypervisors, and propose a novel scheduling algorithm that considers both soft real-time constraints and synchronization problems. In our proposed method, real-time priority is introduced to accelerate event processing of parallel soft real-time applications, and a dynamic time slice is used to schedule virtual CPUs. In addition, all runnable virtual CPUs of virtual machines running parallel soft real-time applications are scheduled simultaneously to address synchronization problems. We implement a parallel soft real-time scheduler, named Poris, based on Xen. Our evaluation shows Poris can significantly improve the performance of parallel soft real-time applications. For example, compared to the Credit scheduler, Poris improves the performance of a media player by up to a factor of 1.35, and shortens the execution time of the PARSEC benchmark by up to 44.12%.
Like Zhou, Song Wu, Huahua Sun, Hai Jin and Xuanhua Shi, "Virtual Machine Scheduling for Parallel Soft Real-Time Applications," MASCOTS 2013. doi:10.1109/MASCOTS.2013.74
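Poris's internals are not given in the abstract; the toy picker below only illustrates the co-scheduling idea it describes, i.e. selecting all runnable vCPUs of a soft real-time VM together while other VMs are served one vCPU at a time. The data layout and function name are hypothetical.

```python
def pick_next(run_queues, realtime_vms):
    """Gang-style pick, a sketch of co-scheduling: if any VM marked as
    soft real-time has runnable vCPUs, schedule all of them together
    so synchronizing threads make progress simultaneously; otherwise
    pick a single vCPU from the first VM with work pending."""
    for vm, vcpus in run_queues.items():
        if vm in realtime_vms and vcpus:
            return vm, list(vcpus)  # co-schedule every runnable vCPU
    for vm, vcpus in run_queues.items():
        if vcpus:
            return vm, [vcpus[0]]  # ordinary VMs get one vCPU at a time
    return None, []
```

A real hypervisor scheduler would additionally account for credits and time slices; this sketch captures only the simultaneous-dispatch decision.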
Taking photos of landmarks is a favorite and popular way for travellers to keep memories of places they have visited. Community-contributed photo collections, such as those on Flickr, provide us with an opportunity to gain a more in-depth understanding of a landmark's visual appeal. While much current research focuses on recommending which representative photos should be selected from such pervasive photo sources, our work aims to find where a visitor can capture his or her own beautiful, personal photo of a queried landmark. We believe that this aspect of helping users to take memorable photos has not been well studied. We propose a method to recommend a list of shooting locations that have the utmost potential for capturing appealing photos of a landmark of interest. A Gaussian mixture model based clustering approach is applied to the camera locations from an existing photo repository, generating a set of regions each of which covers an area with sufficient semantics, e.g., a route section. These camera locations are scored and ranked through multiple criteria, including their potential for better visual aesthetics, overall social attractiveness, popularity, etc. Additionally, we investigate the temporal characteristics of these locations by considering the spatio-temporal space. A number of different recommendations are generated from these results, such as the best camera positions at different times throughout a single day, or the best visiting time in the same spatial area. Subjective evaluation studies have been conducted, and they indicate that our work generates promising results.
Y. Zhang and Roger Zimmermann, "Camera Shooting Location Recommendations for Landmarks in Geo-space," MASCOTS 2013. doi:10.1109/MASCOTS.2013.25
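The abstract names the ranking criteria but not the scoring function; a minimal sketch of the multi-criteria ranking step might look as follows, with the GMM clustering step omitted and the criteria assumed to be pre-normalized to [0, 1]. The field names and weights are illustrative, not the paper's.

```python
def location_score(loc, weights=(0.5, 0.3, 0.2)):
    """Weighted multi-criteria score for a candidate shooting location.
    Criteria (aesthetics, social attractiveness, popularity) are assumed
    normalized to [0, 1]; the weights are hypothetical."""
    w_aes, w_soc, w_pop = weights
    return (w_aes * loc["aesthetics"]
            + w_soc * loc["social"]
            + w_pop * loc["popularity"])

def recommend(locations, k=3):
    """Return the top-k candidate shooting locations by score."""
    return sorted(locations, key=location_score, reverse=True)[:k]
```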
Metadata snapshots are a common method for gaining insight into file systems due to their small size and relative ease of acquisition. Since they are static, most researchers have used them for relatively simple analyses such as file size distributions and age of files. We hypothesize that it is possible to gain much richer insights into file system and user behavior by clustering features in metadata snapshots and comparing the entropy within clusters to the entropy within natural partitions such as directory hierarchies. We discuss several different methods for gaining deeper insights into metadata snapshots, and show a small proof of concept using data from Los Alamos National Laboratories. In our initial work, we see evidence that it is possible to identify user locality information, traditionally the purview of dynamic traces, using a single static snapshot.
Avani Wildani, I. Adams and E. L. Miller, "Single-Snapshot File System Analysis," MASCOTS 2013. doi:10.1109/MASCOTS.2013.47
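The entropy comparison described above can be sketched concretely: compute the Shannon entropy of a metadata feature (here, file owner) within each natural partition such as a top-level directory. The toy snapshot, paths, and the choice of owner as the feature are all hypothetical; the paper's actual feature set and clustering method are not specified in the abstract.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of categorical labels."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical toy metadata snapshot: (path, owner) pairs.
snapshot = [
    ("/home/alice/a.txt", "alice"), ("/home/alice/b.txt", "alice"),
    ("/home/bob/c.txt", "bob"),     ("/scratch/run1.dat", "alice"),
    ("/scratch/run2.dat", "bob"),   ("/scratch/run3.dat", "carol"),
]

def partition_entropy(snapshot):
    """Owner-entropy within each top-level directory partition.
    Low entropy means the partition is homogeneous for that feature."""
    parts = {}
    for path, owner in snapshot:
        parts.setdefault(path.split("/")[1], []).append(owner)
    return {d: entropy(owners) for d, owners in parts.items()}
```

Comparing such per-partition entropies against the entropy of clusters found by feature clustering is the kind of contrast the paper proposes.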
Peak power shaving allows data center providers to increase their computational capacity without exceeding a given power budget. Recent papers establish that machines may repurpose energy from uninterruptible power supplies (UPSs) to maintain power budgets during peak demand. Our paper demonstrates that existing studies overestimate cost savings by as much as 3.35x because they rely on simple battery reliability models and Boolean (all-or-nothing) battery discharge, and neglect the design and cost of battery system communication in state-of-the-art distributed UPS designs. We propose an architecture where batteries provide only a fraction of the data center power, exploiting nonlinear battery capacity properties to achieve longer battery life and longer peak shaving durations. This architecture demonstrates that a centralized UPS with partial discharge reduces cost sufficiently that double power conversion losses are not a limiting factor, contradicting recent trends in warehouse-scale distributed UPS design. Our architecture increases battery lifetime by 78%, doubles the cost savings compared to the distributed design (corresponding to $75K/month savings for a 10MW data center) and reduces decision coordination latency by 4x relative to state-of-the-art distributed designs.
Baris Aksanli, Eddie Pettis and T. Simunic, "Architecting Efficient Peak Power Shaving Using Batteries in Data Centers," MASCOTS 2013. doi:10.1109/MASCOTS.2013.32
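The abstract's "nonlinear battery capacity properties" are not detailed; the classic model for this effect is Peukert's law, under which runtime shrinks superlinearly with discharge current, so drawing only a fraction of the peak from batteries delivers more total charge per discharge. This is only an illustration of the general phenomenon, not the paper's model, and every numeric value below is made up.

```python
def peukert_runtime(capacity_ah, rated_hours, current_a, k=1.15):
    """Peukert's law: runtime = H * (C / (I * H))^k, where C is the rated
    capacity (Ah) at rated discharge time H (hours) and I is the actual
    current (A). k ~= 1.1-1.3 for lead-acid cells; k = 1 is an ideal cell."""
    return rated_hours * (capacity_ah / (current_a * rated_hours)) ** k

# Hypothetical 100 Ah battery rated over 20 hours.
full = peukert_runtime(100, 20, current_a=50)     # batteries cover the full peak
partial = peukert_runtime(100, 20, current_a=10)  # batteries cover a fraction of it
```

At the lower current, the battery not only runs longer but delivers more total charge (runtime * current), which is why partial discharge extends peak-shaving duration.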
Write amplification poses endurance challenges to NAND flash-based solid state disks (SSDs), degrading their write endurance and lifetime. A large write amplification consumes program/erase cycles (P/Es) of the NAND flash chips and reduces the endurance and performance of SSDs. Write amplification is mainly caused by garbage collection, wear-leveling, metadata updates, and mapping table updates, and is defined as the ratio of the data volume written by the SSD controller to the data volume written by the host. In this paper, we propose a four-level model of write amplification for SSDs, covering the channel, chip, die, and plane levels. In light of this model, we design a method for analyzing the write amplification of SSDs to trace SSD endurance and performance by incorporating the Ready/Busy (R/B) signal of NAND flash. Our practical approach measures write amplification for an entire SSD rather than for individual NAND flash chips. To validate our measurement technique and model, we implement a verified SSD (vSSD) system and perform a cross-comparison on a set of SSDs stressed by micro-benchmarks and I/O traces. A new method is adopted in our measurements to study the R/B signals of the NAND flash chips in an SSD. Experimental results show that our model is accurate and the measurement technique is generally applicable to any SSD.
Hui Sun, X. Qin, Fei Wu and C. Xie, "Measuring and Analyzing Write Amplification Characteristics of Solid State Disks," MASCOTS 2013. doi:10.1109/MASCOTS.2013.29
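The definition of write amplification given in the abstract is easy to make concrete. The byte counts below are hypothetical; they merely show how garbage-collection copies and mapping updates inflate the controller's write volume relative to the host's.

```python
def write_amplification(controller_bytes, host_bytes):
    """WA = data volume written by the SSD controller / data volume
    written by the host (per the definition in the abstract)."""
    return controller_bytes / host_bytes

# Hypothetical accounting: the host writes one 4 KiB page, but garbage
# collection relocates two more 4 KiB pages and the FTL rewrites a
# 4 KiB mapping page.
host = 4096
controller = 4096 + 2 * 4096 + 4096  # host write + GC copies + mapping update
wa = write_amplification(controller, host)  # 4.0 in this toy scenario
```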
This paper describes a method to obtain symbolic solutions of large stochastic models using Gauss-Jordan elimination. Such a solution is an efficient alternative to standard simulations, and it allows fast and exact solution of very large and complex models that are hard to handle even with iterative numerical methods. The proposed method assumes the system is described as a structured (modular) Markovian system with discrete states for each system module and transitions among those states governed by Markovian processes. The mathematical representation of such a system is given by a Kronecker (tensor) formula, i.e., a tensor formulation of small matrices representing each system module's transitions and occasional dependencies among modules. Preliminary results indicate the expected efficiency of the proposed solution.
Paulo Fernandes, Lucelene Lopes and S. Yeralan, "Symbolic Solution of Kronecker-Based Structured Markovian Models," MASCOTS 2013. doi:10.1109/MASCOTS.2013.62
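The Kronecker formulation above composes small per-module matrices into the transition structure of the full system without ever storing the global matrix monolithically. The operator itself is a few lines; this sketch shows only the Kronecker product on dense lists-of-lists, not the paper's Gauss-Jordan solution procedure.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of lists.
    Entry (i, j) of the result is a[i // rb][j // cb] * b[i % rb][j % cb],
    so an ra x ca and an rb x cb matrix compose into (ra*rb) x (ca*cb):
    the state space of two modules combines multiplicatively."""
    ra, ca = len(a), len(a[0])
    rb, cb = len(b), len(b[0])
    return [[a[i // rb][j // cb] * b[i % rb][j % cb]
             for j in range(ca * cb)]
            for i in range(ra * rb)]
```

For example, two 2-state modules yield a 4-state composed system, and chaining `kron` over all module matrices reproduces the tensor formula's structure.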
Zhenjiang Dong, Jun Wang, G. Riley, S. Yalamanchili
There has been little research on the effect of partitioning on parallel simulation of multicore systems. This paper presents our study of this important problem in the context of the Null-message-based synchronization algorithm for parallel multicore simulation. We focus on coarse-grained parallel simulation, where each core and its cache slices are modeled within a single logical process (LP) and different partitioning schemes are applied only to the interconnection network. We show that encapsulating the entire on-chip interconnection network in a single logical process is an impediment to scalable simulation. This baseline partitioning and two other schemes are investigated. Experiments are conducted on a subset of the PARSEC benchmarks with 16-, 32-, 64- and 128-core models. Results show that the partitioning scheme has a significant impact on simulation performance and parallel efficiency. Beyond a certain system scale, one scheme consistently outperforms the other two, and the performance and efficiency gaps increase with the size of the model, with up to 4.1 times faster speed and 277% better efficiency for 128-core models. We explain the reasons for this behavior, which can be traced to the features of the Null-message-based synchronization algorithm. Because of this, we believe that if a component has an increasing number of inter-LP interactions with increasing system size, it should be partitioned into several sub-components to achieve better performance.
Zhenjiang Dong, Jun Wang, G. Riley and S. Yalamanchili, "A Study of the Effect of Partitioning on Parallel Simulation of Multicore Systems," MASCOTS 2013. doi:10.1109/MASCOTS.2013.55
In recent years, cloud storage systems have emerged as the primary solution for online storage and information sharing. Thanks to efficient storage and bandwidth utilization, erasure codes and network coding have proven effective in providing fault tolerance and fast content retrieval in cloud storage systems. In a nutshell, coded blocks are distributed among storage nodes, and file retrieval is accomplished by downloading sufficiently many coded blocks from any group of storage nodes. However, because each coded block depends on the original file, even a single-byte update invalidates all coded blocks in the system. In this paper, we introduce DeltaNC, a new differential update algorithm that keeps all coded blocks in a network-coding-based cloud storage system synchronized by transmitting only the changes to the file. Our experimental results, from a trace-driven simulator, show that DeltaNC significantly reduces bandwidth and CPU usage, with performance comparable to that of the Diff program, the common tool for updating files.
M. R. Zakerinasab and Mea Wang, "DeltaNC: Efficient File Updates for Network-Coding-Based Cloud Storage Systems," MASCOTS 2013. doi:10.1109/MASCOTS.2013.52
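DeltaNC's algorithm is not spelled out in the abstract, but the linearity that makes differential updates possible can be sketched. For simplicity this uses coding over GF(2), where coefficients are bits and addition is XOR; real systems typically code over larger fields such as GF(2^8). Because encoding is linear, XOR-ing a coded block with the (coefficient-scaled) difference between the old and new data block yields exactly the re-encoded result, so only the change needs to travel.

```python
def xor(a, b):
    """Bytewise XOR of two equal-length byte strings (addition in GF(2))."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(blocks, coeffs):
    """Coded block over GF(2): XOR of the blocks whose coefficient is 1."""
    out = bytes(len(blocks[0]))
    for blk, c in zip(blocks, coeffs):
        if c:
            out = xor(out, blk)
    return out

def apply_delta(coded, coeff, old_block, new_block):
    """Differential update: propagate only (old XOR new) into the coded
    block, instead of re-encoding the whole file."""
    return xor(coded, xor(old_block, new_block)) if coeff else coded
```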
Researchers and manufacturers are currently putting considerable effort into designing, improving and deploying the Internet of Things, which involves large numbers of constrained, low-cost embedded devices deployed at scale with low power consumption, low bandwidth and limited communication range. For instance, a network of sensors distributed throughout a building can monitor the temperature in different offices. This kind of architecture is generally centralized, as the sensors are mainly programmed to periodically transmit their data to the sink. The IPv6 Routing Protocol for Low-power and Lossy Networks (RPL) was designed to enable such communications, and support for point-to-point traffic is also available. However, new applications may require peer-to-peer communication between arbitrary nodes of the network. In that case RPL is not optimal, as data packets are forwarded along longer paths with larger metrics. In this paper we study the effectiveness of RPL compared to a shortest-path algorithm such as Dijkstra's algorithm. We analyze peer-to-peer communications in random wireless sensor network topologies of up to 250 nodes, corresponding to a reasonable cluster size. We have built a dedicated simulation environment named Network Analysis and Routing eVALuation (NARVAL). This toolbox generates random topologies in order to study the impact of routing algorithms on the effectiveness of communication protocols. In our work, we first generated many random network topologies and selected a sink node in each. We built the Destination Oriented Directed Acyclic Graph (DODAG) rooted at the chosen sink according to the RPL algorithm. We then computed the paths between every pair of distinct sensor nodes and compared them to the corresponding shortest paths obtained by Dijkstra's algorithm. This approach yields statistics on the path extension of RPL relative to Dijkstra's algorithm. We also analyzed the impact of the sink position and the network size on this path extension.
F. Melakessou and T. Engel, "Path Extension Analysis of Peer-to-Peer Communications in Small 6LoWPAN/RPL Sensor Networks," 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, 2013. doi:10.1109/MASCOTS.2013.40
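The measurement described above can be sketched in a few lines (this is not the NARVAL toolbox, and a BFS tree rooted at the sink stands in for a real RPL DODAG; the six-node ring topology is a hypothetical example): route peer-to-peer traffic up the tree until the two branches meet, then down, and divide the hop count by the true shortest-path length to get the path extension (stretch).

```python
from collections import deque

def bfs_levels(graph, root):
    """Breadth-first search returning parent and depth maps from `root`."""
    parent, depth = {root: None}, {root: 0}
    q = deque([root])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in depth:
                parent[v], depth[v] = u, depth[u] + 1
                q.append(v)
    return parent, depth

def tree_hops(parent, depth, a, b):
    """Hop count when a packet climbs toward the root until the two
    branches meet, then descends: the tree-routing path from a to b."""
    hops = 0
    while depth[a] > depth[b]:
        a, hops = parent[a], hops + 1
    while depth[b] > depth[a]:
        b, hops = parent[b], hops + 1
    while a != b:
        a, b, hops = parent[a], parent[b], hops + 2
    return hops

# A 6-node ring with the sink at node 0 (hypothetical topology).
graph = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0]}
parent, depth = bfs_levels(graph, 0)   # sink-rooted tree
_, shortest = bfs_levels(graph, 2)     # hop distances from node 2
stretch = tree_hops(parent, depth, 2, 4) / shortest[4]
print(stretch)  # tree route 2-1-0-5-4 (4 hops) vs shortest 2-3-4 (2 hops)
```

Averaging this ratio over all node pairs and many random topologies gives exactly the kind of path-extension statistic the study reports, and repeating the experiment with different sink placements exposes the sink-position effect.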