Exploiting Spatial Locality to Improve Disk Efficiency in Virtualized Environments
Xiao Ling, Shadi Ibrahim, Hai Jin, Song Wu, Songqiao Tao
DOI: 10.1109/MASCOTS.2013.27
Virtualization has become a prominent tool in data centers and is extensively leveraged in cloud environments: it enables multiple virtual machines (VMs), each with its own operating system and applications, to run within a single physical server. However, virtualization introduces the challenge of preserving high disk utilization (i.e., reducing seek delay and rotational overhead) when allocating disk resources to VMs. Exploiting spatial locality, a key technique for improving disk utilization and performance, faces additional challenges in the virtualized cloud because of the transparency feature of virtualization (hypervisors have no information about the access patterns of applications running within each VM). To this end, this paper contributes a novel disk I/O scheduling framework, named Pregather, which improves disk I/O efficiency by exposing and exploiting the special spatial locality of the virtualized environment (regional and sub-regional spatial locality, corresponding to the virtual disk space and to applications' access patterns, respectively), thereby improving the performance of disk-intensive applications without harming the transparency feature of virtualization (i.e., without a priori knowledge of the applications' access patterns). The key idea behind Pregather is an intelligent model that predicts the access regularity of sub-regional spatial locality for each VM. We implement the Pregather disk scheduling framework and perform extensive experiments involving multiple simultaneous applications, using both synthetic benchmarks and a MapReduce application, on Xen-based platforms. Our experiments demonstrate the accuracy of our prediction model and indicate that Pregather achieves high disk spatial locality and a significant improvement in disk throughput and application performance.
{"title":"Exploiting Spatial Locality to Improve Disk Efficiency in Virtualized Environments","authors":"Xiao Ling, Shadi Ibrahim, Hai Jin, Song Wu, Songqiao Tao","doi":"10.1109/MASCOTS.2013.27","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.27","url":null,"abstract":"Virtualization has become a prominent tool in data centers and is extensively leveraged in cloud environments: it enables multiple virtual machines (VMs) - with multiple operating systems and applications - to run within a physical server. However, virtualization introduces the challenging issue of preserving the high disk utilization (i.e., reducing the seek delay and rotation overhead) when allocating disk resources to VMs. Exploiting spatial locality, a key technique for improving disk utilization and performance, faces additional challenges in the virtualized cloud because of the transparency feature of virtualization (hyper visors do not have the information about the access patterns of applications running within each VM). To this end, this paper contributes a novel disk I/O scheduling framework, named Pregather, to improve disk I/O efficiency through exposure and exploitation of the special spatial locality in the virtualized environment (regional and sub-regional spatial locality corresponds to the virtual disk space and applications' access patterns, respectively), thereby improving the performance of disk-intensive applications without harming the transparency feature of virtualization (without a priori knowledge of the applications' access patterns). The key idea behind Pregather is to implement an intelligent model to predict the access regularity of sub-regional spatial locality for each VM. We implement the Pregather disk scheduling framework and perform extensive experiments that involve multiple simultaneous applications of both synthetic benchmarks and a MapReduce application on Xen-based platforms. Our experiments demonstrate the accuracy of our prediction model and indicate that Pregather results in the high disk spatial locality and a significant improvement in disk throughput and application performance.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129474285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Online Energy Budgeting for Virtualized Data Centers
M. A. Islam, Shaolei Ren, Gang Quan
DOI: 10.1109/MASCOTS.2013.64
Increasingly serious concerns about IT carbon footprints have been pushing data center operators to cap their (brown) energy consumption. Naturally, energy capping involves deciding on energy usage over a long timescale (without foreseeing the far future); we call this process "energy budgeting". The goal of this paper is to study energy budgeting for virtualized data centers from an algorithmic perspective: we develop a provably efficient online algorithm, called eBud (energy Budgeting), which determines server CPU speed and the resource allocation to virtual machines so as to minimize data center operational cost while satisfying a long-term energy capping constraint in an online fashion. We rigorously prove that eBud achieves a close-to-minimum cost compared to the optimal offline algorithm with future information, while bounding the potential violation of the energy budget constraint, in an almost arbitrarily random environment. We also perform a trace-based simulation study to complement the analysis. The simulation results are consistent with our theoretical analysis and show that eBud reduces cost by more than 60% (compared to a state-of-the-art prediction-based algorithm) while incurring zero energy budget deficit.
{"title":"Online Energy Budgeting for Virtualized Data Centers","authors":"M. A. Islam, Shaolei Ren, Gang Quan","doi":"10.1109/MASCOTS.2013.64","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.64","url":null,"abstract":"Increasingly serious concerns about the IT carbon footprints have been pushing data center operators to cap their (brown) energy consumption. Naturally, achieving energy capping involves deciding the energy usage over a long timescale (without foreseeing the far future) and hence, we call this process \"energy budgeting\". The specific goal of this paper is to study energy budgeting for virtualized data centers from an algorithmic perspective: we develop a provably-efficient online algorithm, called eBud (energy Budgeting), which determines server CPU speed and resource allocation to virtual machines for minimizing the data center operational cost while satisfying the long-term energy capping constraint in an online fashion. We rigorously prove that eBud achieves a close-to-minimum cost compared to the optimal offline algorithm with future information, while bounding the potential violation of energy budget constraint, in an almost arbitrarily random environment. We also perform a trace-based simulation study to complement the analysis. The simulation results are consistent with our theoretical analysis and show that eBud reduces the cost by more than 60% (compared to state-of-the-art prediction-based algorithm) while resulting in a zero energy budget deficit.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133446259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Channel and Receiver Contention in Optical Flow Switching Networks
Joobum Kim, Yamini Jayabal, M. Razo, M. Tacca, A. Fumagalli
DOI: 10.1109/MASCOTS.2013.59
An increasing number of users perform large transfers over data networks. While, for the most part, these transfers are currently carried over the IP network, a number of studies advocate the use of end-to-end optical circuits to support these resource-consuming jobs. A major advantage is the ability to carry a large fraction of the overall network traffic using relatively lower-cost and lower-power optical equipment compared to IP routers. For example, in an optical flow switching network, end-to-end optical circuits can be established by reserving wavelength channels only when needed. Once the circuit is established, the large data set is seamlessly transferred across the network without requiring IP routers to be involved in the data transfer. For a circuit to be successfully established, three conditions must be simultaneously met: a transmitter must be available at the sender, a receiver must be available at the destination, and a wavelength channel must be available across the network to connect the sender to the destination. Data transfer can start only when these conditions are simultaneously met; as a result, a request can experience a delay before its circuit is established. Network throughput and delay are affected by the availability of network channels (channel contention) and of the end-user's receiver (receiver contention). The contribution of this paper is twofold. First, channel throughput and delay are estimated analytically. Second, the analytical results are validated against simulation results. A number of experiments are conducted using the presented analytical models and simulation platform to investigate the effect of channel and receiver contention on throughput and delay.
{"title":"Channel and Receiver Contention in Optical Flow Switching Networks","authors":"Joobum Kim, Yamini Jayabal, M. Razo, M. Tacca, A. Fumagalli","doi":"10.1109/MASCOTS.2013.59","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.59","url":null,"abstract":"An increasing number of users perform large transfers over data networks. While, for the most part, these transfers are currently performed over the IP network, a number of studies advocate the use of end-to-end optical circuits to support these resource-consuming jobs. One of the major advantages is the ability to carry a large fraction of the overall network traffic using the relatively lower-cost and lower-power optical equipment, when compared to IP routers. For example, in optical flow network, end-to-end optical circuits can be established by reserving wavelength channels only when needed. Once the circuit is established, the large data set is seamlessly transferred across the network without requiring IP routers to be involved in the data transfer. For a circuit to be successfully established the following conditions must be simultaneously met: a transmitter must be available at the sender, a receiver must be available at the destination, and a wavelength channel must be available across the network to connect the sender to the destination. Data transfer can start only when the conditions above are simultaneously met. As a result, a request can experience a delay before being established. Network throughput and delay are affected by the availability of network channels (channel-contention) and end-user's receiver (receiver-contention). The contribution of this paper is twofold. First, channel throughput and delay are analytically estimated. Second, the analytical results are validated using simulation results. A number of experiments are conducted using the presented analytical models and simulation platform to investigate the effect of channel and receiver contention on throughput and delay.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124161922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance Comparison of Routing Protocols for Cognitive Radio Networks
Li Sun, Wei Zheng, Naveen Rawat, Vikramsinh Sawant, Dimitrios Koutsonikolas
DOI: 10.1109/MASCOTS.2013.67
Cognitive radio networks (CRNs) have emerged as a promising solution to the ever-growing demand for additional spectrum resources and more efficient spectrum utilization. A large number of routing protocols for CRNs have been proposed recently, each based on different design goals and evaluated in different scenarios under different assumptions. However, little is known about the relative performance of these protocols, let alone the tradeoffs among their different design goals. In this paper, we conduct the first detailed, empirical performance comparison of three representative routing protocols for CRNs under the same realistic set of assumptions. Our extensive simulation study shows that the performance of routing protocols in CRNs is affected by a number of factors in addition to primary user (PU) activity, some of which have been largely ignored by the majority of previous works. We find that different protocols perform well under different scenarios, and we investigate the causes of the observed performance. Furthermore, we present a generic software architecture for the experimental evaluation of CRN routing protocols on a test bed based on the USRP2 platform, and we compare the performance of two protocols on a 6-node test bed. The test bed results confirm the findings of our simulation study.
{"title":"Performance Comparison of Routing Protocols for Cognitive Radio Networks","authors":"Li Sun, Wei Zheng, Naveen Rawat, Vikramsinh Sawant, Dimitrios Koutsonikolas","doi":"10.1109/MASCOTS.2013.67","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.67","url":null,"abstract":"Cognitive radio networks (CRNs) have emerged as a promising solution to the ever-growing demand for additional spectrum resources and more efficient spectrum utilization. A large number of routing protocols for CRNs have been proposed recently, each based on different design goals, and evaluated in different scenarios, under different assumptions. However, little is known about the relative performance of all these protocols, let alone the tradeoffs among their different design goals. In this paper, we conduct the first detailed, empirical performance comparison of three representative routing protocols for CRNs, under the same realistic set of assumptions. Our extensive simulation study shows that the performance of routing protocols in CRNs is affected by a number of factors, in addition to PU activity, some of which have been largely ignored by the majority of previous works. We find that different protocols perform well under different scenarios, and investigate the causes of the observed performance. Furthermore, we present a generic software architecture for the experimental evaluation of CRN routing protocols on a test bed based on the USRP2 platform, and compare the performance of two protocols on a 6 node test bed. The test bed results confirm the findings of our simulation study.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129821277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Fix-and-Relax Model for Heterogeneous LTE-Based Networks
F. Malandrino, C. Casetti, C. Chiasserini
DOI: 10.1109/MASCOTS.2013.41
We envision a next-generation cellular network where base stations allow Internet connectivity through different wireless interfaces (e.g., LTE and WiFi), and licensed cellular frequencies can also be used for device-to-device communications. With this scenario in mind, we develop a model that synthetically and consistently describes the diverse communication opportunities offered by such a network system. Then, we propose a fix-and-relax approach that makes the model solvable in real time. As one of its possible applications, our numerical results show how the model can be effectively used to design and analyze policies for dynamic frequency allocation.
{"title":"A Fix-and-Relax Model for Heterogeneous LTE-Based Networks","authors":"F. Malandrino, C. Casetti, C. Chiasserini","doi":"10.1109/MASCOTS.2013.41","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.41","url":null,"abstract":"We envision a next-generation cellular network, where base stations allow Internet connectivity through different wireless interfaces (e.g., LTE and WiFi), and licensed cellular frequencies can be used also for device-to-device communications. With this scenario in mind, we develop a model that synthetically and consistently describes the diverse communications opportunities offered by the above network system. Then, we propose a fix-and-relax approach that makes the model solvable in real time. As one of its possible applications, our numerical results show how the model can be effectively used to design and analyze policies for dynamic frequency allocation.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115241878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance and Energy Consumption of Lossless Compression/Decompression Utilities on Mobile Computing Platforms
A. Milenković, Armen Dzhagaryan, Martin Burtscher
DOI: 10.1109/MASCOTS.2013.33
Data compression and decompression utilities can be critical for increasing communication throughput, reducing communication latencies, achieving energy-efficient communication, and making effective use of available storage. This paper experimentally evaluates several such utilities at multiple compression levels on systems representative of current mobile platforms. We characterize each utility in terms of its compression ratio, compression and decompression throughput, and energy efficiency. We consider different use cases that are typical for modern mobile environments. We find a wide variety of energy costs associated with data compression and decompression and provide practical guidelines for selecting the most energy-efficient configurations for each use case. The best-performing configurations provide 6-fold and 4-fold improvements in energy efficiency for compressed uploads and downloads over WLAN, respectively, compared to uncompressed data transfers.
{"title":"Performance and Energy Consumption of Lossless Compression/Decompression Utilities on Mobile Computing Platforms","authors":"A. Milenković, Armen Dzhagaryan, Martin Burtscher","doi":"10.1109/MASCOTS.2013.33","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.33","url":null,"abstract":"Data compression and decompression utilities can be critical in increasing communication throughput, reducing communication latencies, achieving energy-efficient communication, and making effective use of available storage. This paper experimentally evaluates several such utilities for multiple compression levels on systems that represent current mobile platforms. We characterize each utility in terms of its compression ratio, compression and decompression through-put, and energy efficiency. We consider different use cases that are typical for modern mobile environments. We find a wide variety of energy costs associated with data compression and decompression and provide practical guidelines for selecting the most energy efficient configurations for each use case. The best performing configurations provide 6-fold and 4-fold improvements in energy efficiency for compressed uploads and downloads over WLAN, respectively, when compared to uncompressed data transfers.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130063975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic Balanced Configuration of Multi-resources in Virtualized Clusters
Yudi Wei, Chengzhong Xu
DOI: 10.1109/MASCOTS.2013.14
Dynamic resource configuration is crucial to the provisioning of service level agreements (SLAs) in cloud computing. Most of today's autonomic resource configuration approaches are designed to scale a single type of resource. A few works are able to partition multiple resources, but mainly to meet throughput requirements. Unlike throughput, however, response time behaves nonlinearly with respect to resources. Therefore, these approaches are hardly applicable to the dynamic sharing of multiple resources for the provisioning of response time guarantees. Moreover, optimizing resource efficiency and utilization is of great significance to IaaS providers. We show theoretically and experimentally that resource optimization lies in a balanced configuration of resources. In this paper, we propose BConf, a framework for the dynamic balanced configuration of multiple resources for the provisioning of response time guarantees in virtualized clusters. BConf employs an integrated model predictive control (MPC) and adaptive proportional-integral (PI) control approach (IMAP). MPC is applied to actively balance multiple resources using a novel resource metric. For performance prediction, a gray-box model is built on generic OS and hardware metrics in addition to resource actuators and performance. We find that resource penalty is an effective metric for measuring the degree of imbalance of a configuration. Using this metric and the model, BConf tunes resources in a balanced way by minimizing the resource penalty while satisfying the response time target. Adaptive PI control coordinates with MPC by narrowing the optimization space to a promising region. Within the BConf framework, resources are also coordinated during contention. Experimental results with mixed TPC-W and TPC-C benchmarks show that, compared with a representative partitioning approach, BConf reduces resource usage by about 50% for TPC-W and 30% for TPC-C, improves stability by more than 35.6%, and has a much shorter settling time. The advantages of BConf in resource coordination are also demonstrated.
{"title":"Dynamic Balanced Configuration of Multi-resources in Virtualized Clusters","authors":"Yudi Wei, Chengzhong Xu","doi":"10.1109/MASCOTS.2013.14","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.14","url":null,"abstract":"Dynamic resource configuration is crucial to the provisioning of service level agreements (SLAs) in cloud computing. Most of today's autonomic resource configuration approaches are designed to scale a single type of resource. A few works are able to partition multiple resources, but mainly to meet the requirement of throughput. Unlike throughput, however, response time behaves nonlinearly with respect to resources. Therefore, these approaches are hardly applicable to dynamic sharing of multi-resources for the provisioning of response time guarantee. Moreover, the optimization of resource efficiency and utilization has great significance to IaaS providers. We show theoretically and experimentally that resource optimization lies in balanced configuration of resources. In this paper, we propose a framework, BConf, for dynamic balanced configuration of multi-resources for the provisioning of response time guarantee in virtualized clusters. BConf employs an integrated MPC (model predictive control) and adaptive PI (proportional integral) control approach (IMAP). MPC is applied to actively balance multiple resources using a novel resource metric. For the performance prediction, a gray-box model is built on generic OS and hardware metrics in addition to resource actuators and performance. We find out that resource penalty is an effective metric to measure the imbalanced degree of a configuration. Using this metric and the model, BConf tunes resources in a balanced way by minimizing the resource penalty while satisfying the response time target. Adaptive PI is used to coordinate with MPC by narrowing the optimization space to a promising region. Within BConf framework, resources are coordinated during contention. Experimental results with mixed TPC-W and TPC-C benchmarks show that BConf reduces resource usages by about 50% and 30% for TPC-W and TPC-C respectively, improves stability by more than 35.6%, and has a much shorter settling time, in comparison with a representative partition approach. The advantages of BConf in resource coordination are also demonstrated.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125558844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian Service Demand Estimation Using Gibbs Sampling
Weikun Wang, G. Casale
DOI: 10.1109/MASCOTS.2013.78
Performance modelling of web applications involves estimating the service demands of requests at physical resources, such as CPUs. In this paper, we propose a service demand estimation algorithm based on a Markov chain Monte Carlo (MCMC) technique, Gibbs sampling. Our methodology is widely applicable, as it requires only queue-length samples at each resource, which are simple to measure. Additionally, since we use a Bayesian approach, our method can exploit prior information on the distribution of parameters, a feature not always available in existing demand estimation approaches. The main challenge of Gibbs sampling is to efficiently evaluate the conditional expression required to sample from the posterior distribution of the demands. This expression is shown to be the equilibrium solution of a multiclass closed queueing network. We define a novel approximation that efficiently obtains the normalising constant, making the cost of its evaluation acceptable for MCMC applications. Experimental evaluation based on simulation data with different model sizes demonstrates the effectiveness of Gibbs sampling for service demand estimation.
{"title":"Bayesian Service Demand Estimation Using Gibbs Sampling","authors":"Weikun Wang, G. Casale","doi":"10.1109/mascots.2013.78","DOIUrl":"https://doi.org/10.1109/mascots.2013.78","url":null,"abstract":"Performance modelling of web applications involves the task of estimating service demands of requests at physical resources, such as CPUs. In this paper, we propose a service demand estimation algorithm based on a Markov Chain Monte Carlo (MCMC) technique, Gibbs sampling. Our methodology is widely applicable as it requires only queue length samples at each resource, which are simple to measure. Additionally, since we use a Bayesian approach, our method can use prior information on the distribution of parameters, a feature not always available with existing demand estimation approaches. The main challenge of Gibbs sampling is to efficiently evaluate the conditional expression required to sample from the posterior distribution of the demands. This expression is shown to be the equilibrium solution of a multiclass closed queueing network. We define a novel approximation to efficiently obtain the normalising constant to make the cost of its evaluation acceptable for MCMC applications. Experimental evaluation based on simulation data with different model sizes demonstrates the effectiveness of Gibbs sampling for service demand estimation.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121735599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SYREN: Synergistic Link Correlation-Aware and Network Coding-Based Dissemination in Wireless Sensor Networks
S. Alam, Salmin Sultana, Y. C. Hu, S. Fahmy
DOI: 10.1109/MASCOTS.2013.70
Rapid flooding is necessary for code updates and routing tree formation in wireless sensor networks. Link correlation-aware collective flooding (CF) is a recently proposed technique that provides a substrate for efficiently disseminating a single packet. Applying CF to multi-packet dissemination poses several challenges, such as reliability degradation, redundant transmissions, and increased contention among node transmissions. The varying link correlation observed in real networks makes the problem harder. In this paper, we propose SYREN, a multi-packet flooding protocol that exploits the synergy between link correlation and network coding. In particular, SYREN exploits link correlation to eliminate the overhead of explicit control packets in networks with high correlation, and uses network coding to pipeline the transmission of multiple packets via a novel, single yet scalable timer per node. SYREN reduces the number of redundant transmissions while achieving near-perfect reliability, especially in networks with low link correlation. Test bed experiments and simulations show that SYREN reduces the average number of transmissions by 30% and dissemination delay by more than 60% while achieving the same reliability as state-of-the-art protocols.
{"title":"SYREN: Synergistic Link Correlation-Aware and Network Coding-Based Dissemination in Wireless Sensor Networks","authors":"S. Alam, Salmin Sultana, Y. C. Hu, S. Fahmy","doi":"10.1109/MASCOTS.2013.70","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.70","url":null,"abstract":"Rapid flooding is necessary for code updates and routing tree formation in wireless sensor networks. Link correlation-aware collective flooding (CF) is a recently proposed technique that provides a substrate for efficiently disseminating a single packet. Applying CF to multiple packet dissemination poses several challenges, such as reliability degradation, redundant transmissions, and increased contention among node transmissions. The varying link correlation observed in real networks makes the problem harder. In this paper, we propose a multi-packet flooding protocol, SYREN, that exploits the synergy among link correlation and network coding. In particular, SYREN exploits link correlation to eliminate the overhead of explicit control packets in networks with high correlation, and uses network coding to pipeline transmission of multiple packets via a novel, single yet scalable timer per node. SYREN reduces the number of redundant transmissions while achieving near-perfect reliability, especially in networks with low link correlation. Test bed experiments and simulations show that SYREN reduces the average number of transmissions by 30% and dissemination delay by more than 60% while achieving the same reliability as state-of-the-art protocols.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131759743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LiU: Hiding Disk Access Latency for HPC Applications with a New SSD-Enabled Data Layout
Dachuan Huang, Xuechen Zhang, Wei Shi, Mai Zheng, Song Jiang, Feng Qin
DOI: 10.1109/MASCOTS.2013.19
Unlike in consumer electronics and personal computing, hard disks in the HPC environment can hardly be replaced by SSDs outright. The reasons include hard disks' large capacity, very low price, and decent peak throughput. However, when latency dominates I/O performance (e.g., when accessing random data), hard disk performance is compromised. If the issue of high latency could be effectively solved, the HPC community could enjoy large, affordable, and fast storage without replacing disks entirely with expensive SSDs. In this paper, we propose an almost latency-free, hard-disk-dominated storage system for HPC called LiU. The key technique is to leverage a limited amount of SSD storage for its low-latency access and to change the data layout in a hybrid storage hierarchy, with low-latency SSD at the top and high-latency hard disk at the bottom. If a segment of data will be randomly accessed, we lift its top part (the head) up in the hierarchy to the SSD and leave the remaining part (the body) untouched on the disk. As a result, the latency of accessing the whole segment can be removed, because the access latency of the body is hidden by the time spent reading the head from the SSD. Combined with the effect of prefetching a large segment, LiU (Lift it Up) can effectively remove disk access latency, so the disk's high peak throughput can be fully exploited for data-intensive HPC applications. We have implemented a prototype of LiU in the PVFS parallel file system and evaluated it with representative MPI-IO micro-benchmarks, including mpi-io-test, mpi-tile-io, and ior-mpi-io, and one macro-benchmark, BTIO. Our experimental results show that LiU can effectively improve I/O performance for HPC applications, with throughput improvements of up to 5.8x. Furthermore, LiU brings even larger benefits to sequential-I/O MPI applications when they are interfered with by other workloads. For example, LiU improves the I/O throughput of mpi-io-test under interference by 1.1-3.4 times, while improving the same workload without interference by 15%.
{"title":"LiU: Hiding Disk Access Latency for HPC Applications with a New SSD-Enabled Data Layout","authors":"Dachuan Huang, Xuechen Zhang, Wei Shi, Mai Zheng, Song Jiang, Feng Qin","doi":"10.1109/MASCOTS.2013.19","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.19","url":null,"abstract":"Unlike in the consumer electronics and personal computing areas, in the HPC environment hard disks can hardly be replaced by SSDs. The reasons include hard disk's large capacity, very low price, and decent peak throughput. However, when latency dominates the I/O performance (e.g., when accessing random data), the hard disk's performance can be compromised. If the issue of high latency could be effectively solved, the HPC community would enjoy a large, affordable and fast storage without having to replace disks completely with expensive SSDs. In this paper, we propose an almost latency-free hard-disk dominated storage system called LiU for HPC. The key technique is leveraging limited amount of SSD storage for its low-latency access, and changing data layout in a hybrid storage hierarchy with low-latency SSD at the top and high-latency hard disk at the bottom. If a segment of data would be randomly accessed, we lift its top part (the head) up in the hierarchy to the SSD and leave the remaining part (the body) untouched on the disk. As a result, the latency of accessing this whole segment can be removed because access latency of the body can be hidden by the access time of the head on the SSD. Combined with the effect of prefetching a large segment, LiU (Lift it Up) can effectively remove disk access latency so disk's high peak throughput can now be fully exploited for data-intensive HPC applications. We have implemented a prototype of LiU in the PVFS parallel file system and evaluated it with representative MPI-IO micro benchmarks, including MPI-IO-test, mpi-tile-io, and ior-mpi-io, and one macro-benchmark BTIO. Our experimental results show that LiU can effectively improve the I/O performance for HPC applications, with the throughput improvement ratio up to 5.8. Furthermore, LiU can bring much more benefits to sequential-I/O MPI applications when the applications are interfered by other workloads. For example, LiU improves the I/O throughput of mpi-io-test, which is under interference, by 1.1-3.4 times, while improving the same workload without interference by 15%.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131928554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}