Title: Sustainability of service provisioning systems under attack
Authors: G. Paschos, L. Tassiulas
Published: Measurement and Modeling of Computer Systems, 2013-06-17. DOI: 10.1145/2465529.2465747
Abstract: We propose a resource allocation model that captures the interaction between legitimate users of a distributed service provisioning system and malicious intruders attempting to disrupt its operation. The system consists of a bank of servers providing service to incoming requests. Malicious intruders generate fake traffic to the servers in an attempt to degrade service provisioning. Legitimate traffic may be balanced using available mechanisms in order to mitigate the damage from the attack. We characterize the guaranteed region, i.e., the set of legitimate traffic intensities that are sustainable given specific intensities of fake traffic, under the assumption that the fake traffic is routed using static policies. This assumption will be relaxed, allowing arbitrary routing policies, in the full version of this work.
Title: Reuse-based online models for caches
Authors: Rathijit Sen, D. Wood
Published: Measurement and Modeling of Computer Systems, 2013-06-17. DOI: 10.1145/2465529.2465756
Abstract: We develop a reuse-distance/stack-distance-based analytical modeling framework for efficient, online prediction of cache performance across a range of cache configurations and replacement policies (LRU, PLRU, RANDOM, and NMRU). Our framework unifies existing cache miss rate prediction techniques such as Smith's associativity model, Poisson variants, and hardware way-counter based schemes. We also show how to adapt LRU way-counters to work when the number of sets in the cache changes. As an example application, we demonstrate how results from our models can be used to select, based on workload access characteristics, last-level cache configurations that aim to minimize energy-delay product.
Title: Understanding architectural characteristics of multimedia retrieval workloads
Authors: Chen Dai, Chao Lv, Jiaxin Li, Weihua Zhang, Binyu Zang (Parallel Process Institute, Fudan University, Shanghai, China; daichen@fudan.edu.cn, lch@fudan.edu.cn, lijiaxin@fudan.edu.cn, zhangweihua@fudan.edu.cn, byzang@fudan.edu.cn)
Published: Measurement and Modeling of Computer Systems, 2013-06-17. DOI: 10.1145/2465529.2465541

Title: Web performance bottlenecks in broadband access networks
Authors: S. Sundaresan, Nazanin Magharei, N. Feamster, R. Teixeira, Sam Crawford
Published: Measurement and Modeling of Computer Systems, 2013-06-17. DOI: 10.1145/2465529.2465745
Abstract: We present the first large-scale analysis of Web performance bottlenecks as measured from broadband access networks, using data collected from extensive home router deployments. We analyze the extent to which higher throughput improves Web performance and identify the contribution of critical factors such as DNS lookups and TCP connection establishment to Web page load times. We find that, as broadband speeds continue to increase, other factors such as TCP connection setup time, server response time, and network latency are often the dominant performance bottlenecks. Thus, realizing a "faster Web" requires not only higher download throughput, but also optimizations that reduce both client- and server-side latency.
Title: Quantifying the benefits of joint content and network routing
Authors: Vytautas Valancius, Bharath Ravi, N. Feamster, A. Snoeren
Published: Measurement and Modeling of Computer Systems, 2013-06-17. DOI: 10.1145/2465529.2465762
Abstract: Online service providers aim to provide good performance for an increasingly diverse set of applications and services. One of the most effective ways to improve service performance is to replicate the service closer to the end users. Replication alone, however, has its limits: while operators can replicate static content, wide-scale replication of dynamic content is not always feasible or cost effective. To improve the latency of such services, many operators turn to Internet traffic engineering. In this paper, we study the benefits of performing replica-to-end-user mappings in conjunction with active Internet traffic engineering. We present the design of PECAN, a system that controls both the selection of replicas ("content routing") and the routes between the clients and their associated replicas ("network routing"). We emulate a replicated service that can perform both content and network routing by deploying PECAN on a distributed testbed. In our testbed, we see that jointly performing content and network routing can reduce round-trip latency by 4.3% on average over performing content routing alone (potentially reducing service response times by tens of milliseconds or more) and that most of these gains can be realized with no more than five alternate routes at each replica.
Title: Parallel scaling properties from a basic block view
Authors: Melanie Kambadur, K. Tang, Joshua Lopez, Martha A. Kim
Published: Measurement and Modeling of Computer Systems, 2013-06-17. DOI: 10.1145/2465529.2465748
Abstract: As software scalability lags behind hardware parallelism, understanding scaling behavior is more important than ever. This paper demonstrates how to use Parallel Block Vector (PBV) profiles to measure the scaling properties of multithreaded programs from a new perspective: the basic block's view. Through this lens, we guide users through quick and simple methods to produce high-resolution application scaling analyses. This method requires no manual program modification, new hardware, or lengthy simulations, and captures the impact of architecture, operating systems, threading models, and inputs. We apply these techniques to a set of parallel benchmarks, and, as an example, demonstrate that when it comes to scaling, functions in an application do not behave monolithically.
Title: An empirical analysis of intra- and inter-datacenter network failures for geo-distributed services
Authors: Rahul Potharaju, Navendu Jain
Published: Measurement and Modeling of Computer Systems, 2013-06-17. DOI: 10.1145/2465529.2465749
Abstract: As cloud services continue to grow, a key requirement is delivering an 'always-on' experience to end users. Of the several factors affecting service availability, network failures in the hosting datacenters have received little attention. This paper presents a preliminary analysis of intra-datacenter and inter-datacenter network failures from a service perspective. We describe an empirical study analyzing and correlating network failure events over a year across multiple datacenters in a service provider. Our broader goal is to outline steps leveraging existing network mechanisms to improve end-to-end service availability.
Title: The fundamentals of heavy-tails: properties, emergence, and identification
Authors: J. Nair, A. Wierman, B. Zwart
Published: Measurement and Modeling of Computer Systems, 2013-06-17. DOI: 10.1145/2465529.2466587
Abstract: Heavy-tails are a continual source of excitement and confusion across disciplines as they are repeatedly "discovered" in new contexts. This is especially true within computer systems, where heavy-tails seemingly pop up everywhere -- from degree distributions in the Internet and social networks to file sizes and interarrival times of workloads. However, despite nearly a decade of work on heavy-tails, they are still treated as mysterious, surprising, and even controversial.

The goal of this tutorial is to show that heavy-tailed distributions need not be mysterious and should not be surprising or controversial. In particular, we will demystify heavy-tailed distributions by showing how to reason formally about their counter-intuitive properties; we will highlight that their emergence should be expected (not surprising) by showing that a wide variety of general processes lead to heavy-tailed distributions; and we will highlight that most of the controversy surrounding heavy-tails is the result of bad statistics, and can be avoided by using the proper tools.
Title: Profiling and analyzing the I/O performance of NoSQL DBs
Authors: J. Schindler
Published: Measurement and Modeling of Computer Systems, 2013-06-17. DOI: 10.1145/2465529.2479782
Abstract: The advent of the so-called NoSQL databases has brought about a new model of using storage systems. While traditional relational database systems took advantage of features offered by centrally-managed, enterprise-class storage arrays, the new generation of database systems with weaker data consistency models is content with using and managing locally attached individual storage devices and providing data reliability and availability through high-level software features and protocols. This tutorial aims to review the architecture of selected NoSQL DBs to lay the foundations for understanding how these new DB systems behave. In particular, it focuses on how (in)efficiently these new systems use I/O and other resources to accomplish their work. The tutorial examines the behavior of several NoSQL DBs with an emphasis on Cassandra, a popular NoSQL DB system. It uses I/O traces and resource utilization profiles captured in private cloud deployments that use both dedicated directly attached storage as well as shared networked storage.
Title: A first look at cellular network performance during crowded events
Authors: M. Shafiq, Lusheng Ji, A. Liu, Jeffrey Pang, Shobha Venkataraman, Jia Wang
Published: Measurement and Modeling of Computer Systems, 2013-06-17. DOI: 10.1145/2465529.2465754
Abstract: During crowded events, cellular networks face voice and data traffic volumes that are often orders of magnitude higher than what they face during routine days. Despite the use of portable base stations for temporarily increasing communication capacity and free Wi-Fi access points for offloading Internet traffic from cellular base stations, crowded events still present significant challenges for cellular network operators looking to reduce dropped call events and improve Internet speeds. For effective cellular network design, management, and optimization, it is crucial to understand how cellular network performance degrades during crowded events, what causes this degradation, and how practical mitigation schemes would perform in real-life crowded events. This paper takes a first step towards this end by characterizing the operational performance of a tier-1 cellular network in the United States during two high-profile crowded events in 2012. We illustrate how the changes in population distribution, user behavior, and application workload during crowded events result in significant voice and data performance degradation, including a more than two orders of magnitude increase in connection failures. Our findings suggest two mechanisms that can improve performance without resorting to costly infrastructure changes: radio resource allocation tuning and opportunistic connection sharing. Using trace-driven simulations, we show that more aggressive release of radio resources, via RRC timeouts 1-2 seconds shorter than on routine days, achieves a better tradeoff between wasted radio resources, energy consumption, and delay during crowded events; and that opportunistic connection sharing can reduce connection failures by 95% when employed by a small number of devices in each cell sector.