Willis Lang, Frank Bertsch, D. DeWitt, Nigel Ellis
Microsoft operates the Azure SQL Database (ASD) cloud service, one of the dominant relational cloud database services in the market today. To aid the academic community in their research on designing and efficiently operating cloud database services, Microsoft is introducing the release of production-level telemetry traces from the ASD service. This telemetry data set provides, over a wide set of important hardware resources and counters, the consumption level of each customer database replica. The first release will be a multi-month time-series data set that includes the full cluster traces from two different ASD global regions.
{"title":"Microsoft azure SQL database telemetry","authors":"Willis Lang, Frank Bertsch, D. DeWitt, Nigel Ellis","doi":"10.1145/2806777.2806845","DOIUrl":"https://doi.org/10.1145/2806777.2806845","url":null,"abstract":"Microsoft operates the Azure SQL Database (ASD) cloud service, one of the dominant relational cloud database services in the market today. To aid the academic community in their research on designing and efficiently operating cloud database services, Microsoft is introducing the release of production-level telemetry traces from the ASD service. This telemetry data set provides, over a wide set of important hardware resources and counters, the consumption level of each customer database replica. The first release will be a multi-month time-series data set that includes the full cluster traces from two different ASD global regions.","PeriodicalId":275158,"journal":{"name":"Proceedings of the Sixth ACM Symposium on Cloud Computing","volume":"398 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116691015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ding Li, James W. Mickens, Suman Nath, Lenin Ravindranath
In a modern web application, a single high-level action like a mouse click triggers a flurry of asynchronous events on the client browser and remote web servers. We introduce Domino, a new tool which automatically captures and analyzes end-to-end, asynchronous causal relationship of events that span clients and servers. Using Domino, we found uncharacteristically long event chains in Bing Maps, discovered data races in the WinJS implementation of promises, and developed a new server-side scheduling algorithm for reducing the tail latency of server responses.
{"title":"Domino: understanding wide-area, asynchronous event causality in web applications","authors":"Ding Li, James W. Mickens, Suman Nath, Lenin Ravindranath","doi":"10.1145/2806777.2806940","DOIUrl":"https://doi.org/10.1145/2806777.2806940","url":null,"abstract":"In a modern web application, a single high-level action like a mouse click triggers a flurry of asynchronous events on the client browser and remote web servers. We introduce Domino, a new tool which automatically captures and analyzes end-to-end, asynchronous causal relationship of events that span clients and servers. Using Domino, we found uncharacteristically long event chains in Bing Maps, discovered data races in the WinJS implementation of promises, and developed a new server-side scheduling algorithm for reducing the tail latency of server responses.","PeriodicalId":275158,"journal":{"name":"Proceedings of the Sixth ACM Symposium on Cloud Computing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132015403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sergey Grizan, David Chu, A. Wolman, Roger Wattenhofer
In cloud gaming, servers perform remote rendering on behalf of thin clients. Such a server must deliver sufficient frame rate (at least 30fps) to each of its clients. At the same time, each client desires an immersive experience, and therefore the server should also provide the best graphics quality possible to each client. Statically provisioning time slices of the server GPU for each client suffers from severe underutilization because clients can come and go, and scenes that the clients need rendered can vary greatly in terms of GPU resource usage over time. In this work, we present dJay, a utility-maximizing cloud gaming server that dynamically tunes client GPU rendering workloads in order to 1) ensure all clients get satisfactory frame rate, and 2) provide the best possible graphics quality across clients. To accomplish this, we develop three main components. First, we build an online profiler that collects key cost and benefit data, and distills the data into a reusable regression model. Second, we build an online utility optimizer that uses the regression model to tune GPU workloads for better graphics quality. The optimizer solves the Multiple Choice Knapsack problem. We demonstrate dJay on two high quality commercial games, Doom 3 and Fable 3. Our results show that when compared to a static configuration, we can respond much better to peaks and troughs, achieving up to four times the multi-tenant density on a single server while offering clients the best possible graphics quality.
{"title":"dJay: enabling high-density multi-tenancy for cloud gaming servers with dynamic cost-benefit GPU load balancing","authors":"Sergey Grizan, David Chu, A. Wolman, Roger Wattenhofer","doi":"10.1145/2806777.2806942","DOIUrl":"https://doi.org/10.1145/2806777.2806942","url":null,"abstract":"In cloud gaming, servers perform remote rendering on behalf of thin clients. Such a server must deliver sufficient frame rate (at least 30fps) to each of its clients. At the same time, each client desires an immersive experience, and therefore the server should also provide the best graphics quality possible to each client. Statically provisioning time slices of the server GPU for each client suffers from severe underutilization because clients can come and go, and scenes that the clients need rendered can vary greatly in terms of GPU resource usage over time. In this work, we present dJay, a utility-maximizing cloud gaming server that dynamically tunes client GPU rendering workloads in order to 1) ensure all clients get satisfactory frame rate, and 2) provide the best possible graphics quality across clients. To accomplish this, we develop three main components. First, we build an online profiler that collects key cost and benefit data, and distills the data into a reusable regression model. Second, we build an online utility optimizer that uses the regression model to tune GPU workloads for better graphics quality. The optimizer solves the Multiple Choice Knapsack problem. We demonstrate dJay on two high quality commercial games, Doom 3 and Fable 3. Our results show that when compared to a static configuration, we can respond much better to peaks and troughs, achieving up to four times the multi-tenant density on a single server while offering clients the best possible graphics quality.","PeriodicalId":275158,"journal":{"name":"Proceedings of the Sixth ACM Symposium on Cloud Computing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117192255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mayank Pundir, Luke M. Leslie, Indranil Gupta, R. Campbell
Distributed graph processing systems largely rely on proactive techniques for failure recovery. Unfortunately, these approaches (such as checkpointing) entail a significant overhead. In this paper, we argue that distributed graph processing systems should instead use a reactive approach to failure recovery. The reactive approach trades off completeness of the result (generating a slightly inaccurate result) while reducing the overhead during failure-free execution to zero. We build a system called Zorro that imbues this reactive approach, and integrate Zorro into two graph processing systems -- PowerGraph and LFGraph. When a failure occurs, Zorro opportunistically exploits vertex replication inherent in today's graph processing systems to quickly rebuild the state of failed servers. Experiments using real-world graphs demonstrate that Zorro is able to recover over 99% of the graph state when 6--12% of the servers fail, and between 87--95% when half the cluster fails. Furthermore, using various graph processing algorithms, Zorro incurs little to no accuracy loss in all experimental failure scenarios, and achieves a worst-case accuracy of 97%.
{"title":"Zorro: zero-cost reactive failure recovery in distributed graph processing","authors":"Mayank Pundir, Luke M. Leslie, Indranil Gupta, R. Campbell","doi":"10.1145/2806777.2806934","DOIUrl":"https://doi.org/10.1145/2806777.2806934","url":null,"abstract":"Distributed graph processing systems largely rely on proactive techniques for failure recovery. Unfortunately, these approaches (such as checkpointing) entail a significant overhead. In this paper, we argue that distributed graph processing systems should instead use a reactive approach to failure recovery. The reactive approach trades off completeness of the result (generating a slightly inaccurate result) while reducing the overhead during failure-free execution to zero. We build a system called Zorro that imbues this reactive approach, and integrate Zorro into two graph processing systems -- PowerGraph and LFGraph. When a failure occurs, Zorro opportunistically exploits vertex replication inherent in today's graph processing systems to quickly rebuild the state of failed servers. Experiments using real-world graphs demonstrate that Zorro is able to recover over 99% of the graph state when 6--12% of the servers fail, and between 87--95% when half the cluster fails. Furthermore, using various graph processing algorithms, Zorro incurs little to no accuracy loss in all experimental failure scenarios, and achieves a worst-case accuracy of 97%.","PeriodicalId":275158,"journal":{"name":"Proceedings of the Sixth ACM Symposium on Cloud Computing","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117237552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cloud computing is a successful model for hosting web-facing applications that are accessed by their users as services. While clouds currently offer Service Level Agreements (SLAs) containing guarantees of availability, they do not make performance guarantees for deployed applications. In this work we present Cerebro -- a system for establishing statistical guarantees of application response time in cloud settings. Cerebro combines off-line static analysis of application control structure with on-line cloud performance monitoring and statistical forecasting to predict bounds on the response time of web-facing application programming interfaces (APIs). Because Cerebro does not require application instrumentation or per-application cloud benchmarking, it does not impose any runtime overhead, and is suitable for use at cloud scales. Also, because the bounds are statistical, they are appropriate for use as the basis for SLAs between cloud-hosted applications and their users. We investigate the correctness of Cerebro predictions, the tightness of their bounds, and the duration over which the bounds persist in both Google App Engine and AppScale (public and private cloud platforms respectively). We also detail the effectiveness of our SLA prediction methodology compared to other performance bound estimation methods based on simple statistical analysis.
{"title":"Response time service level agreements for cloud-hosted web applications","authors":"Hiranya Jayathilaka, C. Krintz, R. Wolski","doi":"10.1145/2806777.2806842","DOIUrl":"https://doi.org/10.1145/2806777.2806842","url":null,"abstract":"Cloud computing is a successful model for hosting web-facing applications that are accessed by their users as services. While clouds currently offer Service Level Agreements (SLAs) containing guarantees of availability, they do not make performance guarantees for deployed applications. In this work we present Cerebro -- a system for establishing statistical guarantees of application response time in cloud settings. Cerebro combines off-line static analysis of application control structure with on-line cloud performance monitoring and statistical forecasting to predict bounds on the response time of web-facing application programming interfaces (APIs). Because Cerebro does not require application instrumentation or per-application cloud benchmarking, it does not impose any runtime overhead, and is suitable for use at cloud scales. Also, because the bounds are statistical, they are appropriate for use as the basis for SLAs between cloud-hosted applications and their users. We investigate the correctness of Cerebro predictions, the tightness of their bounds, and the duration over which the bounds persist in both Google App Engine and AppScale (public and private cloud platforms respectively). We also detail the effectiveness of our SLA prediction methodology compared to other performance bound estimation methods based on simple statistical analysis.","PeriodicalId":275158,"journal":{"name":"Proceedings of the Sixth ACM Symposium on Cloud Computing","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131672403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Richard Li, Dallin Abendroth, Xing Lin, Yuankai Guo, H. Baek, E. Eide, R. Ricci, J. Merwe
Penetration testing---the process of probing a deployed system for security vulnerabilities---involves a fundamental tension. If one tests a production system, there is a real danger of collateral damage; this is particularly true for systems hosted in the cloud due to the presence of other tenants. If one tests against a separate system brought up to model the live one, the dynamic state of the production system is not captured, and the value of the test is reduced. This paper presents Potassium, which provides penetration testing as a service (PTaaS) and resolves this tension for system owners, penetration testers, and cloud providers. Potassium uses techniques originally developed for live migration of virtual machines to clone them instead, capturing their full disk, memory, and network state. Potassium isolates the cloned system from the rest of the cloud, providing confidence that side effects of the penetration test will not harm other tenants. The penetration tester effectively owns the cloned system, allowing testing to be more thorough, efficient, and automatable. Experiments with our Potassium prototype show that PTaaS can detect real-world vulnerabilities while having minimal impact on cloud-based production systems.
{"title":"Potassium: penetration testing as a service","authors":"Richard Li, Dallin Abendroth, Xing Lin, Yuankai Guo, H. Baek, E. Eide, R. Ricci, J. Merwe","doi":"10.1145/2806777.2806935","DOIUrl":"https://doi.org/10.1145/2806777.2806935","url":null,"abstract":"Penetration testing---the process of probing a deployed system for security vulnerabilities---involves a fundamental tension. If one tests a production system, there is a real danger of collateral damage; this is particularly true for systems hosted in the cloud due to the presence of other tenants. If one tests against a separate system brought up to model the live one, the dynamic state of the production system is not captured, and the value of the test is reduced. This paper presents Potassium, which provides penetration testing as a service (PTaaS) and resolves this tension for system owners, penetration testers, and cloud providers. Potassium uses techniques originally developed for live migration of virtual machines to clone them instead, capturing their full disk, memory, and network state. Potassium isolates the cloned system from the rest of the cloud, providing confidence that side effects of the penetration test will not harm other tenants. The penetration tester effectively owns the cloned system, allowing testing to be more thorough, efficient, and automatable. Experiments with our Potassium prototype show that PTaaS can detect real-world vulnerabilities while having minimal impact on cloud-based production systems.","PeriodicalId":275158,"journal":{"name":"Proceedings of the Sixth ACM Symposium on Cloud Computing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115736524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Putting computers into low power mode (e.g., suspend-to-RAM) could potentially save significant amount of power when the computers are not in use. Unfortunately, this is often infeasible in practice because data stored on the computers (i.e., directly attached disks, DAS) might need to be accessed by others. Separating storage from computation by attaching storage on the network (e.g., NAS and SAN) could potentially solve this problem, at the cost of lower performance, more network congestion, increased peak power consumption, and higher equipment cost. Though DAS does not suffer these problems, it is not flexible for power saving. In this paper, we present DSwitch, an architecture that, depending on the workload, allows a disk to be attached either directly or through network. We design flexible workload migration based on DSwitch, and show that a wide variety of applications in both data center and home/office settings can be well supported. The experiments demonstrate that our prototype DSwitch achieves a power savings of 91.9% to 97.5% when a disk is in low power network attached mode, while incurring no performance degradation and minimal power overhead when it is in high performance directly attached mode.
{"title":"DSwitch: a dual mode direct and network attached disk","authors":"Quanlu Zhang, Yafei Dai, Lintao Zhang","doi":"10.1145/2806777.2806850","DOIUrl":"https://doi.org/10.1145/2806777.2806850","url":null,"abstract":"Putting computers into low power mode (e.g., suspend-to-RAM) could potentially save significant amount of power when the computers are not in use. Unfortunately, this is often infeasible in practice because data stored on the computers (i.e., directly attached disks, DAS) might need to be accessed by others. Separating storage from computation by attaching storage on the network (e.g., NAS and SAN) could potentially solve this problem, at the cost of lower performance, more network congestion, increased peak power consumption, and higher equipment cost. Though DAS does not suffer these problems, it is not flexible for power saving. In this paper, we present DSwitch, an architecture that, depending on the workload, allows a disk to be attached either directly or through network. We design flexible workload migration based on DSwitch, and show that a wide variety of applications in both data center and home/office settings can be well supported. The experiments demonstrate that our prototype DSwitch achieves a power savings of 91.9% to 97.5% when a disk is in low power network attached mode, while incurring no performance degradation and minimal power overhead when it is in high performance directly attached mode.","PeriodicalId":275158,"journal":{"name":"Proceedings of the Sixth ACM Symposium on Cloud Computing","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116888432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Although many models exist to predict the time taken to migrate a virtual machine from one physical machine to another, our empirical validation of these models has shown the 90th percentile error to be 46% (43 secs) and 159% (112 secs) for KVM and Xen live migration, respectively. Our analysis reveals that these models are fundamentally flawed as they all fail to take into account the following three critical parameters: (i) the writable working set size, (ii) the number of pages eligible for the skip technique, (iii) the relation of the number of skipped pages with the page dirty rate and the page transfer rate, and incorrectly model the key parameter---the number of new pages dirtied per unit time. In this paper, we propose a novel model that takes all these parameters into account. We present a thorough validation with 53 workloads and show that the 90th percentile error in the estimated migration times is only 12% (8 secs) and 19% (14 secs) for KVM and Xen live migration, respectively.
{"title":"Towards a comprehensive performance model of virtual machine live migration","authors":"Senthil Nathan, U. Bellur, Purushottam Kulkarni","doi":"10.1145/2806777.2806838","DOIUrl":"https://doi.org/10.1145/2806777.2806838","url":null,"abstract":"Although many models exist to predict the time taken to migrate a virtual machine from one physical machine to another, our empirical validation of these models has shown the 90th percentile error to be 46% (43 secs) and 159% (112 secs) for KVM and Xen live migration, respectively. Our analysis reveals that these models are fundamentally flawed as they all fail to take into account the following three critical parameters: (i) the writable working set size, (ii) the number of pages eligible for the skip technique, (iii) the relation of the number of skipped pages with the page dirty rate and the page transfer rate, and incorrectly model the key parameter---the number of new pages dirtied per unit time. In this paper, we propose a novel model that takes all these parameters into account. We present a thorough validation with 53 workloads and show that the 90th percentile error in the estimated migration times is only 12% (8 secs) and 19% (14 secs) for KVM and Xen live migration, respectively.","PeriodicalId":275158,"journal":{"name":"Proceedings of the Sixth ACM Symposium on Cloud Computing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126907786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With growing data volumes generated and stored across geo-distributed datacenters, it is becoming increasingly inefficient to aggregate all data required for computation at a single datacenter. Instead, a recent trend is to distribute computation to take advantage of data locality, thus reducing the resource (e.g., bandwidth) costs while improving performance. In this trend, new challenges are emerging in job scheduling, which requires coordination among the datacenters as each job runs across geo-distributed sites. In this paper, we propose novel job scheduling algorithms that coordinate job scheduling across datacenters with low overhead, while achieving near-optimal performance. Our extensive simulation study with realistic job traces shows that the proposed scheduling algorithms result in up to 50% improvement in average job completion time over the Shortest Remaining Processing Time (SRPT) based approaches.
{"title":"Scheduling jobs across geo-distributed datacenters","authors":"Chien-Chun Hung, L. Golubchik, Minlan Yu","doi":"10.1145/2806777.2806780","DOIUrl":"https://doi.org/10.1145/2806777.2806780","url":null,"abstract":"With growing data volumes generated and stored across geo-distributed datacenters, it is becoming increasingly inefficient to aggregate all data required for computation at a single datacenter. Instead, a recent trend is to distribute computation to take advantage of data locality, thus reducing the resource (e.g., bandwidth) costs while improving performance. In this trend, new challenges are emerging in job scheduling, which requires coordination among the datacenters as each job runs across geo-distributed sites. In this paper, we propose novel job scheduling algorithms that coordinate job scheduling across datacenters with low overhead, while achieving near-optimal performance. Our extensive simulation study with realistic job traces shows that the proposed scheduling algorithms result in up to 50% improvement in average job completion time over the Shortest Remaining Processing Time (SRPT) based approaches.","PeriodicalId":275158,"journal":{"name":"Proceedings of the Sixth ACM Symposium on Cloud Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128394769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Prekas, Mia Primorac, A. Belay, C. Kozyrakis, Edouard Bugnion
Energy proportionality and workload consolidation are important objectives towards increasing efficiency in large-scale datacenters. Our work focuses on achieving these goals in the presence of applications with μs-scale tail latency requirements. Such applications represent a growing subset of datacenter workloads and are typically deployed on dedicated servers, which is the simplest way to ensure low tail latency across all loads. Unfortunately, it also leads to low energy efficiency and low resource utilization during the frequent periods of medium or low load. We present the OS mechanisms and dynamic control needed to adjust core allocation and voltage/frequency settings based on the measured delays for latency-critical workloads. This allows for energy proportionality and frees the maximum amount of resources per server for other background applications, while respecting service-level objectives. Monitoring hardware queue depths allows us to detect increases in queuing latencies. Carefully coordinated adjustments to the NIC's packet redirection table enable us to reassign flow groups between the threads of a latency-critical application in milliseconds without dropping or reordering packets. We compare the efficiency of our solution to the Pareto-optimal frontier of 224 distinct static configurations. Dynamic resource control saves 44%--54% of processor energy, which corresponds to 85%--93% of the Pareto-optimal upper bound. Dynamic resource control also allows background jobs to run at 32%--46% of their standalone throughput, which corresponds to 82%--92% of the Pareto bound.
{"title":"Energy proportionality and workload consolidation for latency-critical applications","authors":"G. Prekas, Mia Primorac, A. Belay, C. Kozyrakis, Edouard Bugnion","doi":"10.1145/2806777.2806848","DOIUrl":"https://doi.org/10.1145/2806777.2806848","url":null,"abstract":"Energy proportionality and workload consolidation are important objectives towards increasing efficiency in large-scale datacenters. Our work focuses on achieving these goals in the presence of applications with μs-scale tail latency requirements. Such applications represent a growing subset of datacenter workloads and are typically deployed on dedicated servers, which is the simplest way to ensure low tail latency across all loads. Unfortunately, it also leads to low energy efficiency and low resource utilization during the frequent periods of medium or low load. We present the OS mechanisms and dynamic control needed to adjust core allocation and voltage/frequency settings based on the measured delays for latency-critical workloads. This allows for energy proportionality and frees the maximum amount of resources per server for other background applications, while respecting service-level objectives. Monitoring hardware queue depths allows us to detect increases in queuing latencies. Carefully coordinated adjustments to the NIC's packet redirection table enable us to reassign flow groups between the threads of a latency-critical application in milliseconds without dropping or reordering packets. We compare the efficiency of our solution to the Pareto-optimal frontier of 224 distinct static configurations. Dynamic resource control saves 44%--54% of processor energy, which corresponds to 85%--93% of the Pareto-optimal upper bound. Dynamic resource control also allows background jobs to run at 32%--46% of their standalone throughput, which corresponds to 82%--92% of the Pareto bound.","PeriodicalId":275158,"journal":{"name":"Proceedings of the Sixth ACM Symposium on Cloud Computing","volume":"214 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115739050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}