M. A. Islam, Luting Yang, K. Ranganath, Shaolei Ren
The common practice of power infrastructure oversubscription in data centers exposes dangerous vulnerabilities to well-timed power attacks (i.e., maliciously timed power loads), possibly creating outages and resulting in multimillion-dollar losses. In this paper, we focus on the emerging threat of power attacks in a multi-tenant data center, where a malicious tenant (i.e., attacker) aims at compromising the data center availability by launching power attacks and overloading the power capacity. We discover a novel acoustic side channel resulting from servers' cooling fan noise, which can help the attacker time power attacks at the moments when benign tenants' power usage is high. Concretely, we exploit the acoustic side channel by: (1) employing a high-pass filter to filter out the air conditioner's noise; (2) applying non-negative matrix factorization with sparsity constraint to demix the received aggregate noise and detect periods of high power usage by benign tenants; and (3) designing a state machine to guide power attacks. We run experiments in a practical data center environment as well as simulation studies, and demonstrate that the acoustic side channel can assist the attacker with detecting more than 50% of all attack opportunities, representing state-of-the-art timing accuracy.
{"title":"Why Some Like It Loud: Timing Power Attacks in Multi-tenant Data Centers Using an Acoustic Side Channel","authors":"M. A. Islam, Luting Yang, K. Ranganath, Shaolei Ren","doi":"10.1145/3219617.3219645","DOIUrl":"https://doi.org/10.1145/3219617.3219645","url":null,"abstract":"The common practice of power infrastructure oversubscription in data centers exposes dangerous vulnerabilities to well-timed power attacks (i.e., maliciously timed power loads), possibly creating outages and resulting in multimillion-dollar losses. In this paper, we focus on the emerging threat of power attacks in a multi-tenant data center, where a malicious tenant (i.e., attacker) aims at compromising the data center availability by launching power attacks and overloading the power capacity. We discover a novel acoustic side channel resulting from servers' cooling fan noise, which can help the attacker time power attacks at the moments when benign tenants' power usage is high. Concretely, we exploit the acoustic side channel by: (1) employing a high-pass filter to filter out the air conditioner's noise; (2) applying non-negative matrix factorization with sparsity constraint to demix the received aggregate noise and detect periods of high power usage by benign tenants; and (3) designing a state machine to guide power attacks. 
We run experiments in a practical data center environment as well as simulation studies, and demonstrate that the acoustic side channel can assist the attacker with detecting more than 50% of all attack opportunities, representing state-of-the-art timing accuracy.","PeriodicalId":210440,"journal":{"name":"Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124062423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
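Step (1) of the acoustic side-channel abstract above, high-pass filtering to suppress low-frequency air-conditioner noise, can be illustrated with a minimal sketch. The abstract does not give the filter design, so the first-order filter, the 500 Hz cutoff, the 16 kHz sample rate, and the two test tones (50 Hz standing in for air-conditioner hum, 2000 Hz for fan noise) are all assumptions chosen for illustration only.

```python
import math


def high_pass(samples, sample_rate_hz, cutoff_hz):
    """First-order (RC-style) high-pass filter over a list of float samples."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = [0.0]  # start at rest, so a constant input yields zero output
    for i in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


def rms(samples):
    """Root-mean-square amplitude of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(freq_hz, sample_rate_hz, n):
    """n samples of a unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]


# Hypothetical values: one second of audio at 16 kHz, cutoff at 500 Hz.
RATE = 16000
hum = tone(50, RATE, RATE)    # stand-in for low-frequency air-conditioner hum
fan = tone(2000, RATE, RATE)  # stand-in for higher-frequency fan noise

hum_gain = rms(high_pass(hum, RATE, 500)) / rms(hum)
fan_gain = rms(high_pass(fan, RATE, 500)) / rms(fan)
print(f"50 Hz gain: {hum_gain:.2f}, 2000 Hz gain: {fan_gain:.2f}")
```

Because the filter is linear, filtering the two tones separately shows how each component of a mixed recording would be scaled: the low tone is strongly attenuated while the high tone passes largely intact.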
L. Yang, Lei Deng, M. Hajiesmaili, Cheng Tan, W. Wong
In many online learning paradigms, convexity plays a central role in the derivation and analysis of online learning algorithms. The results, however, do not extend to non-convex settings, which many recent applications require. The Online Non-Convex Learning problem generalizes the classic Online Convex Optimization framework by relaxing the convexity assumption on the cost function (to a Lipschitz continuous function) and the decision set. The state-of-the-art result for the Online Non-Convex Learning problem demonstrates that the classic Hedge algorithm attains a sublinear regret of O(√T log T). The regret lower bound for Online Convex Optimization, however, is Ω(√T), and to the best of our knowledge, no result for the Online Non-Convex Learning problem achieves the same bound. This paper proposes the Online Recursive Weighting algorithm with regret of O(√T), matching the tight regret lower bound for the Online Convex Optimization problem, and fills the regret gap between the state-of-the-art results in the online convex and non-convex optimization problems.
{"title":"An Optimal Algorithm for Online Non-Convex Learning","authors":"L. Yang, Lei Deng, M. Hajiesmaili, Cheng Tan, W. Wong","doi":"10.1145/3219617.3219635","DOIUrl":"https://doi.org/10.1145/3219617.3219635","url":null,"abstract":"In many online learning paradigms, convexity plays a central role in the derivation and analysis of online learning algorithms. The results, however, do not extend to non-convex settings, which many recent applications require. The Online Non-Convex Learning problem generalizes the classic Online Convex Optimization framework by relaxing the convexity assumption on the cost function (to a Lipschitz continuous function) and the decision set. The state-of-the-art result for the Online Non-Convex Learning problem demonstrates that the classic Hedge algorithm attains a sublinear regret of O(√T log T). The regret lower bound for Online Convex Optimization, however, is Ω(√T), and to the best of our knowledge, no result for the Online Non-Convex Learning problem achieves the same bound. This paper proposes the Online Recursive Weighting algorithm with regret of O(√T), matching the tight regret lower bound for the Online Convex Optimization problem, and fills the regret gap between the state-of-the-art results in the online convex and non-convex optimization problems.","PeriodicalId":210440,"journal":{"name":"Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131121629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
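The Hedge baseline named in the online non-convex learning abstract above can be sketched on a discretized decision set. This is not the Online Recursive Weighting algorithm, whose details the abstract does not give. The grid of 50 points on [0, 1], the sinusoidal cost function, and the step size are assumptions made for illustration. With losses in [0, 1] and step size √(8 ln N / T), the standard analysis of Hedge bounds the expected regret against the best fixed grid point by √(T ln N / 2); the sketch ignores the additional error introduced by discretizing a continuous set.

```python
import math


def _normalize(log_w):
    """Turn log-weights into a probability vector (numerically stable)."""
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [x / total for x in w]


def hedge(loss_rounds, eta):
    """Run Hedge over per-round loss vectors with entries in [0, 1].

    Returns the final distribution over actions and the cumulative
    expected loss of the randomized player.
    """
    n = len(loss_rounds[0])
    log_w = [0.0] * n
    expected_loss = 0.0
    for losses in loss_rounds:
        probs = _normalize(log_w)
        expected_loss += sum(p * l for p, l in zip(probs, losses))
        # Exponential-weights update: penalize each action by its loss.
        log_w = [w - eta * l for w, l in zip(log_w, losses)]
    return _normalize(log_w), expected_loss


# Hypothetical setup: a non-convex, Lipschitz cost on a grid over [0, 1].
N, T = 50, 2000
grid = [i / (N - 1) for i in range(N)]


def cost(x, t):
    """Illustrative non-convex cost with values in [0, 1]."""
    return 0.5 * (1.0 + math.sin(12.0 * x + (t % 3)))


rounds = [[cost(x, t) for x in grid] for t in range(T)]
eta = math.sqrt(8.0 * math.log(N) / T)
probs, alg_loss = hedge(rounds, eta)

best_fixed = min(sum(r[i] for r in rounds) for i in range(N))
regret = alg_loss - best_fixed
bound = math.sqrt(T * math.log(N) / 2.0)
print(f"regret = {regret:.2f}, bound = {bound:.2f}")
```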
Graph-based semi-supervised learning (SSL) algorithms predict labels for all nodes based on provided labels of a small set of seed nodes. Classic methods capture the graph structure through some underlying diffusion process that propagates through the graph edges. Spectral diffusion, which includes personalized PageRank and label propagation, propagates through random walks. Social diffusion propagates through shortest paths. These diffusions are linear in the sense of not distinguishing between contributions of a few "strong" relations and many "weak" relations. Recent methods such as node embeddings and graph convolutional networks (GCN) attained significant gains in quality for SSL tasks. These methods vary in how the graph structure, seed label information, and other features are used, but share a common thread of nonlinearity that suppresses weak relations and reinforces stronger ones. Aiming for quality gain with more scalable methods, we revisit classic linear diffusion methods and place them in a self-training framework. The resulting bootstrapped diffusions are nonlinear in that they reinforce stronger relations, as with the more complex methods. Surprisingly, we observe that SSL with bootstrapped diffusions not only significantly improves over the respective non-bootstrapped baselines but also outperforms state-of-the-art SSL methods. Moreover, since the self-training wrapper retains the scalability of the base method, we obtain both higher quality and better scalability.
{"title":"Bootstrapped Graph Diffusions: Exposing the Power of Nonlinearity","authors":"Eliav Buchnik, E. Cohen","doi":"10.1145/3219617.3219621","DOIUrl":"https://doi.org/10.1145/3219617.3219621","url":null,"abstract":"Graph-based semi-supervised learning (SSL) algorithms predict labels for all nodes based on provided labels of a small set of seed nodes. Classic methods capture the graph structure through some underlying diffusion process that propagates through the graph edges. Spectral diffusion, which includes personalized PageRank and label propagation, propagates through random walks. Social diffusion propagates through shortest paths. These diffusions are linear in the sense of not distinguishing between contributions of a few \"strong\" relations and many \"weak\" relations. Recent methods such as node embeddings and graph convolutional networks (GCN) attained significant gains in quality for SSL tasks. These methods vary in how the graph structure, seed label information, and other features are used, but share a common thread of nonlinearity that suppresses weak relations and reinforces stronger ones. Aiming for quality gain with more scalable methods, we revisit classic linear diffusion methods and place them in a self-training framework. The resulting bootstrapped diffusions are nonlinear in that they reinforce stronger relations, as with the more complex methods. Surprisingly, we observe that SSL with bootstrapped diffusions not only significantly improves over the respective non-bootstrapped baselines but also outperforms state-of-the-art SSL methods. 
Moreover, since the self-training wrapper retains the scalability of the base method, we obtain both higher quality and better scalability.","PeriodicalId":210440,"journal":{"name":"Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems","volume":"26 35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127837653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
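The self-training idea in the bootstrapped diffusions abstract above can be sketched as a wrapper around plain label propagation: run the linear diffusion, promote the most confident predictions to seeds, and repeat. The abstract does not specify the authors' diffusion variants, confidence measure, or schedule, so the neighbor-averaging propagation, the top-two score margin, the number of rounds, and the small two-cluster example graph are all assumptions for illustration. The sketch also assumes every node has at least one neighbor.

```python
def propagate(adj, seeds, classes, iters=50):
    """Label propagation: seed scores are clamped to one-hot vectors and
    every other node repeatedly takes the average of its neighbors' scores."""
    scores = {v: {c: 0.0 for c in classes} for v in adj}
    for v, c in seeds.items():
        scores[v][c] = 1.0
    for _ in range(iters):
        new = {}
        for v, nbrs in adj.items():
            if v in seeds:
                new[v] = scores[v]
            else:
                new[v] = {c: sum(scores[u][c] for u in nbrs) / len(nbrs)
                          for c in classes}
        scores = new
    return scores


def bootstrapped_propagate(adj, seeds, classes, rounds=3, add_per_round=2):
    """Self-training wrapper: after each diffusion, the unlabeled nodes with
    the largest margin between their top two scores become new seeds."""
    seeds = dict(seeds)
    for _ in range(rounds):
        scores = propagate(adj, seeds, classes)
        candidates = []
        for v in adj:
            if v in seeds:
                continue
            ranked = sorted(scores[v].values(), reverse=True)
            margin = ranked[0] - (ranked[1] if len(ranked) > 1 else 0.0)
            candidates.append((margin, v))
        # Most confident first; ties broken by node id for determinism.
        candidates.sort(key=lambda t: (-t[0], t[1]))
        for _, v in candidates[:add_per_round]:
            seeds[v] = max(classes, key=lambda c: scores[v][c])
    scores = propagate(adj, seeds, classes)
    return {v: max(classes, key=lambda c: scores[v][c]) for v in adj}


# Hypothetical graph: two 4-node cliques joined by the single edge 3-4.
adj = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
    4: [3, 5, 6, 7], 5: [4, 6, 7], 6: [4, 5, 7], 7: [4, 5, 6],
}
labels = bootstrapped_propagate(adj, {0: "a", 7: "b"}, ["a", "b"])
print(labels)
```

The wrapper only calls the base diffusion a fixed number of times, which is the sense in which a self-training scheme can keep the scalability of the method it wraps.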
Contemporary Graphics Processing Units (GPUs) are used to accelerate highly parallel compute workloads. For the last decade, researchers in academia and industry have used cycle-level GPU architecture simulators to evaluate future designs. This paper performs an in-depth analysis of commonly accepted GPU simulation methodology, examining the effect both the workload and the choice of instruction set architecture have on the accuracy of a widely-used simulation infrastructure, GPGPU-Sim. We analyze numerous aspects of the architecture, validating the simulation results against real hardware. Based on a characterized set of over 1700 GPU kernels, we demonstrate that while the relative accuracy of compute-intensive workloads is high, inaccuracies in modeling the memory system result in much higher error when memory performance is critical. We then perform a case study using a recently proposed GPU architecture modification, demonstrating that the cross-product of workload characteristics and instruction set architecture choice can have an effect on the predicted efficacy of the technique.
{"title":"A Quantitative Evaluation of Contemporary GPU Simulation Methodology","authors":"Akshay Jain, Mahmoud Khairy, Timothy G. Rogers","doi":"10.1145/3219617.3219658","DOIUrl":"https://doi.org/10.1145/3219617.3219658","url":null,"abstract":"Contemporary Graphics Processing Units (GPUs) are used to accelerate highly parallel compute workloads. For the last decade, researchers in academia and industry have used cycle-level GPU architecture simulators to evaluate future designs. This paper performs an in-depth analysis of commonly accepted GPU simulation methodology, examining the effect both the workload and the choice of instruction set architecture have on the accuracy of a widely-used simulation infrastructure, GPGPU-Sim. We analyze numerous aspects of the architecture, validating the simulation results against real hardware. Based on a characterized set of over 1700 GPU kernels, we demonstrate that while the relative accuracy of compute-intensive workloads is high, inaccuracies in modeling the memory system result in much higher error when memory performance is critical. 
We then perform a case study using a recently proposed GPU architecture modification, demonstrating that the cross-product of workload characteristics and instruction set architecture choice can have an effect on the predicted efficacy of the technique.","PeriodicalId":210440,"journal":{"name":"Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122244451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mowei Wang, Yong Cui, Shihan Xiao, Xin Wang, Dan Yang, Kai Chen, Jun Zhu
The emerging optical/wireless topology reconfiguration technologies have shown great potential in improving the performance of data center networks. However, they also pose a significant challenge: how to find the best topology configurations to support dynamic traffic demands. In this work, we present xWeaver, a traffic-driven deep learning solution to infer the high-performance network topology online. xWeaver supports a powerful network model that enables topology optimization over different performance metrics and network architectures. With the design of properly structured neural networks, it can automatically derive the critical traffic patterns from data traces and learn the underlying mapping between the traffic patterns and topology configurations specific to the target data center. After offline training, xWeaver generates the optimized (or near-optimal) topology configuration online, and can also smoothly update its model parameters for new traffic patterns. Experimental results show that xWeaver achieves a significant performance gain in reducing flow completion time.
{"title":"Neural Network Meets DCN: Traffic-driven Topology Adaptation with Deep Learning","authors":"Mowei Wang, Yong Cui, Shihan Xiao, Xin Wang, Dan Yang, Kai Chen, Jun Zhu","doi":"10.1145/3219617.3219656","DOIUrl":"https://doi.org/10.1145/3219617.3219656","url":null,"abstract":"The emerging optical/wireless topology reconfiguration technologies have shown great potential in improving the performance of data center networks. However, they also pose a significant challenge: how to find the best topology configurations to support dynamic traffic demands. In this work, we present xWeaver, a traffic-driven deep learning solution to infer the high-performance network topology online. xWeaver supports a powerful network model that enables topology optimization over different performance metrics and network architectures. With the design of properly structured neural networks, it can automatically derive the critical traffic patterns from data traces and learn the underlying mapping between the traffic patterns and topology configurations specific to the target data center. After offline training, xWeaver generates the optimized (or near-optimal) topology configuration online, and can also smoothly update its model parameters for new traffic patterns. 
Experimental results show that xWeaver achieves a significant performance gain in reducing flow completion time.","PeriodicalId":210440,"journal":{"name":"Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116542322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems","authors":"","doi":"10.1145/3219617","DOIUrl":"https://doi.org/10.1145/3219617","url":null,"abstract":"","PeriodicalId":210440,"journal":{"name":"Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131617836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Resource Management II","authors":"Nicolas Gast","doi":"10.1145/3258595","DOIUrl":"https://doi.org/10.1145/3258595","url":null,"abstract":"","PeriodicalId":210440,"journal":{"name":"Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116501310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This editorial announces a new series on diversity in the ACM Sigmetrics Performance Evaluation Review (PER). In several upcoming and future issues we will feature invited articles on diversity from authors in the performance evaluation community, but also from the larger Computing Science (CS) community. The articles will touch various aspects in CS including K-12 and post-secondary education, graduate studies, academic recruitment, industry perspectives, harassment issues, and gender, ethnicity, and racial bias.
{"title":"ACM Sigmetrics Performance Evaluation Review: A New Series on Diversity","authors":"N. Hegde","doi":"10.1145/3219617.3219675","DOIUrl":"https://doi.org/10.1145/3219617.3219675","url":null,"abstract":"This editorial announces a new series on diversity in the ACM Sigmetrics Performance Evaluation Review (PER). In several upcoming and future issues we will feature invited articles on diversity from authors in the performance evaluation community, but also from the larger Computing Science (CS) community. The articles will touch various aspects in CS including K-12 and post-secondary education, graduate studies, academic recruitment, industry perspectives, harassment issues, and gender, ethnicity, and racial bias.","PeriodicalId":210440,"journal":{"name":"Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121976551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper studies optimal control subject to changing conditions. This area has recently received a lot of attention, as it arises in numerous practical situations. Applications include cloud computing systems with fluctuating arrival rates and the time-varying capacity encountered in power-aware systems or wireless downlink channels. To study this, we focus on a restless bandit model, which has proved to be a powerful stochastic optimization framework for modeling the scheduling of activities. This paper is a first step toward its optimal control when restless bandits are subject to changing conditions. We consider the restless bandit problem in an asymptotic regime, which is obtained by letting the population of bandits grow large, and letting the environment change relatively fast. We present sufficient conditions for a policy to be asymptotically optimal and show that a set of priority policies satisfies these. Under an indexability assumption, an averaged version of Whittle's index policy is proved to be inside this set of asymptotically optimal policies. The performance of the averaged Whittle's index policy is numerically evaluated for a multi-class scheduling problem.
{"title":"Asymptotic Optimal Control of Markov-Modulated Restless Bandits","authors":"Santiago Duran, I. M. Verloop","doi":"10.1145/3219617.3219636","DOIUrl":"https://doi.org/10.1145/3219617.3219636","url":null,"abstract":"This paper studies optimal control subject to changing conditions. This area has recently received a lot of attention, as it arises in numerous practical situations. Applications include cloud computing systems with fluctuating arrival rates and the time-varying capacity encountered in power-aware systems or wireless downlink channels. To study this, we focus on a restless bandit model, which has proved to be a powerful stochastic optimization framework for modeling the scheduling of activities. This paper is a first step toward its optimal control when restless bandits are subject to changing conditions. We consider the restless bandit problem in an asymptotic regime, which is obtained by letting the population of bandits grow large, and letting the environment change relatively fast. We present sufficient conditions for a policy to be asymptotically optimal and show that a set of priority policies satisfies these. Under an indexability assumption, an averaged version of Whittle's index policy is proved to be inside this set of asymptotically optimal policies. 
The performance of the averaged Whittle's index policy is numerically evaluated for a multi-class scheduling problem.","PeriodicalId":210440,"journal":{"name":"Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127431981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Networking","authors":"B. V. Houdt","doi":"10.1145/3258591","DOIUrl":"https://doi.org/10.1145/3258591","url":null,"abstract":"","PeriodicalId":210440,"journal":{"name":"Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114345016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}