Modern data center storage systems are invariably networked to allow for consolidation and flexible management of storage. They also include high performance storage devices based on flash or other emerging technologies, generally accessed through low-latency and high throughput protocols such as NVMe (or its derivatives) carried over the network. With the increasing complexity and data-centric nature of the applications, properly configuring the quality of service (QoS) for the storage path has become crucial for ensuring the desired application performance. Such QoS is substantially influenced by the QoS in the network path, in the access protocol, and in the storage device. In this paper, we define a new transport level QoS mechanism for the network segment and demonstrate how it can augment and coordinate with the access level QoS mechanism defined for NVMe, and a similar QoS mechanism configured in the device. We show that the transport QoS mechanism not only provides the desired QoS to different classes of storage accesses but is also able to protect the access to the shared persistent memory (PM) devices located along with the storage but requiring much lower latency than storage. We demonstrate that a proper coordinated configuration of the 3 QoS’es on the path is crucial to achieve the desired differentiation depending on where the bottlenecks appear.
{"title":"Configuring and Coordinating End-to-End QoS for Emerging Storage Infrastructure","authors":"Jit Gupta, Krishna Kant, Amitangshu Pal, Joyanta Biswas","doi":"10.1145/3631606","DOIUrl":"https://doi.org/10.1145/3631606","url":null,"abstract":"Modern data center storage systems are invariably networked to allow for consolidation and flexible management of storage. They also include high performance storage devices based on flash or other emerging technologies, generally accessed through low-latency and high throughput protocols such as NVMe (or its derivatives) carried over the network. With the increasing complexity and data-centric nature of the applications, properly configuring the quality of service (QoS) for the storage path has become crucial for ensuring the desired application performance. Such QoS is substantially influenced by the QoS in the network path, in the access protocol, and in the storage device. In this paper, we define a new transport level QoS mechanism for the network segment and demonstrate how it can augment and coordinate with the access level QoS mechanism defined for NVMe, and a similar QoS mechanism configured in the device. We show that the transport QoS mechanism not only provides the desired QoS to different classes of storage accesses but is also able to protect the access to the shared persistent memory (PM) devices located along with the storage but requiring much lower latency than storage. We demonstrate that a proper coordinated configuration of the 3 QoS’es on the path is crucial to achieve the desired differentiation depending on where the bottlenecks appear.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"113 31","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135137751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider a non-preemptive multiserver queue with multiple priority classes. We assume distinct exponentially distributed service times and separate quasi-Poisson arrival processes with a predefined maximum number of requests that can be present in the system for each class. We present an approximation method to obtain the steady-state probabilities for the number of requests of each class in our system. In our method, the priority levels (classes) are solved “nearly separately”, linked only by certain conditional probabilities determined approximately from the solution of other priority levels. Several numerical examples illustrate the accuracy of our approximate solution. The proposed approach significantly reduces the complexity of the problem while featuring generally good accuracy.
{"title":"An approximation method for a non-preemptive multiserver queue with quasi-Poisson arrivals","authors":"Alexandre Brandwajn, Thomas Begin","doi":"10.1145/3624474","DOIUrl":"https://doi.org/10.1145/3624474","url":null,"abstract":"We consider a non-preemptive multiserver queue with multiple priority classes. We assume distinct exponentially distributed service times and separate quasi-Poisson arrival processes with a predefined maximum number of requests that can be present in the system for each class. We present an approximation method to obtain the steady-state probabilities for the number of requests of each class in our system. In our method, the priority levels (classes) are solved “nearly separately”, linked only by certain conditional probabilities determined approximately from the solution of other priority levels. Several numerical examples illustrate the accuracy of our approximate solution. The proposed approach significantly reduces the complexity of the problem while featuring generally good accuracy.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"154 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135736224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
E. Amparore, M. Beccuti, P. Castagno, S. Pernice, G. Franceschinis, M. Pennisi
Computational modeling has become a widespread approach for studying real-world phenomena by using different modeling perspectives, in particular, the microscopic point of view concentrates on the behavior of the single components and their interactions from which the global system evolution emerges, while the macroscopic point of view represents the system’s overall behavior abstracting as much as possible from that of the single components. The preferred point of view depends on the effort required to develop the model, on the detail level of the available information about the system to be modeled, and on the type of measures that are of interest to the modeler; each point of view may lead to a different modeling language and simulation paradigm. An approach adequate for the microscopic point of view is Agent-Based Modeling and Simulation, which has gained popularity in the last few decades but lacks a formal definition common to the different tools supporting it. This may lead to modeling mistakes and wrong interpretation of the results, especially when comparing models of the same system developed according to different points of view. The aim of the work described in this paper is to provide a common compositional modeling language from which both a macro and a micro simulation model can be automatically derived: these models are coherent by construction and may be studied through different simulation approaches and tools. A framework is thus proposed in which a model can be composed using a Petri Net formalism and then studied through both an Agent-Based Simulation and a classical Stochastic Simulation Algorithm, depending on the study goal.
{"title":"From compositional Petri Net modeling to macro and micro simulation by means of Stochastic Simulation and Agent-Based models","authors":"E. Amparore, M. Beccuti, P. Castagno, S. Pernice, G. Franceschinis, M. Pennisi","doi":"10.1145/3617681","DOIUrl":"https://doi.org/10.1145/3617681","url":null,"abstract":"Computational modeling has become a widespread approach for studying real-world phenomena by using different modeling perspectives, in particular, the microscopic point of view concentrates on the behavior of the single components and their interactions from which the global system evolution emerges, while the macroscopic point of view represents the system’s overall behavior abstracting as much as possible from that of the single components. The preferred point of view depends on the effort required to develop the model, on the detail level of the available information about the system to be modeled, and on the type of measures that are of interest to the modeler; each point of view may lead to a different modeling language and simulation paradigm. An approach adequate for the microscopic point of view is Agent-Based Modeling and Simulation, which has gained popularity in the last few decades but lacks a formal definition common to the different tools supporting it. This may lead to modeling mistakes and wrong interpretation of the results, especially when comparing models of the same system developed according to different points of view. The aim of the work described in this paper is to provide a common compositional modeling language from which both a macro and a micro simulation model can be automatically derived: these models are coherent by construction and may be studied through different simulation approaches and tools. A framework is thus proposed in which a model can be composed using a Petri Net formalism and then studied through both an Agent-Based Simulation and a classical Stochastic Simulation Algorithm, depending on the study goal.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"1 1","pages":""},"PeriodicalIF":0.6,"publicationDate":"2023-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43090262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tareq Si Salem, Giovanni Neglia, Stratis Ioannidis
We study an online caching problem in which requests can be served by a local cache to avoid retrieval costs from a remote server. The cache can update its state after a batch of requests and store an arbitrarily small fraction of each file. We study no-regret algorithms based on Online Mirror Descent (OMD) strategies. We show that bounds for the regret crucially depend on the diversity of the request process, provided by the diversity ratio R/h , where R is the size of the batch and h is the maximum multiplicity of a request in a given batch. We characterize the optimality of OMD caching policies w.r.t. regret under different diversity regimes. We also prove that, when the cache must store the entire file, rather than a fraction, OMD strategies can be coupled with a randomized rounding scheme that preserves regret guarantees, even when update costs cannot be neglected. We provide a formal characterization of the rounding problem through optimal transport theory, and moreover we propose a computationally efficient randomized rounding scheme.
{"title":"No-regret Caching via Online Mirror Descent","authors":"Tareq Si Salem, Giovanni Neglia, Stratis Ioannidis","doi":"10.1145/3605209","DOIUrl":"https://doi.org/10.1145/3605209","url":null,"abstract":"We study an online caching problem in which requests can be served by a local cache to avoid retrieval costs from a remote server. The cache can update its state after a batch of requests and store an arbitrarily small fraction of each file. We study no-regret algorithms based on Online Mirror Descent (OMD) strategies. We show that bounds for the regret crucially depend on the diversity of the request process, provided by the diversity ratio R/h , where R is the size of the batch and h is the maximum multiplicity of a request in a given batch. We characterize the optimality of OMD caching policies w.r.t. regret under different diversity regimes. We also prove that, when the cache must store the entire file, rather than a fraction, OMD strategies can be coupled with a randomized rounding scheme that preserves regret guarantees, even when update costs cannot be neglected. We provide a formal characterization of the rounding problem through optimal transport theory, and moreover we propose a computationally efficient randomized rounding scheme.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135396955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ashok Krishnan K. S., C. Singh, S. T. Maguluri, Parimal Parag
We study optimal pricing in a single server queue when the customers valuation of service depends on their waiting time. In particular, we consider a very general model, where the customer valuations are random and are sampled from a distribution that depends on the queue length. The goal of the service provider is to set dynamic state dependent prices in order to maximize its revenue, while also managing congestion. We model the problem as a Markov decision process and present structural results on the optimal policy. We also present an algorithm to find an approximate optimal policy. We further present a myopic policy that is easy to evaluate and present bounds on its performance. We finally illustrate the quality of our approximate solution and the myopic solution using numerical simulations.
{"title":"Optimal Pricing in a Single Server System","authors":"Ashok Krishnan K. S., C. Singh, S. T. Maguluri, Parimal Parag","doi":"10.1145/3607252","DOIUrl":"https://doi.org/10.1145/3607252","url":null,"abstract":"We study optimal pricing in a single server queue when the customers valuation of service depends on their waiting time. In particular, we consider a very general model, where the customer valuations are random and are sampled from a distribution that depends on the queue length. The goal of the service provider is to set dynamic state dependent prices in order to maximize its revenue, while also managing congestion. We model the problem as a Markov decision process and present structural results on the optimal policy. We also present an algorithm to find an approximate optimal policy. We further present a myopic policy that is easy to evaluate and present bounds on its performance. We finally illustrate the quality of our approximate solution and the myopic solution using numerical simulations.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"8 1","pages":"1 - 32"},"PeriodicalIF":0.6,"publicationDate":"2023-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41684381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider a horizontal and dynamic auto-scaling technique in a cloud system where virtual machines hosted on a physical node are turned on and off to minimise energy consumption while meeting performance requirements. Finding cloud management policies that adapt the system to the load is not straightforward, and we consider here that virtual machines are turned on and off depending on queue load thresholds. We want to compute the optimal threshold values that minimize consumption costs and penalty costs (when performance requirements are not met). To solve this problem, we propose several optimisation methods, based on two different mathematical approaches. The first one is based on queueing theory and uses local search heuristics coupled with the stationary distributions of Markov chains. The second approach tackles the problem using Markov Decision Process (MDP) in which we assume that the policy is of a special multi-threshold type called hysteresis. We improve the heuristics of the former approach with the aggregation of Markov chains and queues approximation techniques. We assess the benefit of threshold-aware algorithms for solving MDPs. Then we carry out theoretical analyzes of the two approaches. We also compare them numerically and we show that all of the presented MDP algorithms strongly outperform the local search heuristics. Finally, we propose a cost model for a real scenario of a cloud system to apply our optimisation algorithms and to show their practical relevance. The major scientific contribution of the article is a set of fast (almost in real time) load-based threshold computation methods that can be used by a cloud provider to optimize its financial costs.
{"title":"Efficient Computation of Optimal Thresholds in Cloud Auto-scaling Systems","authors":"Thomas Tournaire, Hind Castel-Taleb, E. Hyon","doi":"10.1145/3603532","DOIUrl":"https://doi.org/10.1145/3603532","url":null,"abstract":"We consider a horizontal and dynamic auto-scaling technique in a cloud system where virtual machines hosted on a physical node are turned on and off to minimise energy consumption while meeting performance requirements. Finding cloud management policies that adapt the system to the load is not straightforward, and we consider here that virtual machines are turned on and off depending on queue load thresholds. We want to compute the optimal threshold values that minimize consumption costs and penalty costs (when performance requirements are not met). To solve this problem, we propose several optimisation methods, based on two different mathematical approaches. The first one is based on queueing theory and uses local search heuristics coupled with the stationary distributions of Markov chains. The second approach tackles the problem using Markov Decision Process (MDP) in which we assume that the policy is of a special multi-threshold type called hysteresis. We improve the heuristics of the former approach with the aggregation of Markov chains and queues approximation techniques. We assess the benefit of threshold-aware algorithms for solving MDPs. Then we carry out theoretical analyzes of the two approaches. We also compare them numerically and we show that all of the presented MDP algorithms strongly outperform the local search heuristics. Finally, we propose a cost model for a real scenario of a cloud system to apply our optimisation algorithms and to show their practical relevance. The major scientific contribution of the article is a set of fast (almost in real time) load-based threshold computation methods that can be used by a cloud provider to optimize its financial costs.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"8 1","pages":"1 - 31"},"PeriodicalIF":0.6,"publicationDate":"2023-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41511671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gargi Alavani, Jineet Desai, Snehanshu Saha, S. Sarkar
The General Purpose Graphics Processing Unit has secured a prominent position in the High-Performance Computing world due to its performance gain and programmability. Understanding the relationship between Graphics Processing Unit (GPU) power consumption and program features can aid developers in building energy-efficient sustainable applications. In this work, we propose a static analysis-based power model built using machine learning techniques. We have investigated six machine learning models across three NVIDIA GPU architectures: Kepler, Maxwell, and Volta with Random Forest, Extra Trees, Gradient Boosting, CatBoost, and XGBoost reporting favorable results. We observed that the XGBoost technique-based prediction model is the most efficient technique with an R2 value of 0.9646 on Volta Architecture. The dataset used for these techniques includes kernels from different benchmarks suits, sizes, nature (e.g., compute-bound, memory-bound), and complexity (e.g., control divergence, memory access patterns). Experimental results suggest that the proposed solution can help developers precisely predict GPU applications power consumption using program analysis across GPU architectures. Developers can use this approach to refactor their code to build energy-efficient GPU applications.
{"title":"Program Analysis and Machine Learning–based Approach to Predict Power Consumption of CUDA Kernel","authors":"Gargi Alavani, Jineet Desai, Snehanshu Saha, S. Sarkar","doi":"10.1145/3603533","DOIUrl":"https://doi.org/10.1145/3603533","url":null,"abstract":"The General Purpose Graphics Processing Unit has secured a prominent position in the High-Performance Computing world due to its performance gain and programmability. Understanding the relationship between Graphics Processing Unit (GPU) power consumption and program features can aid developers in building energy-efficient sustainable applications. In this work, we propose a static analysis-based power model built using machine learning techniques. We have investigated six machine learning models across three NVIDIA GPU architectures: Kepler, Maxwell, and Volta with Random Forest, Extra Trees, Gradient Boosting, CatBoost, and XGBoost reporting favorable results. We observed that the XGBoost technique-based prediction model is the most efficient technique with an R2 value of 0.9646 on Volta Architecture. The dataset used for these techniques includes kernels from different benchmarks suits, sizes, nature (e.g., compute-bound, memory-bound), and complexity (e.g., control divergence, memory access patterns). Experimental results suggest that the proposed solution can help developers precisely predict GPU applications power consumption using program analysis across GPU architectures. Developers can use this approach to refactor their code to build energy-efficient GPU applications.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"8 1","pages":"1 - 24"},"PeriodicalIF":0.6,"publicationDate":"2023-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43472653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Walther Carballo-Hernández, M. Pelcat, S. Bhattacharyya, R. C. Galán, F. Berry
The introduction of deep learning algorithms, such as Convolutional Neural Networks (CNNs) in many near-sensor embedded systems, opens new challenges in terms of energy efficiency and hardware performance. An emerging solution to address these challenges is to use tailored heterogeneous hardware accelerators combining processing elements of different architectural natures such as Central Processing Unit (CPU), Graphics Processing Unit (GPU), Field Programmable Gate Array (FPGA), or Application Specific Integrated Circuit (ASIC). To progress towards heterogeneity, a great asset would be an automated design space exploration tool that chooses, for each accelerated partition of a CNN, the most appropriate architecture considering available resources. To feed such a design space exploration process, models are required that provide very fast yet precise evaluations of alternative architectures or alternative forms of CNNs. Quick configuration estimation could be achieved with few parameters from representative input sequences. This article studies a solution called flydeling (as a contraction of flyweight modeling) for obtaining these models by inspiring from the black-box System Identification (SI) domain. We refer to models derived using the proposed approach as flyweight models (flydels). A methodology is proposed to generate these flydels, using CNN properties as predictor features together with SI techniques with a stochastic excitation input at a feature map dimensions level. For an embedded CPU-FPGA-GPU heterogeneous platform, it is demonstrated that it is possible to learn these Key Performance Indicators (KPIs) flydels at an early design stage and from high-level application features. For latency, energy, and resource utilization, flydels obtain estimation errors varying between 5% and 10% with less model parameters compared to state-of-the-art solutions and are built automatically from platform measurements.
{"title":"Flydeling: Streamlined Performance Models for Hardware Acceleration of CNNs through System Identification","authors":"Walther Carballo-Hernández, M. Pelcat, S. Bhattacharyya, R. C. Galán, F. Berry","doi":"10.1145/3594870","DOIUrl":"https://doi.org/10.1145/3594870","url":null,"abstract":"The introduction of deep learning algorithms, such as Convolutional Neural Networks (CNNs) in many near-sensor embedded systems, opens new challenges in terms of energy efficiency and hardware performance. An emerging solution to address these challenges is to use tailored heterogeneous hardware accelerators combining processing elements of different architectural natures such as Central Processing Unit (CPU), Graphics Processing Unit (GPU), Field Programmable Gate Array (FPGA), or Application Specific Integrated Circuit (ASIC). To progress towards heterogeneity, a great asset would be an automated design space exploration tool that chooses, for each accelerated partition of a CNN, the most appropriate architecture considering available resources. To feed such a design space exploration process, models are required that provide very fast yet precise evaluations of alternative architectures or alternative forms of CNNs. Quick configuration estimation could be achieved with few parameters from representative input sequences. This article studies a solution called flydeling (as a contraction of flyweight modeling) for obtaining these models by inspiring from the black-box System Identification (SI) domain. We refer to models derived using the proposed approach as flyweight models (flydels). A methodology is proposed to generate these flydels, using CNN properties as predictor features together with SI techniques with a stochastic excitation input at a feature map dimensions level. For an embedded CPU-FPGA-GPU heterogeneous platform, it is demonstrated that it is possible to learn these Key Performance Indicators (KPIs) flydels at an early design stage and from high-level application features. For latency, energy, and resource utilization, flydels obtain estimation errors varying between 5% and 10% with less model parameters compared to state-of-the-art solutions and are built automatically from platform measurements.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"8 1","pages":"1 - 33"},"PeriodicalIF":0.6,"publicationDate":"2023-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43919677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wenkai Dai, Klaus-Tycho Foerster, David Fuchssteiner, Stefan Schmid
Emerging reconfigurable data centers introduce unprecedented flexibility in how the physical layer can be programmed to adapt to current traffic demands. These reconfigurable topologies are commonly hybrid, consisting of static and reconfigurable links, enabled by e.g., an Optical Circuit Switch (OCS) connected to top-of-rack switches in Clos networks. Even though prior work has showcased the practical benefits of hybrid networks, several crucial performance aspects are not well understood. For example, many systems enforce artificial segregation of the hybrid network parts, leaving money on the table. In this article, we study the algorithmic problem of how to jointly optimize topology and routing in reconfigurable data centers, in order to optimize a most fundamental metric, maximum link load. The complexity of reconfiguration mechanisms in this space is unexplored at large, especially for the following cross-layer network-design problem: given a hybrid network and a traffic matrix, jointly design the physical layer and the flow routing in order to minimize the maximum link load. We chart the corresponding algorithmic landscape in our work, investigating both un-/splittable flows and (non-)segregated routing policies. A topological complexity classification of the problem reveals NP-hardness in general for network topologies that are trees of depth at least two, in contrast to the tractability on trees of depth one. We moreover prove that the problem is not submodular for all these routing policies, even in multi-layer trees. However, networks that can be abstracted by a single packet switch (e.g., nonblocking Fat-Tree topologies) can be optimized efficiently, and we present optimal polynomial-time algorithms accordingly. We complement our theoretical results with trace-driven simulation studies, where our algorithms can significantly improve the network load in comparison to the state-of-the-art.
{"title":"Load-optimization in Reconfigurable Data-center Networks: Algorithms and Complexity of Flow Routing","authors":"Wenkai Dai, Klaus-Tycho Foerster, David Fuchssteiner, Stefan Schmid","doi":"10.1145/3597200","DOIUrl":"https://doi.org/10.1145/3597200","url":null,"abstract":"Emerging reconfigurable data centers introduce unprecedented flexibility in how the physical layer can be programmed to adapt to current traffic demands. These reconfigurable topologies are commonly hybrid, consisting of static and reconfigurable links, enabled by e.g., an Optical Circuit Switch (OCS) connected to top-of-rack switches in Clos networks. Even though prior work has showcased the practical benefits of hybrid networks, several crucial performance aspects are not well understood. For example, many systems enforce artificial segregation of the hybrid network parts, leaving money on the table. In this article, we study the algorithmic problem of how to jointly optimize topology and routing in reconfigurable data centers, in order to optimize a most fundamental metric, maximum link load. The complexity of reconfiguration mechanisms in this space is unexplored at large, especially for the following cross-layer network-design problem: given a hybrid network and a traffic matrix, jointly design the physical layer and the flow routing in order to minimize the maximum link load. We chart the corresponding algorithmic landscape in our work, investigating both un-/splittable flows and (non-)segregated routing policies. A topological complexity classification of the problem reveals NP-hardness in general for network topologies that are trees of depth at least two, in contrast to the tractability on trees of depth one. We moreover prove that the problem is not submodular for all these routing policies, even in multi-layer trees. However, networks that can be abstracted by a single packet switch (e.g., nonblocking Fat-Tree topologies) can be optimized efficiently, and we present optimal polynomial-time algorithms accordingly. We complement our theoretical results with trace-driven simulation studies, where our algorithms can significantly improve the network load in comparison to the state-of-the-art.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"8 1","pages":"1 - 30"},"PeriodicalIF":0.6,"publicationDate":"2023-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47120363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider a single server queueing system with two classes of jobs: eager jobs with small sizes that require service to begin almost immediately upon arrival, and tolerant jobs with larger sizes that can wait for service. While blocking probability is the relevant performance metric for the eager class, the tolerant class seeks to minimize its mean sojourn time. In this article, we analyse the performance of each class under dynamic scheduling policies, where the scheduling of both classes depends on the instantaneous state of the system. This analysis is carried out under a certain fluid limit, where the arrival rate and service rate of the eager class are scaled to infinity, holding the offered load constant. Our performance characterizations reveal a (dynamic) pseudo-conservation law that ties the performance of both classes to the standalone blocking probabilities associated with the scheduling policies for the eager class. Furthermore, the performance is robust to other specifics of the scheduling policies. We also characterize the Pareto frontier of the achievable region of performance vectors under the same fluid limit, and identify a (two-parameter) class of Pareto-complete scheduling policies.
{"title":"Dynamic Scheduling in a Partially Fluid, Partially Lossy Queueing System","authors":"Kiran Chaudhary, Veeraruna Kavitha, Jayakrishnan Nair","doi":"10.1145/3582884","DOIUrl":"https://doi.org/10.1145/3582884","url":null,"abstract":"We consider a single server queueing system with two classes of jobs: eager jobs with small sizes that require service to begin almost immediately upon arrival, and tolerant jobs with larger sizes that can wait for service. While blocking probability is the relevant performance metric for the eager class, the tolerant class seeks to minimize its mean sojourn time. In this article, we analyse the performance of each class under dynamic scheduling policies, where the scheduling of both classes depends on the instantaneous state of the system. This analysis is carried out under a certain fluid limit, where the arrival rate and service rate of the eager class are scaled to infinity, holding the offered load constant. Our performance characterizations reveal a (dynamic) pseudo-conservation law that ties the performance of both classes to the standalone blocking probabilities associated with the scheduling policies for the eager class. Furthermore, the performance is robust to other specifics of the scheduling policies. We also characterize the Pareto frontier of the achievable region of performance vectors under the same fluid limit, and identify a (two-parameter) class of Pareto-complete scheduling policies.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135127219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}