In coding-based distributed storage systems (DSSs), a set of storage nodes (SNs) hold coded fragments of a data unit that collectively allow one to recover the original information. It is well known that data modification (a.k.a. pollution attack) is the Achilles' heel of such coding systems; indeed, intentional modification of a single coded fragment can prevent the reconstruction of the original information because of the error propagation induced by the decoding algorithm. The challenge we take on in this work is to devise an algorithm that identifies polluted coded fragments within the set encoding a data unit and to characterize its performance. To this end, we provide the following contributions: (i) we devise MIND (Malicious node IdeNtification in DSS), an algorithm that is general with respect to the encoding mechanism chosen for the DSS, is able to cope with a heterogeneous allocation of coded fragments to SNs, and is effective in identifying polluted coded fragments in a low-redundancy scenario; (ii) we formally prove both the termination and the correctness of MIND; (iii) we derive an accurate analytical characterization of MIND's performance (hit probability and complexity); (iv) we develop a C++ prototype that implements MIND and validates the performance predictions of the analytical model. Finally, to show the applicability of our work, we define performance and robustness metrics for an allocation of coded fragments to SNs and apply the analytical characterization of MIND's performance to select coded fragment allocations that are robust to collusion and yield the highest probability of identifying actual attackers.
{"title":"Malicious Node Identification in Coded Distributed Storage Systems under Pollution Attacks","authors":"R. Gaeta, Marco Grangetto","doi":"10.1145/3491062","DOIUrl":"https://doi.org/10.1145/3491062","url":null,"abstract":"In coding-based distributed storage systems (DSSs), a set of storage nodes (SNs) hold coded fragments of a data unit that collectively allow one to recover the original information. It is well known that data modification (a.k.a. pollution attack) is the Achilles’ heel of such coding systems; indeed, intentional modification of a single coded fragment has the potential to prevent the reconstruction of the original information because of error propagation induced by the decoding algorithm. The challenge we take in this work is to devise an algorithm to identify polluted coded fragments within the set encoding a data unit and to characterize its performance.\u0000 To this end, we provide the following contributions: (i) We devise MIND (Malicious node IdeNtification in DSS), an algorithm that is general with respect to the encoding mechanism chosen for the DSS, it is able to cope with a heterogeneous allocation of coded fragments to SNs, and it is effective in successfully identifying polluted coded fragments in a low-redundancy scenario; (ii) We formally prove both MIND termination and correctness; (iii) We derive an accurate analytical characterization of MIND performance (hit probability and complexity); (iv) We develop a C++ prototype that implements MIND to validate the performance predictions of the analytical model.\u0000 Finally, to show applicability of our work, we define performance and robustness metrics for an allocation of coded fragments to SNs and we apply the results of the analytical characterization of MIND performance to select coded fragments allocations yielding robustness to collusion as well as the highest probability to identify actual attackers.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"28 1","pages":"12:1-12:27"},"PeriodicalIF":0.6,"publicationDate":"2021-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78185062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In data center networks, the reliability of a Service Function Chain (SFC), an end-to-end service presented by a chain of virtual network functions (VNFs), is a complex and specific function of placement, configuration, and application requirements, in both hardware and software. Existing approaches to reliability analysis do not jointly consider multiple features of system components, including (i) heterogeneity, (ii) disjointness, (iii) sharing, (iv) redundancy, and (v) failure interdependency. To this end, we develop a novel analysis of the service reliability of the so-called generic SFC, consisting of n = k + r sub-SFCs, where k ≥ 1 and r ≥ 0 are the numbers of arbitrarily placed primary and backup (redundant) sub-SFCs, respectively. Our analysis is based on combinatorics and a reduced binomial theorem, resulting in a simple approach that can nevertheless be used to analyze rather complex SFC configurations. The analysis is practically applicable to various VNF placement strategies in arbitrary data center configurations and topologies and can be effectively used for the evaluation and optimization of reliable SFC placements.
{"title":"A Combinatorial Reliability Analysis of Generic Service Function Chains in Data Center Networks","authors":"Anna Engelmann, A. Jukan","doi":"10.1145/3477046","DOIUrl":"https://doi.org/10.1145/3477046","url":null,"abstract":"\u0000 In data center networks, the reliability of Service Function Chain (SFC)—an end-to-end service presented by a chain of virtual network functions (VNFs)—is a complex and specific function of placement, configuration, and application requirements, both in hardware and software. Existing approaches to reliability analysis do not jointly consider multiple features of system components, including, (i) heterogeneity, (ii) disjointness, (iii) sharing, (iv) redundancy, and (v) failure interdependency. To this end, we develop a novel analysis of service reliability of the so-called\u0000 generic SFC,\u0000 consisting of\u0000 n\u0000 =\u0000 k\u0000 +\u0000 r\u0000 sub-SFCs, whereby\u0000 k\u0000 ≥ 1 and\u0000 r\u0000 ≥ 0 are the numbers of arbitrary placed primary and backup (redundant) sub-SFCs, respectively. Our analysis is based on combinatorics and a reduced binomial theorem—resulting in a simple approach, which, however, can be utilized to analyze rather complex SFC configurations. The analysis is practically applicable to various VNF placement strategies in arbitrary data center configurations, and topologies and can be effectively used for evaluation and optimization of reliable SFC placements.\u0000","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"35 4 1","pages":"9:1-9:24"},"PeriodicalIF":0.6,"publicationDate":"2021-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77233559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Most IT systems depend on a set of configuration variables (CVs), each expressed as a name/value pair, that collectively define the resource allocation for the system. While the ill effects of misconfiguration or improper resource allocation are well known, there are no effective a priori metrics to quantify the impact of the configuration on desired system attributes such as performance and availability. In this paper, we propose a Configuration Health Index (CHI) framework specifically attuned to the performance attribute to capture the influence of CVs on the performance aspects of the system. We show how CHI, which is defined as a configuration scoring system, can take advantage of domain knowledge and the available (but rather limited) performance data to produce important insights into the configuration settings. We compare CHI with both well-advertised segmented non-linear models and state-of-the-art data-driven models, and show that CHI not only consistently provides better results but also avoids the dangers of a purely data-driven approach, which may predict incorrect behavior or eliminate some essential configuration variables from consideration.
{"title":"Performance Health Index for Complex Cyber Infrastructures","authors":"Sanjeev Sondur, K. Kant","doi":"10.1145/3538646","DOIUrl":"https://doi.org/10.1145/3538646","url":null,"abstract":"Most IT systems depend on a set of configuration variables (CVs), expressed as a name/value pair that collectively defines the resource allocation for the system. While the ill effects of misconfiguration or improper resource allocation are well-known, there are no effective a priori metrics to quantify the impact of the configuration on the desired system attributes such as performance, availability, etc. In this paper, we propose a Configuration Health Index (CHI) framework specifically attuned to the performance attribute to capture the influence of CVs on the performance aspects of the system. We show how CHI, which is defined as a configuration scoring system, can take advantage of the domain knowledge and the available (but rather limited) performance data to produce important insights into the configuration settings. We compare the CHI with both well-advertised segmented non-linear models and state-of-the-art data-driven models, and show that the CHI not only consistently provides better results but also avoids the dangers of a pure data drive approach which may predict incorrect behavior or eliminate some essential configuration variables from consideration.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"7 1","pages":"1 - 32"},"PeriodicalIF":0.6,"publicationDate":"2021-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42536638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Distributed ledgers (DLs) provide many advantages over centralized solutions in Internet of Things (IoT) projects, including but not limited to improved security, transparency, and fault tolerance. To leverage DLs at scale, their well-known limitation, namely performance, must be adequately analyzed and addressed. Directed acyclic graph (DAG)-based DLs have been proposed to tackle the performance and scalability issues by design. The first among them, IOTA, has shown promising signs of addressing these issues. IOTA is an open-source DL designed for the Internet of Things. It uses a directed acyclic graph to store transactions on its ledger and thereby achieves potentially higher scalability than blockchain-based DLs. However, due to the uncertainty and centralization of the deployed consensus, the current IOTA implementation exhibits some performance issues, making it less performant than the initial design. In this article, we first extend an existing simulator to support realistic IOTA simulations and investigate the impact of different design parameters on IOTA's performance. Then, we propose a layered model to help users of IOTA determine the optimal waiting time before resending a previously submitted but not yet confirmed transaction. Our findings reveal the impact of the transaction arrival rate, tip selection algorithms, weighted tip selection algorithm randomness, and network delay on the throughput. Using the proposed layered model, we shed some light on the distribution of the confirmed transactions, which is then leveraged to calculate the optimal time for resending an unconfirmed transaction to the DL. The performance analysis results can be used by both system designers and users to support their decision making.
{"title":"Performance Analysis of the IOTA DAG-Based Distributed Ledger","authors":"Caixiang Fan, Sara Ghaemi, Hamzeh Khazaei, Yuxiang Chen, P. Musílek","doi":"10.7939/R3-W1C1-WT05","DOIUrl":"https://doi.org/10.7939/R3-W1C1-WT05","url":null,"abstract":"Distributed ledgers (DLs) provide many advantages over centralized solutions in Internet of Things projects, including but not limited to improved security, transparency, and fault tolerance. To leverage DLs at scale, their well-known limitation (i.e., performance) should be adequately analyzed and addressed. Directed acyclic graph-based DLs have been proposed to tackle the performance and scalability issues by design. The first among them, IOTA, has shown promising signs in addressing the preceding issues. IOTA is an open source DL designed for the Internet of Things. It uses a directed acyclic graph to store transactions on its ledger, to achieve a potentially higher scalability over blockchain-based DLs. However, due to the uncertainty and centralization of the deployed consensus, the current IOTA implementation exposes some performance issues, making it less performant than the initial design. In this article, we first extend an existing simulator to support realistic IOTA simulations and investigate the impact of different design parameters on IOTA’s performance. Then, we propose a layered model to help the users of IOTA determine the optimal waiting time to resend the previously submitted but not yet confirmed transaction. Our findings reveal the impact of the transaction arrival rate, tip selection algorithms, weighted tip selection algorithm randomness, and network delay on the throughput. Using the proposed layered model, we shed some light on the distribution of the confirmed transactions. The distribution is leveraged to calculate the optimal time for resending an unconfirmed transaction to the DL. The performance analysis results can be used by both system designers and users to support their decision making.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"6 1","pages":"10:1-10:20"},"PeriodicalIF":0.6,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86032559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Continuous enhancements and diversity in modern multi-core hardware, such as wider and deeper core pipelines and memory subsystems, pose a set of hard-to-solve challenges when modeling their upper-bound capabilities and identifying the main application bottlenecks. Insightful roofline models are widely used for this purpose, but existing approaches overly abstract the micro-architecture complexity, thus providing unrealistic performance bounds that lead to a misleading characterization of real-world applications. To address this problem, the Mansard Roofline Model (MaRM), proposed in this work, uncovers a minimum set of architectural features that must be considered to provide insightful, yet accurate and realistic, modeling of performance upper bounds for modern processors. By encapsulating the retirement constraints due to the number of retirement slots and the Reorder Buffer and Physical Register File sizes, the proposed model accurately captures the capabilities of a real platform (average rRMSE of 5.4%) and characterizes 12 application kernels from standard benchmark suites. By following the MaRM interpretation methodology and guidelines proposed herein, speedups of up to 5× are obtained when optimizing a real-world bioinformatics application, as well as a super-linear speedup of 18.5× when it is parallelized.
{"title":"Mansard Roofline Model: Reinforcing the Accuracy of the Roofs","authors":"Diogo Marques, A. Ilic, L. Sousa","doi":"10.1145/3475866","DOIUrl":"https://doi.org/10.1145/3475866","url":null,"abstract":"Continuous enhancements and diversity in modern multi-core hardware, such as wider and deeper core pipelines and memory subsystems, bring to practice a set of hard-to-solve challenges when modeling their upper-bound capabilities and identifying the main application bottlenecks. Insightful roofline models are widely used for this purpose, but the existing approaches overly abstract the micro-architecture complexity, thus providing unrealistic performance bounds that lead to a misleading characterization of real-world applications. To address this problem, the Mansard Roofline Model (MaRM), proposed in this work, uncovers a minimum set of architectural features that must be considered to provide insightful, but yet accurate and realistic, modeling of performance upper bounds for modern processors. By encapsulating the retirement constraints due to the amount of retirement slots, Reorder-Buffer and Physical Register File sizes, the proposed model accurately models the capabilities of a real platform (average rRMSE of 5.4%) and characterizes 12 application kernels from standard benchmark suites. By following a herein proposed MaRM interpretation methodology and guidelines, speed-ups of up to 5× are obtained when optimizing real-world bioinformatic application, as well as a super-linear speedup of 18.5× when parallelized.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"6 1","pages":"1 - 23"},"PeriodicalIF":0.6,"publicationDate":"2021-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44571161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Randomized work stealing is used in distributed systems to increase performance and improve resource utilization. In this article, we consider randomized work stealing in a large system of homogeneous processors where parent jobs spawn child jobs that can feasibly be executed in parallel with the parent job. We analyse the performance of two work stealing strategies: one where only child jobs can be transferred across servers and the other where parent jobs are transferred. We define a mean-field model to derive the response time distribution in a large-scale system with Poisson arrivals and exponential parent and child job durations. We prove that the model has a unique fixed point that corresponds to the steady state of a structured Markov chain, allowing us to use matrix analytic methods to compute the unique fixed point. The accuracy of the mean-field model is validated using simulation. Using numerical examples, we illustrate the effect of different probe rates, load, and different child job size distributions on performance with respect to the two stealing strategies, individually, and compared to each other.
{"title":"Performance Analysis of Work Stealing in Large-scale Multithreaded Computing","authors":"Nikki Sonenberg, Grzegorz Kielanski, B. Van Houdt","doi":"10.1145/3470887","DOIUrl":"https://doi.org/10.1145/3470887","url":null,"abstract":"Randomized work stealing is used in distributed systems to increase performance and improve resource utilization. In this article, we consider randomized work stealing in a large system of homogeneous processors where parent jobs spawn child jobs that can feasibly be executed in parallel with the parent job. We analyse the performance of two work stealing strategies: one where only child jobs can be transferred across servers and the other where parent jobs are transferred. We define a mean-field model to derive the response time distribution in a large-scale system with Poisson arrivals and exponential parent and child job durations. We prove that the model has a unique fixed point that corresponds to the steady state of a structured Markov chain, allowing us to use matrix analytic methods to compute the unique fixed point. The accuracy of the mean-field model is validated using simulation. Using numerical examples, we illustrate the effect of different probe rates, load, and different child job size distributions on performance with respect to the two stealing strategies, individually, and compared to each other.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"6 1","pages":"1 - 28"},"PeriodicalIF":0.6,"publicationDate":"2021-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41749830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Caching systems have long been crucial for improving the performance of a wide variety of network and web-based online applications. In such systems, end-to-end application performance heavily depends on the fraction of objects transferred from the cache, also known as the cache hit probability. Many caching policies have been proposed and implemented to improve the hit probability. In this work, we propose a new method to compute an upper bound on the hit probability for all non-anticipative caching policies, that is, policies that have no knowledge of future requests. Our key insight is to order the objects according to the ratio of their Hazard Rate (HR) function values to their sizes and to place in the cache the objects with the largest ratios until the cache capacity is exhausted. When object request processes are conditionally independent, we prove that the cache allocation based on this HR-to-size ratio rule guarantees the maximum achievable expected number of object hits across all non-anticipative caching policies. Further, the HR ordering rule yields an upper bound on the cache hit probability when object request processes follow either an independent delayed renewal process or a Markov-modulated Poisson process. We also derive closed-form expressions for the upper bound under some specific object request arrival processes. We provide simulation results to validate its correctness and to compare it to state-of-the-art upper bounds, such as the one produced by Bélády's algorithm. We find it to be tighter than state-of-the-art upper bounds for some specific object request arrival processes, such as independent renewal, Markov-modulated, and shot noise processes.
{"title":"A New Upper Bound on Cache Hit Probability for Non-Anticipative Caching Policies","authors":"Nitish K. Panigrahy, P. Nain, G. Neglia, D. Towsley","doi":"10.1145/3547332","DOIUrl":"https://doi.org/10.1145/3547332","url":null,"abstract":"Caching systems have long been crucial for improving the performance of a wide variety of network and web-based online applications. In such systems, end-to-end application performance heavily depends on the fraction of objects transferred from the cache, also known as the cache hit probability. Many caching policies have been proposed and implemented to improve the hit probability. In this work, we propose a new method to compute an upper bound on hit probability for all non-anticipative caching policies and for policies that have no knowledge of future requests. Our key insight is to order the objects according to the ratio of their Hazard Rate(HR) function values to their sizes, and place in the cache the objects with the largest ratios till the cache capacity is exhausted. When object request processes are conditionally independent, we prove that this cache allocation based on the HR-to-size ratio rule guarantees the maximum achievable expected number of object hits across all non-anticipative caching policies. Further, the HR ordering rule serves as an upper bound on cache hit probability when object request processes follow either independent delayed renewal process or a Markov modulated Poisson process. We also derive closed form expressions for the upper bound under some specific object request arrival processes. We provide simulation results to validate its correctness and to compare it to the state-of-the-art upper bounds, such as produced by Bélády’s algorithm. We find it to be tighter than state-of-the-art upper bounds for some specific object request arrival processes such as independent renewal, Markov modulated, and shot noise processes.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"7 1","pages":"1 - 24"},"PeriodicalIF":0.6,"publicationDate":"2021-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49153406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the advent of the Internet of Things (IoT), applications are becoming increasingly dependent on networks not only to transmit content at high throughput but also to deliver it when it is fresh, i.e., synchronized between source and destination. Existing studies have proposed the metric age of information (AoI) to quantify freshness and have presented system designs that achieve low AoI. However, despite active research in this area, existing results are not applicable to general wired networks for two reasons. First, they focus on wireless settings, where AoI is mostly affected by interference and collisions, while queueing issues are more prevalent in wired settings. Second, traditional high-throughput/low-latency legacy drop-adverse (LDA) flows are not taken into account in most system designs; hence, the problem of scheduling mixed flows with distinct performance objectives is not addressed. In this article, we propose a hierarchical system design to handle wired networks shared by mixed flow traffic, specifically LDA and AoI flows, and study the characteristics of achieving a good tradeoff between throughput and AoI. Our approach consists of two layers: freshness-aware traffic engineering (FATE) and in-network freshness control (IFC). The centralized FATE solution studies the characteristics of the source flows to derive the sending rate/update frequency for flows via the optimization problem LDA-AoI Coscheduling. The parameters specified by FATE are then distributed to IFC, which is implemented at each output port of the network's nodes and used for efficient scheduling between LDA and AoI flows. We present a Linux implementation of IFC and demonstrate the effectiveness of FATE/IFC through extensive emulations. Our results show that it is possible to trade a little throughput (5% lower) for much shorter AoI (49% to 71% shorter) compared to state-of-the-art traffic engineering.
{"title":"Trading Throughput for Freshness: Freshness-aware Traffic Engineering and In-Network Freshness Control","authors":"Shih-Hao Tseng, Soojean Han, A. Wierman","doi":"10.1145/3576919","DOIUrl":"https://doi.org/10.1145/3576919","url":null,"abstract":"With the advent of the Internet of Things (IoT), applications are becoming increasingly dependent on networks to not only transmit content at high throughput but also deliver it when it is fresh, i.e., synchronized between source and destination. Existing studies have proposed the metric age of information (AoI) to quantify freshness and have system designs that achieve low AoI. However, despite active research in this area, existing results are not applicable to general wired networks for two reasons. First, they focus on wireless settings, where AoI is mostly affected by interference and collision, while queueing issues are more prevalent in wired settings. Second, traditional high-throughput/low-latency legacy drop-adverse (LDA) flows are not taken into account in most system designs; hence, the problem of scheduling mixed flows with distinct performance objectives is not addressed. In this article, we propose a hierarchical system design to treat wired networks shared by mixed flow traffic, specifically LDA and AoI flows, and study the characteristics of achieving a good tradeoff between throughput and AoI. Our approach to the problem consists of two layers: freshness-aware traffic engineering (FATE) and in-network freshness control (IFC). The centralized FATE solution studies the characteristics of the source flow to derive the sending rate/update frequency for flows via the optimization problem LDA-AoI Coscheduling. The parameters specified by FATE are then distributed to IFC, which is implemented at each outport of the network’s nodes and used for efficient scheduling between LDA and AoI flows. We present a Linux implementation of IFC and demonstrate the effectiveness of FATE/IFC through extensive emulations. Our results show that it is possible to trade a little throughput (5% lower) for much shorter AoI (49% to 71% shorter) compared to state-of-the-art traffic engineering.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"8 1","pages":"1 - 26"},"PeriodicalIF":0.6,"publicationDate":"2021-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44252251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As video-streaming services have expanded and improved, cloud-based video has evolved into a necessary feature of any successful business for reaching internal and external audiences. In this article, video streaming over distributed storage is considered, where the video segments are encoded using an erasure code for better reliability. We consider a representative system architecture for a realistic (typical) content delivery network (CDN). Given multiple parallel streams/links between each server and the edge router, we need to determine, for each client request, the subset of servers from which to stream the video, as well as one of the parallel streams from each chosen server. To perform this scheduling, this article proposes a two-stage probabilistic scheduling approach. The video quality is also selected with a certain probability distribution that is optimized in our algorithm. With these parameters, the playback time of video segments is determined by characterizing the download time of each coded chunk of each video segment. Using the playback times, a bound on the moment generating function of the stall duration is used to bound the mean stall duration. Based on this, we formulate an optimization problem to jointly optimize a convex combination of the mean stall duration and the average video quality for all requests, where the two-stage probabilistic scheduling, video quality selection, bandwidth split among parallel streams, and auxiliary bound parameters can be chosen. This non-convex problem is solved using an efficient iterative algorithm. Based on the offline version of our proposed algorithm, an online policy is developed in which server selection, quality, bandwidth split, and parallel streams are selected in an online manner. Experimental results show significant improvement in QoE metrics for cloud-based video compared to the considered baselines.
{"title":"VidCloud: Joint Stall and Quality Optimization for Video Streaming over Cloud","authors":"A. Al-Abbasi, V. Aggarwal","doi":"10.1145/3442187","DOIUrl":"https://doi.org/10.1145/3442187","url":null,"abstract":"As video-streaming services have expanded and improved, cloud-based video has evolved into a necessary feature of any successful business for reaching internal and external audiences. In this article, video streaming over distributed storage is considered where the video segments are encoded using an erasure code for better reliability. We consider a representative system architecture for a realistic (typical) content delivery network (CDN). Given multiple parallel streams/link between each server and the edge router, we need to determine, for each client request, the subset of servers to stream the video, as well as one of the parallel streams from each chosen server. To have this scheduling, this article proposes a two-stage probabilistic scheduling. The selection of video quality is also chosen with a certain probability distribution that is optimized in our algorithm. With these parameters, the playback time of video segments is determined by characterizing the download time of each coded chunk for each video segment. Using the playback times, a bound on the moment generating function of the stall duration is used to bound the mean stall duration. Based on this, we formulate an optimization problem to jointly optimize the convex combination of mean stall duration and average video quality for all requests, where the two-stage probabilistic scheduling, video quality selection, bandwidth split among parallel streams, and auxiliary bound parameters can be chosen. This non-convex problem is solved using an efficient iterative algorithm. Based on the offline version of our proposed algorithm, an online policy is developed where servers selection, quality, bandwidth split, and parallel streams are selected in an online manner. Experimental results show significant improvement in QoE metrics for cloud-based video as compared to the considered baselines.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"97 1","pages":"17:1-17:32"},"PeriodicalIF":0.6,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91022446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the rapid advance of information technology, network systems have become increasingly complex and hence the underlying system dynamics are often unknown or difficult to characterize. Finding a good network control policy is of significant importance to achieve desirable network performance (e.g., high throughput or low delay). In this work, we consider using model-based reinforcement learning (RL) to learn the optimal control policy for queueing networks so that the average job delay (or equivalently the average queue backlog) is minimized. Traditional approaches in RL, however, cannot handle the unbounded state spaces of the network control problem. To overcome this difficulty, we propose a new algorithm, called RL for Queueing Networks (RL-QN), which applies model-based RL methods over a finite subset of the state space while applying a known stabilizing policy for the rest of the states. We establish that the average queue backlog under RL-QN with an appropriately constructed subset can be arbitrarily close to the optimal result. We evaluate RL-QN in dynamic server allocation, routing, and switching problems. Simulation results show that RL-QN minimizes the average queue backlog effectively.
{"title":"RL-QN: A Reinforcement Learning Framework for Optimal Control of Queueing Systems","authors":"Bai Liu, Qiaomin Xie, E. Modiano","doi":"10.1145/3529375","DOIUrl":"https://doi.org/10.1145/3529375","url":null,"abstract":"With the rapid advance of information technology, network systems have become increasingly complex and hence the underlying system dynamics are often unknown or difficult to characterize. Finding a good network control policy is of significant importance to achieve desirable network performance (e.g., high throughput or low delay). In this work, we consider using model-based reinforcement learning (RL) to learn the optimal control policy for queueing networks so that the average job delay (or equivalently the average queue backlog) is minimized. Traditional approaches in RL, however, cannot handle the unbounded state spaces of the network control problem. To overcome this difficulty, we propose a new algorithm, called RL for Queueing Networks (RL-QN), which applies model-based RL methods over a finite subset of the state space while applying a known stabilizing policy for the rest of the states. We establish that the average queue backlog under RL-QN with an appropriately constructed subset can be arbitrarily close to the optimal result. We evaluate RL-QN in dynamic server allocation, routing, and switching problems. Simulation results show that RL-QN minimizes the average queue backlog effectively.","PeriodicalId":56350,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems","volume":"7 1","pages":"1 - 35"},"PeriodicalIF":0.6,"publicationDate":"2020-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45155339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}