Many cloud service providers (CSPs) offer an on-demand service with a small delay. Motivated by the reality of cloud ecosystems, we study non-interruptible services and consider a differentiated service model that complements the existing market by offering multiple service level agreements (SLAs) to satisfy users with different delay tolerances. The model is incentive compatible by construction. We consider two typical architectures for fulfilling SLAs: (i) non-preemptive priority queues and (ii) multiple independent groups of servers. We leverage queueing theory to establish guidelines for the resulting market: (a) under the first architecture, the service model improves revenue only marginally over the pure on-demand service model, and (b) under the second architecture, we give a closed-form expression for the revenue improvement when a CSP offers two SLAs and derive a condition under which the market is viable. Additionally, under the second architecture, we give an exhaustive search procedure that finds the optimal SLA delays and prices in the general case where a CSP offers multiple SLAs. Numerical results show that the revenue improvement can be significant even when only two SLAs are offered. Our results can help CSPs design optimal delay-differentiated services and choose appropriate serving architectures.
Xiaohu Wu, Francesco De Pellegrini, G. Casale. "Delay and Price Differentiation in Cloud Computing: A Service Model, Supporting Architectures, and Performance." ACM Transactions on Modeling and Performance Evaluation of Computing Systems 8(1), pp. 1-40. Publication date: 2020-07-30. DOI: https://doi.org/10.1145/3592852
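For intuition about architecture (i), the mean queueing delay of each class in a non-preemptive priority queue has a classical closed form. The sketch below is a minimal illustration that assumes a single server with Poisson arrivals and exponential service times; this simplification is an assumption made here and is not necessarily the paper's model. The function name `priority_waits` and its parameters are hypothetical.

```python
from typing import List


def priority_waits(lams: List[float], mus: List[float]) -> List[float]:
    """Mean waiting time in queue for each class of a single-server,
    non-preemptive priority queue (class 0 has the highest priority).

    Assumptions (illustrative only): Poisson arrivals with rates `lams`
    and exponential service with rates `mus`.
    """
    rhos = [lam / mu for lam, mu in zip(lams, mus)]
    if sum(rhos) >= 1.0:
        raise ValueError("total load must be below 1 for stability")

    # Mean residual work found by an arriving job: sum_i lam_i * E[S_i^2] / 2.
    # For exponential service, E[S_i^2] = 2 / mu_i^2.
    w0 = sum(lam / mu ** 2 for lam, mu in zip(lams, mus))

    waits = []
    sigma_prev = 0.0  # cumulative load of strictly higher-priority classes
    for rho in rhos:
        sigma = sigma_prev + rho
        waits.append(w0 / ((1.0 - sigma_prev) * (1.0 - sigma)))
        sigma_prev = sigma
    return waits


if __name__ == "__main__":
    # Two delay classes sharing one server; class 0 is the low-delay SLA.
    print(priority_waits([0.3, 0.3], [1.0, 1.0]))
```

When all classes share the same service rate, the arrival-weighted average of these waits equals the first-in first-out waiting time at the same total load. Priorities redistribute delay among classes rather than reduce it.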
Consider a setting where Willie generates a Poisson stream of jobs and routes them to a single server that follows the first-in first-out discipline. Suppose there is an adversary, Alice, who wants to receive service without being detected. We ask: how many jobs can she receive covertly, i.e., without being detected by Willie? In the case where Willie's and Alice's jobs have exponential service times with respective rates μ1 and μ2, we demonstrate a phase transition when Alice adopts the strategy of inserting a single job probabilistically when the server idles: over n busy periods, she can achieve a covert throughput, measured by the expected number of jobs covertly inserted, of O(√n) when μ1 < 2μ2, O(√n log n) when μ1 = 2μ2, and O(n^(μ2/μ1)) when μ1 > 2μ2. When Willie's and Alice's jobs have general service times, we establish an upper bound on the number of jobs Alice can execute covertly. This bound is related to the Fisher information. More general insertion policies are also discussed.
Bo Jiang, P. Nain, D. Towsley. "Covert Cycle Stealing in a Single FIFO Server." ACM Transactions on Modeling and Performance Evaluation of Computing Systems 6(1), pp. 1-33. Publication date: 2020-03-11. DOI: https://doi.org/10.1145/3462774
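The phase transition in the covert-throughput abstract can be summarized by the power of n that governs its growth. The sketch below is only an illustration: it reads the third bound as n raised to the power μ2/μ1 (an assumption about the intended notation), ignores the logarithmic correction at the boundary μ1 = 2μ2, and uses hypothetical function names.

```python
import math


def covert_exponent(mu1: float, mu2: float) -> float:
    """Exponent of n in the covert-throughput bound, ignoring log factors.

    Assumes the three regimes are n^(1/2) for mu1 < 2*mu2, n^(1/2) up to a
    logarithmic factor for mu1 = 2*mu2, and n^(mu2/mu1) for mu1 > 2*mu2.
    """
    if mu1 <= 0 or mu2 <= 0:
        raise ValueError("service rates must be positive")
    return min(0.5, mu2 / mu1)


def regime(mu1: float, mu2: float) -> str:
    """Name the regime that a pair of service rates falls into."""
    if math.isclose(mu1, 2 * mu2):
        return "boundary"
    return "square-root" if mu1 < 2 * mu2 else "sub-square-root"


if __name__ == "__main__":
    for mu1, mu2 in [(1.0, 1.0), (2.0, 1.0), (4.0, 1.0)]:
        print(mu1, mu2, regime(mu1, mu2), covert_exponent(mu1, mu2))
```

Under this reading, the exponent is continuous in μ1/μ2 and drops below 1/2 once Willie's jobs are served more than twice as fast as Alice's.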
V. Ch, L. Narayana, Sharayu Moharir, N. Karamchandani
The rapid proliferation of shared edge computing platforms has enabled application service providers to deploy a wide variety of services with stringent latency and high bandwidth requirements. A key advantage of these platforms is that they provide pay-as-you-go flexibility by charging clients in proportion to their resource usage through short-term contracts. This gives the client significant cost-saving opportunities: it can decide dynamically when to host its service on the platform, depending on the changing intensity of requests. A natural policy for our setting is the Time-To-Live (TTL) policy. We show that TTL performs poorly both in the adversarial arrival setting, i.e., in terms of the competitive ratio, and for i.i.d. stochastic arrivals with low arrival rates, irrespective of the value of the TTL timer. We propose an online policy called RetroRenting (RR) and characterize its performance in terms of the competitive ratio. Our results show that RR overcomes the limitations of TTL. In addition, we provide performance guarantees for RR for i.i.d. stochastic arrival processes coupled with negatively associated rent cost sequences and prove that it compares well with the optimal online policy. Further, we conduct simulations using both synthetic and real-world traces to compare the performance of RR with that of the optimal offline and online policies. The simulations show that the performance of RR is near optimal in all settings considered. Our results illustrate the universality of RR.
V. Ch, L. Narayana, Sharayu Moharir, N. Karamchandani. "On Renting Edge Resources for Service Hosting." ACM Transactions on Modeling and Performance Evaluation of Computing Systems 6(1), pp. 1-30. Publication date: 2019-12-24. DOI: https://doi.org/10.1145/3478433
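To make the TTL policy concrete, the sketch below evaluates its cost on a slotted request trace. The cost model is an assumption made for illustration and is not taken from the paper: a per-slot rent while the service is hosted, a forwarding cost for a request that arrives while the service is not hosted, and a one-time fetch cost to start hosting. The name `ttl_cost` and all parameter values are hypothetical.

```python
from typing import Sequence


def ttl_cost(requests: Sequence[int], ttl: int,
             rent: float = 1.0, forward: float = 2.0,
             fetch: float = 5.0) -> float:
    """Total cost of a TTL hosting policy on a slotted request trace.

    requests[t] is 1 if a request arrives in slot t, else 0.
    Hypothetical cost model: `rent` per hosted slot, `forward` per request
    served while not hosted, `fetch` each time hosting starts.
    """
    total = 0.0
    timer = 0  # remaining slots for which the service stays hosted
    for r in requests:
        if timer > 0:
            total += rent
            # A request refreshes the timer; otherwise it counts down.
            timer = ttl if r else timer - 1
        elif r:
            total += forward  # the request is served elsewhere
            if ttl > 0:
                total += fetch  # start hosting for the next `ttl` slots
                timer = ttl
    return total


if __name__ == "__main__":
    sparse = [1, 0, 0, 1]
    print("TTL=2:", ttl_cost(sparse, ttl=2))
    print("never host (TTL=0):", ttl_cost(sparse, ttl=0))
```

On sparse traces like the one above, every isolated request triggers a fetch plus rent for slots that see no further requests. This is consistent with the abstract's observation that TTL performs poorly at low arrival rates.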
Alireza Namazi, S. Safari, S. Mohammadi, Meisam Abdollahi
This article proposes a Semi Online Reliable Task (SORT) mapping approach for many-core platforms, divided into two sections: offline and online. The offline section is twofold: it maintains the reliability of the mapped task graph against soft errors with respect to the reliability threshold defined by designers, and, because wear-out mechanisms decrease the lifetime of the system, it increases the lifetime of the system using task migration scenarios. It specifies task migration plans with minimum overhead using a novel heuristic approach. SORT maintains the required level of reliability of the task graph over the whole lifetime of the system using a replication technique with minimum replica overhead, maximum achievable performance, and minimum temperature increase. The online section uses the migration plans obtained in the offline section to increase the lifetime and also continuously maintains the reliability threshold for the task graph at runtime. Results show that the effectiveness of SORT improves with larger mesh sizes and higher reliability thresholds. Simulation results obtained from real benchmarks show that the proposed approach decreases design-time calculation by up to 4,371% compared to exhaustive exploration while achieving a lifetime negligibly lower than that of the exhaustive solution (by up to 5.83%).
Alireza Namazi, S. Safari, S. Mohammadi, Meisam Abdollahi. "SORT: Semi Online Reliable Task Mapping for Embedded Multi-Core Systems." ACM Transactions on Modeling and Performance Evaluation of Computing Systems 31(1), pp. 11:1-11:25. Publication date: 2019-06-14. DOI: https://doi.org/10.1145/3322899
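The link between a reliability threshold and replica overhead can be illustrated with a textbook calculation. The sketch below is not the SORT heuristic. It assumes that replicas of a task fail independently and that the task succeeds if at least one replica does; both are assumptions made here. The function name `min_replicas` is hypothetical.

```python
def min_replicas(task_reliability: float, threshold: float,
                 max_replicas: int = 64) -> int:
    """Smallest number of replicas k with 1 - (1 - r)^k >= threshold.

    Illustrative assumptions: independent replica failures, and one
    surviving replica is enough for the task to succeed.
    """
    if not (0.0 < task_reliability < 1.0) or not (0.0 < threshold < 1.0):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    fail = 1.0 - task_reliability
    for k in range(1, max_replicas + 1):
        if 1.0 - fail ** k >= threshold:
            return k
    raise ValueError("threshold not reachable within max_replicas")


if __name__ == "__main__":
    # A task that survives soft errors with probability 0.9 per execution.
    print(min_replicas(0.9, 0.995))
```

Under these assumptions, a tighter threshold raises the replica count only logarithmically in the allowed failure probability.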
We study a quantum entanglement distribution switch serving a set of users in a star topology with equal-length links. The quantum switch, much like a quantum repeater, can perform entanglement swapping to extend entanglement across longer distances. Additionally, the switch is equipped with entanglement switching logic, enabling it to implement switching policies that better serve the needs of the network. In this work, the function of the switch is to create bipartite or tripartite entangled states among users at the highest possible rates at a fixed ratio. Using Markov chains, we model a set of randomized switching policies. Some policies outperform others: we present analytical results for the case where the switch stores one qubit per user and find that the best policies outperform a time-division multiplexing policy for sharing the switch between bipartite and tripartite state generation. This performance improvement decreases as the number of users grows. The model is easily augmented to study the capacity region in the presence of quantum state decoherence and associated cut-off times for qubit storage, and the results are similar. Moreover, decoherence-associated quantum storage cut-off times appear to have little effect on capacity in our identical-link system. We also study a smaller class of policies for the case where the switch stores two qubits per user.
Gayane Vardoyan, P. Nain, S. Guha, D. Towsley. "On the Capacity Region of Bipartite and Tripartite Entanglement Switching." ACM Transactions on Modeling and Performance Evaluation of Computing Systems 8(1), pp. 1-18. Publication date: 2019-01-21. DOI: https://doi.org/10.1145/3571809
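The time-division multiplexing (TDM) baseline mentioned in the abstract can be sketched in a few lines. The code assumes the switch is dedicated to bipartite generation for a fraction t of the time and to tripartite generation otherwise, so the achieved rates are t·c2 and (1 − t)·c3. The standalone rates `c2` and `c3` are hypothetical inputs, not values from the paper, and the function name `tdm_split` is also an assumption.

```python
from typing import Tuple


def tdm_split(c2: float, c3: float, ratio: float) -> Tuple[float, float, float]:
    """Time share and rates under TDM for a target rate ratio r2 / r3.

    c2, c3: rates achieved when the switch serves only bipartite or only
            tripartite requests (hypothetical inputs).
    ratio:  required ratio of bipartite to tripartite rate.
    Returns (t, r2, r3), where t is the fraction of time spent on
    bipartite generation.
    """
    if c2 <= 0 or c3 <= 0 or ratio <= 0:
        raise ValueError("rates and ratio must be positive")
    # Solve t * c2 = ratio * (1 - t) * c3 for t.
    t = ratio * c3 / (c2 + ratio * c3)
    return t, t * c2, (1.0 - t) * c3


if __name__ == "__main__":
    t, r2, r3 = tdm_split(c2=4.0, c3=2.0, ratio=2.0)
    print(f"time share {t:.2f}, bipartite {r2:.2f}, tripartite {r3:.2f}")
```

The rate pairs reachable by TDM lie on the straight line between (c2, 0) and (0, c3). A policy that outperforms TDM achieves a point above that line for the same ratio.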