Our aim is to estimate the largest community (a.k.a. the mode) in a population composed of multiple disjoint communities. This estimation is performed in a fixed confidence setting via sequential sampling of individuals with replacement. We consider two sampling models: (i) an identityless model, wherein only the community of each sampled individual is revealed, and (ii) an identity-based model, wherein the learner is able to discern whether or not each sampled individual has been sampled before, in addition to the community of that individual. The former model corresponds to the classical problem of identifying the mode of a discrete distribution, whereas the latter seeks to capture the utility of identity information in mode estimation. For each of these models, we establish information-theoretic lower bounds on the expected number of samples needed to meet the prescribed confidence level, and propose sound algorithms whose sample complexity is provably asymptotically optimal. Our analysis highlights that identity information can indeed be utilized to improve the efficiency of community mode estimation.
{"title":"Fixed confidence community mode estimation","authors":"Meera Pai, Nikhil Karamchandani, Jayakrishnan Nair","doi":"10.1016/j.peva.2023.102379","DOIUrl":"https://doi.org/10.1016/j.peva.2023.102379","url":null,"abstract":"<div><p>Our aim is to estimate the largest community (a.k.a., mode) in a population composed of multiple disjoint communities. This estimation is performed in a fixed confidence setting via sequential sampling of individuals with replacement. We consider two sampling models: (i) an identityless model, wherein only the community of each sampled individual is revealed, and (ii) an identity-based model, wherein the learner is able to discern whether or not each sampled individual has been sampled before, in addition to the community of that individual. The former model corresponds to the classical problem of identifying the mode of a discrete distribution, whereas the latter seeks to capture the utility of identity information in mode estimation. For each of these models, we establish information theoretic lower bounds on the expected number of samples needed to meet the prescribed confidence level, and propose sound algorithms with a sample complexity that is provably asymptotically optimal. Our analysis highlights that identity information can indeed be utilized to improve the efficiency of community mode estimation.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"162 ","pages":"Article 102379"},"PeriodicalIF":2.2,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"92025544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-11-01 | DOI: 10.1016/j.peva.2023.102372
Khushboo Agarwal, Veeraruna Kavitha
The viral propagation of fake posts on online social networks (OSNs) has become an alarming concern. The paper aims to design control mechanisms for fake post detection while negligibly affecting the propagation of real posts. To this end, a warning mechanism based on crowd-signals was recently proposed, where all users actively declare the post as real or fake. In this paper, we consider a more realistic framework where users exhibit different adversarial or non-cooperative behaviour: (i) they can independently decide whether to provide their response, (ii) they can choose not to consider the warning signal while providing the response, and (iii) they can be real-coloring adversaries who deliberately declare any post as real. To analyse the post-propagation process in this complex system, we propose and study a new branching process, namely the total-current population-dependent branching process with multiple death types. First, we show that the existing warning mechanism significantly under-performs in the presence of adversaries. Then, we design new mechanisms that perform markedly better than the existing mechanism by cleverly eliminating the influence of the adversaries' responses. Finally, we propose another enhanced mechanism that assumes minimal knowledge about the user-specific parameters. The theoretical results are validated using Monte Carlo simulations.
{"title":"Robust fake-post detection against real-coloring adversaries","authors":"Khushboo Agarwal , Veeraruna Kavitha","doi":"10.1016/j.peva.2023.102372","DOIUrl":"https://doi.org/10.1016/j.peva.2023.102372","url":null,"abstract":"<div><p>The viral propagation of fake posts on online social networks (OSNs) has become an alarming concern. The paper aims to design control mechanisms for fake post detection while negligibly affecting the propagation of real posts. Towards this, a warning mechanism based on crowd-signals was recently proposed, where all users actively declare the post as real or fake. In this paper, we consider a more realistic framework where users exhibit different adversarial or non-cooperative behaviour: (i) they can independently decide whether to provide their response, (ii) they can choose not to consider the warning signal while providing the response, and (iii) they can be real-coloring adversaries who deliberately declare any post as real. To analyse the post-propagation process in this complex system, we propose and study a new branching process, namely total-current population-dependent branching process with multiple death types. At first, we compare and show that the existing warning mechanism significantly under-performs in the presence of adversaries. Then, we design new mechanisms which remarkably perform better than the existing mechanism by cleverly eliminating the influence of the responses of the adversaries. Finally, we propose another enhanced mechanism which assumes minimal knowledge about the user-specific parameters. The theoretical results are validated using Monte-Carlo simulations.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"162 ","pages":"Article 102372"},"PeriodicalIF":2.2,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91959209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-11-01 | DOI: 10.1016/j.peva.2023.102369
Xiaoding Guan, Noman Bashir, David Irwin, Prashant Shenoy
Datacenter capacity is growing exponentially to satisfy the increasing demand for many emerging computationally-intensive applications, such as deep learning. This trend has led to concerns over datacenters’ increasing energy consumption and carbon footprint. The most basic prerequisite for optimizing a datacenter’s energy- and carbon-efficiency is accurately monitoring and attributing energy consumption to specific users and applications. Since datacenter servers tend to be multi-tenant, i.e., they host many applications, server- and rack-level power monitoring alone does not provide insight into the energy usage and carbon emissions of their resident applications. At the same time, current application-level energy monitoring and attribution techniques are intrusive: they require privileged access to servers and necessitate coordinated support in hardware and software, neither of which is always possible in cloud environments. To address the problem, we design WattScope, a system for non-intrusively estimating the power consumption of individual applications using external measurements of a server’s aggregate power usage and without requiring direct access to the server’s operating system or applications. Our key insight is that, based on an analysis of production traces, the power characteristics of datacenter workloads, e.g., low variability, low magnitude, and high periodicity, are highly amenable to disaggregation of a server’s total power consumption into application-specific values. WattScope adapts and extends a machine learning-based technique for disaggregating building power and applies it to server- and rack-level power meter measurements that are already available in data centers. We evaluate WattScope’s accuracy on a production workload and show that it yields high accuracy, e.g., often below roughly 10% normalized mean absolute error, and is thus a potentially useful tool for datacenters in externally monitoring application-level power usage.
{"title":"WattScope: Non-intrusive application-level power disaggregation in datacenters","authors":"Xiaoding Guan, Noman Bashir, David Irwin, Prashant Shenoy","doi":"10.1016/j.peva.2023.102369","DOIUrl":"https://doi.org/10.1016/j.peva.2023.102369","url":null,"abstract":"<div><p><span>Datacenter capacity is growing exponentially to satisfy the increasing demand for many emerging computationally-intensive applications, such as deep learning. This trend has led to concerns over datacenters’ increasing energy consumption and carbon footprint. The most basic prerequisite for optimizing a datacenter’s energy- and carbon-efficiency is accurately monitoring and attributing energy consumption to specific users and applications. Since datacenter servers tend to be multi-tenant, i.e., they host many applications, server- and rack-level power monitoring alone does not provide insight into the energy usage and carbon emissions of their resident applications. At the same time, current application-level energy monitoring and attribution techniques are </span><em>intrusive</em>: they require privileged access to servers and necessitate coordinated support in hardware and software, neither of which is always possible in cloud environments. To address the problem, we design <span>WattScope</span>, a system for non-intrusively estimating the power consumption of individual applications using external measurements of a server’s aggregate power usage and without requiring direct access to the server’s operating system or applications. Our key insight is that, based on an analysis of production traces, the power characteristics of datacenter workloads, e.g., low variability, low magnitude, and high periodicity, are highly amenable to disaggregation of a server’s total power consumption into application-specific values. <span>WattScope</span><span> adapts and extends a machine learning-based technique for disaggregating building power and applies it to server- and rack-level power meter measurements that are already available in data centers. We evaluate </span><span>WattScope</span>’s accuracy on a production workload and show that it yields high accuracy, e.g., often <span><math><mrow><mo><</mo><mo>∼</mo></mrow></math></span><span>10% normalized mean absolute error, and is thus a potentially useful tool for datacenters in externally monitoring application-level power usage.</span></p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"162 ","pages":"Article 102369"},"PeriodicalIF":2.2,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"92025543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-10-29 | DOI: 10.1016/j.peva.2023.102377
Yige Hong, Ziv Scully
How should we schedule jobs to minimize mean queue length? In the preemptive M/G/1 queue, we know the optimal policy is the Gittins policy, which uses any available information about jobs’ remaining service times to dynamically prioritize jobs. For models more complex than the M/G/1, optimal scheduling is generally intractable. This leads us to ask: beyond the M/G/1, does Gittins still perform well?
Recent results show Gittins performs well in the M/G/k, meaning that its additive suboptimality gap is bounded by an expression which is negligible in heavy traffic. But allowing multiple servers is just one way to extend the M/G/1, and most other extensions remain open. Does Gittins still perform well with non-Poisson arrival processes? Or if servers require setup times when transitioning from idle to busy?
In this paper, we give the first analysis of the Gittins policy that can handle any combination of (a) multiple servers, (b) non-Poisson arrivals, and (c) setup times. Our results thus cover the G/G/1 and G/G/k, with and without setup times, bounding Gittins’s suboptimality gap in each case. Each of (a), (b), and (c) adds a term to our bound, but all the terms are negligible in heavy traffic, thus implying Gittins’s heavy-traffic optimality in all the systems we consider. Another consequence of our results is that Gittins is optimal in the M/G/1 with setup times at all loads.
{"title":"Performance of the Gittins policy in the G/G/1 and G/G/k, with and without setup times","authors":"Yige Hong , Ziv Scully","doi":"10.1016/j.peva.2023.102377","DOIUrl":"https://doi.org/10.1016/j.peva.2023.102377","url":null,"abstract":"<div><p>How should we schedule jobs to minimize mean queue length? In the preemptive M/G/1 queue, we know the optimal policy is the Gittins policy, which uses any available information about jobs’ remaining service times to dynamically prioritize jobs. For models more complex than the M/G/1, optimal scheduling is generally intractable. This leads us to ask: beyond the M/G/1, does Gittins still perform well?</p><p>Recent results show Gittins performs well in the M/G/<em>k</em>, meaning that its additive suboptimality gap is bounded by an expression which is negligible in heavy traffic. But allowing multiple servers is just one way to extend the M/G/1, and most other extensions remain open. Does Gittins still perform well with non-Poisson arrival processes? Or if servers require setup times when transitioning from idle to busy?</p><p>In this paper, we give the first analysis of the Gittins policy that can handle any combination of (a) multiple servers, (b) non-Poisson arrivals, and (c) setup times. Our results thus cover the G/G/1 and G/G/<em>k</em>, with and without setup times, bounding Gittins’s suboptimality gap in each case. Each of (a), (b), and (c) adds a term to our bound, but all the terms are negligible in heavy traffic, thus implying Gittins’s heavy-traffic optimality in all the systems we consider. Another consequence of our results is that Gittins is optimal in the M/G/1 with setup times at all loads.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"163 ","pages":"Article 102377"},"PeriodicalIF":2.2,"publicationDate":"2023-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0166531623000470/pdfft?md5=688fb1b83300cd7ea4fea9d191278825&pid=1-s2.0-S0166531623000470-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"92030400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-10-18 | DOI: 10.1016/j.peva.2023.102380
Yaron Yeger, Onno Boxma, Jacques Resing, Maria Vlasiou
The Asymmetric Inclusion Process (ASIP) tandem queue is a model of stations in series with a gate after each station. At a gate opening, all customers in that station instantaneously move to the next station unidirectionally. In our study, we enhance the ASIP model by introducing the capability for individual customers to independently move from one station to the next, and by allowing both individual customers and batches of customers from any station to exit the system. The model is inspired by the process by which macromolecules are transported within cells.
We present a comprehensive analysis of various aspects of the queue length in the ASIP tandem model. Specifically, we provide an exact analysis of queue length moments and correlations and, under certain circumstances, of the queue length distribution. Furthermore, we propose an approximation for the joint queue length distribution. This approximation is derived using three different approaches, one of which employs the concept of the replica mean-field limit. Among other results, our analysis offers insight into the extent to which nutrients can support the survival of a cell.
{"title":"ASIP tandem queues with consumption","authors":"Yaron Yeger , Onno Boxma , Jacques Resing , Maria Vlasiou","doi":"10.1016/j.peva.2023.102380","DOIUrl":"https://doi.org/10.1016/j.peva.2023.102380","url":null,"abstract":"<div><p>The Asymmetric Inclusion Process (ASIP) tandem queue is a model of stations in series with a gate after each station. At a gate opening, all customers in that station instantaneously move to the next station unidirectionally. In our study, we enhance the ASIP model by introducing the capability for individual customers to independently move from one station to the next, and by allowing both individual customers and batches of customers from any station to exit the system. The model is inspired by the process by which macromolecules are transported within cells.</p><p>We present a comprehensive analysis of various aspects of the queue length in the ASIP tandem model. Specifically, we provide an exact analysis of queue length moments and correlations and, under certain circumstances, of the queue length distribution. Furthermore, we propose an approximation for the joint queue length distribution. This approximation is derived using three different approaches, one of which employs the concept of the replica mean-field limit. Among other results, our analysis offers insight into the extent to which nutrients can support the survival of a cell.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"163 ","pages":"Article 102380"},"PeriodicalIF":2.2,"publicationDate":"2023-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0166531623000500/pdfft?md5=979d6daae1fd3cf701761a51f472a8ff&pid=1-s2.0-S0166531623000500-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"92030401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-10-13 | DOI: 10.1016/j.peva.2023.102371
Spandan Senapati, Rahul Vaze
We consider the online convex optimization (OCO) problem with quadratic and linear switching cost in the limited information setting, where an online algorithm can choose its action using only gradient information about the previous objective function. For L-smooth and μ-strongly convex objective functions, we propose an online multiple gradient descent (OMGD) algorithm and show that its competitive ratio for the OCO problem with quadratic switching cost is at most 4(L+5) + 16(L+5)/μ. The competitive ratio upper bound for OMGD is also shown to be order-wise tight in terms of L and μ. In addition, we show that the competitive ratio of any online algorithm is max{Ω(L), Ω(L/√μ)} in the limited information setting when the switching cost is quadratic. We also show that the OMGD algorithm achieves the optimal (order-wise) dynamic regret in the limited information setting. For the linear switching cost, the competitive ratio upper bound of the OMGD algorithm is shown to depend on both the path length and the squared path length of the problem instance, in addition to L and μ, and is shown to be, order-wise, the best competitive ratio any online algorithm can achieve. Consequently, we conclude that the optimal competitive ratios for the quadratic and linear switching costs are fundamentally different in the limited information setting.
{"title":"Online convex optimization with switching cost and delayed gradients","authors":"Spandan Senapati , Rahul Vaze","doi":"10.1016/j.peva.2023.102371","DOIUrl":"https://doi.org/10.1016/j.peva.2023.102371","url":null,"abstract":"<div><p>We consider the <span><em>online </em><em>convex optimization</em><em> (OCO)</em></span> problem with <em>quadratic</em> and <em>linear</em> switching cost in the <em>limited information</em><span> setting, where an online algorithm can choose its action using only gradient information about the previous objective function. For </span><span><math><mi>L</mi></math></span>-smooth and <span><math><mi>μ</mi></math></span><span><span>-strongly convex objective functions, we propose an online multiple gradient descent (OMGD) algorithm and show that its </span>competitive ratio for the OCO problem with quadratic switching cost is at most </span><span><math><mrow><mn>4</mn><mrow><mo>(</mo><mi>L</mi><mo>+</mo><mn>5</mn><mo>)</mo></mrow><mo>+</mo><mfrac><mrow><mn>16</mn><mrow><mo>(</mo><mi>L</mi><mo>+</mo><mn>5</mn><mo>)</mo></mrow></mrow><mrow><mi>μ</mi></mrow></mfrac></mrow></math></span>. The competitive ratio upper bound for OMGD is also shown to be order-wise tight in terms of <span><math><mrow><mi>L</mi><mo>,</mo><mi>μ</mi></mrow></math></span>. In addition, we show that the competitive ratio of any online algorithm is <span><math><mrow><mo>max</mo><mrow><mo>{</mo><mi>Ω</mi><mrow><mo>(</mo><mi>L</mi><mo>)</mo></mrow><mo>,</mo><mi>Ω</mi><mrow><mo>(</mo><mfrac><mrow><mi>L</mi></mrow><mrow><msqrt><mrow><mi>μ</mi></mrow></msqrt></mrow></mfrac><mo>)</mo></mrow><mo>}</mo></mrow></mrow></math></span> in the limited information setting when the switching cost is quadratic. We also show that the OMGD algorithm achieves the optimal (order-wise) dynamic regret in the limited information setting. For the linear switching cost, the competitive ratio upper bound of the OMGD algorithm is shown to depend on both the path length and the squared path length of the problem instance, in addition to <span><math><mrow><mi>L</mi><mo>,</mo><mi>μ</mi></mrow></math></span>, and is shown to be order-wise, the best competitive ratio any online algorithm can achieve. Consequently, we conclude that the optimal competitive ratio for the quadratic and linear switching costs are fundamentally different in the limited information setting.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"162 ","pages":"Article 102371"},"PeriodicalIF":2.2,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49874155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-10-12 | DOI: 10.1016/j.peva.2023.102375
Simon Scherrer, Seyedali Tabaeiaghdaei, Adrian Perrig
Internet service providers (ISPs) have a variety of quality attributes that determine their attractiveness for data transmission, ranging from quality-of-service metrics such as jitter to security properties such as the presence of DDoS defense systems. ISPs should optimize these attributes in line with their profit objective, i.e., maximize revenue from attracted traffic while minimizing attribute-related cost, all in the context of alternative offers by competing ISPs. However, this attribute optimization is difficult not least because many aspects of ISP competition are barely understood on a systematic level, e.g., the multi-dimensional and cost-driving nature of path quality, and the distributed decision making of ISPs on the same path.
In this paper, we improve this understanding by analyzing how ISP competition affects path quality and ISP profits. To that end, we develop a game-theoretic model in which ISPs (i) affect path quality via multiple attributes that entail costs, (ii) are on paths together with other selfish ISPs, and (iii) are in competition with alternative paths when attracting traffic. The model enables an extensive theoretical analysis, surprisingly showing that competition can have both positive and negative effects on path quality and ISP profits, depending on the network topology and the cost structure of ISPs. However, a large-scale simulation, which draws on real-world data to instantiate the model, shows that the positive effects will likely prevail in practice: If the number of selectable paths towards any destination increases from 1 to 5, the prevalence of quality attributes increases by at least 50%, while 75% of ISPs improve their profit.
{"title":"Quality competition among internet service providers","authors":"Simon Scherrer, Seyedali Tabaeiaghdaei, Adrian Perrig","doi":"10.1016/j.peva.2023.102375","DOIUrl":"https://doi.org/10.1016/j.peva.2023.102375","url":null,"abstract":"<div><p>Internet service providers (ISPs) have a variety of quality attributes that determine their attractiveness for data transmission, ranging from quality-of-service metrics such as jitter to security properties such as the presence of DDoS defense systems. ISPs should optimize these attributes in line with their profit objective, i.e., maximize revenue from attracted traffic while minimizing attribute-related cost, all in the context of alternative offers by competing ISPs. However, this attribute optimization is difficult not least because many aspects of ISP competition are barely understood on a systematic level, e.g., the multi-dimensional and cost-driving nature of path quality, and the distributed decision making of ISPs on the same path.</p><p>In this paper, we improve this understanding by analyzing how ISP competition affects path quality and ISP profits. To that end, we develop a game-theoretic model in which ISPs (i) affect path quality via multiple attributes that entail costs, (ii) are on paths together with other selfish ISPs, and (iii) are in competition with alternative paths when attracting traffic. The model enables an extensive theoretical analysis, surprisingly showing that competition can have both positive and negative effects on path quality and ISP profits, depending on the network topology and the cost structure of ISPs. However, a large-scale simulation, which draws on real-world data to instantiate the model, shows that the positive effects will likely prevail in practice: If the number of selectable paths towards any destination increases from 1 to 5, the prevalence of quality attributes increases by at least 50%, while 75% of ISPs improve their profit.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"162 ","pages":"Article 102375"},"PeriodicalIF":2.2,"publicationDate":"2023-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49874153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dispatching policies such as join the shortest queue (JSQ), join the queue with smallest workload (JSW), and their power-of-two variants are used in load balancing systems where the instantaneous queue length or workload information at all queues, or a subset of them, can be queried. In situations where the dispatcher has an associated memory, one can minimize this query overhead by maintaining a list of idle servers to which jobs can be dispatched. Recent alternative approaches that do not require querying such information include the cancel-on-start and cancel-on-complete replication policies. The downside of such policies, however, is that the servers must communicate either the start or the completion time instant of each service to the dispatcher and must allow the coordinated and instantaneous cancellation of all redundant replicas. In practice, the requirements of query messaging, memory, and replica cancellation pose challenges in their implementation, and their advantages are not clear. In this work, we consider load-balancing policies that do not need to query load information, do not need memory, and do not need to cancel replicas. Our policies allow the dispatcher to append a timer to each job or its replica. A job or a replica is discarded if its timer expires before it starts receiving service. We analyze several variants of this policy, which are novel and simple to implement. We numerically observe that the variants of the proposed policy outperform popular feedback-based policies for low arrival rates, despite no feedback from servers to the dispatcher.
{"title":"Load balancing policies without feedback using timed replicas","authors":"Rooji Jinan , Ajay Badita , Tejas Bodas , Parimal Parag","doi":"10.1016/j.peva.2023.102381","DOIUrl":"https://doi.org/10.1016/j.peva.2023.102381","url":null,"abstract":"<div><p>Dispatching policies such as join the shortest queue (JSQ), join the queue with smallest workload (JSW), and their power of two variants are used in load balancing systems where the instantaneous queue length or workload information at all queues or a subset of them can be queried. In situations where the dispatcher has an associated memory, one can minimize this query overhead by maintaining a list of idle servers to which jobs can be dispatched. Recent alternative approaches that do not require querying such information include the cancel-on-start and cancel-on-complete replication policies. The downside of such policies however is that the servers must communicate either the start or the completion time instant of each service to the dispatcher and must allow the coordinated and instantaneous cancellation of all redundant replicas. In practice, the requirements of query messaging, memory, and replica cancellation pose challenges in their implementation and their advantages are not clear. In this work, we consider load-balancing policies that do not need to query load information, do not need memory, and do not need to cancel replicas. Our policies allow the dispatcher to append a timer to each job or its replica. A job or a replica is discarded if its timer expires before it starts receiving service. We analyze several variants of this policy which are novel and simple to implement. We numerically observe that the variants of the proposed policy outperform popular feedback-based policies for low arrival rates, despite no feedback from servers to the dispatcher.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"162 ","pages":"Article 102381"},"PeriodicalIF":2.2,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49874551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-10-11 | DOI: 10.1016/j.peva.2023.102378
Isaac Grosof, Yige Hong, Mor Harchol-Balter, Alan Scheller-Wolf
Multiserver-job (MSJ) systems, where jobs need to run concurrently across many servers, are increasingly common in practice. The default service ordering in many settings is First-Come First-Served (FCFS) service. Virtually all theoretical work on MSJ FCFS models focuses on characterizing the stability region, with almost nothing known about mean response time.
We derive the first explicit characterization of mean response time in the MSJ FCFS system. Our formula characterizes mean response time up to an additive constant, which becomes negligible as arrival rate approaches throughput, and allows for general phase-type job durations.
We derive our result by utilizing two key techniques: REduction to Saturated for Expected Time (RESET) and MArkovian Relative Completions (MARC).
Using our novel RESET technique, we reduce the problem of characterizing mean response time in the MSJ FCFS system to an M/M/1 with Markovian service rate (MMSR). The Markov chain controlling the service rate is based on the saturated system, a simpler closed system which is far more analytically tractable.
Unfortunately, the MMSR has no explicit characterization of mean response time. We therefore use our novel MARC technique to give the first explicit characterization of mean response time in the MMSR, again up to constant additive error. We specifically introduce the concept of “relative completions,” which is the cornerstone of our MARC technique.
{"title":"The RESET and MARC techniques, with application to multiserver-job analysis","authors":"Isaac Grosof, Yige Hong, Mor Harchol-Balter, Alan Scheller-Wolf","doi":"10.1016/j.peva.2023.102378","DOIUrl":"https://doi.org/10.1016/j.peva.2023.102378","url":null,"abstract":"<div><p>Multiserver-job (MSJ) systems, where jobs need to run concurrently across many servers, are increasingly common in practice. The default service ordering in many settings is First-Come First-Served (FCFS) service. Virtually all theoretical work on MSJ FCFS models focuses on characterizing the stability region, with almost nothing known about mean response time.</p><p><span>We derive the first explicit characterization of mean response time in the MSJ FCFS system. Our formula characterizes mean response time up to an additive constant, which becomes negligible as </span>arrival rate approaches throughput, and allows for general phase-type job durations.</p><p>We derive our result by utilizing two key techniques: REduction to Saturated for Expected Time (RESET) and MArkovian Relative Completions (MARC).</p><p>Using our novel RESET technique, we reduce the problem of characterizing mean response time in the MSJ FCFS system to an M/M/1 with Markovian service rate (MMSR). The Markov chain controlling the service rate is based on the saturated system, a simpler closed system which is far more analytically tractable.</p><p>Unfortunately, the MMSR has no explicit characterization of mean response time. We therefore use our novel MARC technique to give the first explicit characterization of mean response time in the MMSR, again up to constant additive error. We specifically introduce the concept of “relative completions,” which is the cornerstone of our MARC technique.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"162 ","pages":"Article 102378"},"PeriodicalIF":2.2,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49874588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-10-10 | DOI: 10.1016/j.peva.2023.102376
Hengquan Guo, Hongchen Cao, Jingzhu He, Xin Liu, Yuanming Shi
Resource management in microservices is challenging due to the uncertain latency–resource relationship, dynamic environment, and strict Service-Level Agreement (SLA) guarantees. This paper presents a Pessimistic and Optimistic Bayesian Optimization framework, named POBO, for safe and optimal resource configuration for microservice applications. POBO leverages Bayesian learning to estimate the uncertain latency–resource functions and combines primal–dual and penalty-based optimization to maximize resource efficiency while guaranteeing strict SLAs. We prove that POBO can achieve sublinear regret and SLA violation against the optimal resource configuration in hindsight. We have implemented a prototype of POBO and conducted extensive experiments on a real-world microservice application. Our results show that POBO can find the safe and optimal configuration efficiently, outperforming Kubernetes’ built-in auto-scaling module and the state-of-the-art algorithms.
{"title":"POBO: Safe and optimal resource management for cloud microservices","authors":"Hengquan Guo , Hongchen Cao , Jingzhu He, Xin Liu, Yuanming Shi","doi":"10.1016/j.peva.2023.102376","DOIUrl":"https://doi.org/10.1016/j.peva.2023.102376","url":null,"abstract":"<div><p>Resource management in microservices<span> is challenging due to the uncertain latency–resource relationship, dynamic environment, and strict Service-Level Agreement (SLA) guarantees. This paper presents a Pessimistic and Optimistic Bayesian Optimization<span><span> framework, named POBO, for safe and optimal resource configuration for microservice applications. POBO leverages </span>Bayesian learning to estimate the uncertain latency–resource functions and combines primal–dual and penalty-based optimization to maximize resource efficiency while guaranteeing strict SLAs. We prove that POBO can achieve sublinear regret and SLA violation against the optimal resource configuration in hindsight. We have implemented a prototype of POBO and conducted extensive experiments on a real-world microservice application. Our results show that POBO can find the safe and optimal configuration efficiently, outperforming Kubernetes’ built-in auto-scaling module and the state-of-the-art algorithms.</span></span></p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"162 ","pages":"Article 102376"},"PeriodicalIF":2.2,"publicationDate":"2023-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49874154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}