
Performance Evaluation: Latest Articles

Statistical properties of a class of randomized binary search algorithms
IF 1, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE, Pub Date: 2025-03-05, DOI: 10.1016/j.peva.2025.102478
Ye Xia
In this paper, we analyze the statistical properties of a randomized binary search algorithm and its variants. These algorithms have applications in caching and load balancing in distributed environments such as peer-to-peer networks, cloud storage, data centers, and content distribution networks. The basic discrete version of the problem is as follows. Suppose there are m servers, numbered 1, 2, …, m, out of which the first k servers are marked as special, where k is unknown. These k servers may contain a particular file or service that clients want. The objective is to select one of the marked servers uniformly at random. Considering the intended applications, we impose the constraint that there is no central controller to facilitate the selection process. We start with a basic algorithm: In each step, the client requesting the service chooses a number y uniformly at random from 1, 2, …, x, where x is the number chosen in the previous step, initially set to m in the first step. A query is then sent to server y asking whether y is marked. If the answer is yes, the algorithm returns y; otherwise, the process is repeated with x ← y. In this paper, we primarily consider two batch versions of this algorithm in which multiple numbers are chosen in each step and multiple queries are made in parallel. We derive the mean and variance (exact and/or asymptotic) for the number of search steps in each version of the algorithm, and when possible, we give its distribution. Additionally, we analyze the access pattern of queries across the entire search space.
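The basic algorithm described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the function name `randomized_search`, the use of `random.Random`, and modeling the server's reply as the comparison `y <= k` are assumptions made here, and the sketch assumes k ≥ 1 so that the loop terminates. The batch versions studied in the paper are not covered.

```python
import random
from collections import Counter

def randomized_search(m, k, rng):
    """Basic (non-batch) search: return (selected_server, number_of_steps).

    m   -- total number of servers, numbered 1..m
    k   -- number of marked servers (1..k); unknown to the client and used
           here only to simulate each server's yes/no reply
    rng -- a random.Random instance (hypothetical parameter for reproducibility)
    """
    x = m          # current upper end of the sampling range
    steps = 0
    while True:
        y = rng.randint(1, x)   # uniform on 1..x, both ends inclusive
        steps += 1
        if y <= k:              # server y replies "marked"
            return y, steps
        x = y                   # otherwise repeat with x <- y

if __name__ == "__main__":
    rng = random.Random(2025)
    trials = 20000
    results = [randomized_search(1000, 10, rng) for _ in range(trials)]
    hits = Counter(server for server, _ in results)
    mean_steps = sum(steps for _, steps in results) / trials
    print("empirical selection frequencies:", sorted(hits.items()))
    print("mean number of steps:", mean_steps)
```

Whenever a draw lands in 1..k it is uniform on that range regardless of the current x, which is why the returned server is uniform over the marked servers; the printed frequencies give a quick empirical check of this.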
Performance Evaluation, Volume 168, Article 102478.
Citations: 0
Computational algorithms and arrival theorem for non-conventional product-form solutions
IF 1, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE, Pub Date: 2025-02-21, DOI: 10.1016/j.peva.2025.102469
Diletta Olliaro, Gianfranco Balbo, Andrea Marin, Matteo Sereno
Queuing networks with finite capacity are widely discussed in performance analysis literature. One approach to address the finite capacity of stations involves the implementation of a skip-over policy. Under this policy, when a customer arrives at a saturated station, service at that station is skipped, and the customer is rerouted based on the predefined network routing protocol.
Skip-over networks have been extensively investigated, and they exhibit a product-form stationary distribution under the exponential assumptions of Jackson networks. However, a comprehensive understanding of the celebrated Arrival Theorem for this class of product-form models is still lacking and relies on certain conjectures.
This paper makes three contributions: (i) it deepens the understanding of the Arrival Theorem for skip-over networks by proving the conjectures outlined in existing literature, (ii) it introduces a Mean Value Analysis (MVA) algorithm tailored for this type of queuing network, and (iii) it explores the implications of these findings for the class of product-form queuing networks with fetching and repetitive service discipline.
Performance Evaluation, Volume 168, Article 102469.
Citations: 0
Energy-performance tradeoffs in server farms with batch services and setup times
IF 1, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE, Pub Date: 2025-01-30, DOI: 10.1016/j.peva.2025.102468
Thu Le-Anh, Tuan Phung-Duc
Data centers consume a large amount of energy, much of which is wasted due to idle servers. Turning off idle servers might be an effective power-saving solution; however, there is a trade-off between energy savings and system performance. Hence, we propose a setup queueing model with a batching policy that allows servers to process a set of jobs simultaneously to minimize power consumption while maintaining acceptable performance. We consider an M/M/c/SET–BATCH queue, a multi-server batch service queue with a fixed batch size and setup times, and some variants, including systems in which idle servers delay before turning off or systems in which the batch size is dynamic. We analyze the steady-state probabilities and system performance of the M/M/c/SET–BATCH system and its variants. Our analysis of the M/M/c/SET–BATCH system with lower computational complexity is made possible by utilizing the special structure of the model. In addition, we use simulations to compare the M/M/c/SET–BATCH model with some other variants with different setup time distributions. The results suggest that the model performs better when the setup time has a larger coefficient of variation. Our results indicate that the batching policy enhances the system performance, especially when we allow servers to be idle before turning them off.
Performance Evaluation, Volume 168, Article 102468.
Citations: 0
Foreword - Special Issue - MASCOTS 2023
IF 1, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE, Pub Date: 2025-01-03, DOI: 10.1016/j.peva.2025.102467
Maria Carla Calzarossa, Anshul Gandhi
Performance Evaluation, Volume 167, Article 102467.
Citations: 0
Coupled queues with server interruptions: Some solutions
IF 1, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE, Pub Date: 2024-12-18, DOI: 10.1016/j.peva.2024.102466
Herwig Bruneel, Arnaud Devos
We study three different discrete-time queueing systems, which accommodate two types of customers, named type 1 and type 2. New customers arrive independently from slot to slot, but the numbers of arrivals of both types in any slot are possibly mutually dependent; their joint probability generating function (pgf) is A(z_1, z_2). The service times of all customers are deterministically equal to one time slot.
We first consider a scenario (Option A) with one single server which is to be shared by the two customer types. Here, we assume that type-1 customers have absolute service priority over type-2 customers. Moreover, the server is subject to random server interruptions, which occur independently from slot to slot. We derive a functional equation for the steady-state joint pgf U(z_1, z_2) of the numbers of type-1 and type-2 customers in the system. Relying on the application of Rouché’s theorem, we are able to explicitly solve the functional equation for arbitrary arrival pgfs A(z_1, z_2), but more elegant results are obtained for some specific choices of A(z_1, z_2).
Next, we focus on two different scenarios (Option B and Option C) where both customer types have their own dedicated server. Here, there are no service priorities involved. In Option B, the two servers experience simultaneous interruptions, whereas in Option C, only one of the servers is subject to interruptions. Again, we derive functional equations for the pgf U(z_1, z_2). Although solving these equations for arbitrary arrival pgfs A(z_1, z_2) …
Performance Evaluation, Volume 167, Article 102466.
Citations: 0
Formal error bounds for the state space reduction of Markov chains
IF 1, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE, Pub Date: 2024-12-18, DOI: 10.1016/j.peva.2024.102464
Fabian Michel, Markus Siegle
We study the approximation of a Markov chain on a reduced state space, for both discrete- and continuous-time Markov chains. In this context, we extend the existing theory of formal error bounds for the approximated transient distributions. In the discrete-time setting, we bound the stepwise increment of the error, and in the continuous-time setting, we bound the rate at which the error grows. In addition, the same error bounds can also be applied to bound how far an approximated stationary distribution is from stationarity. As a special case, we consider aggregated (or lumped) Markov chains, where the state space reduction is achieved by partitioning the state space into macro states. Subsequently, we compare the error bounds with relevant concepts from the literature, such as exact and ordinary lumpability, as well as deflatability and aggregatability. These concepts provide stricter than necessary conditions for settings in which the aggregation error is zero. We also present possible algorithms for finding suitable aggregations for which the formal error bounds are low, and we analyze first experiments with these algorithms on a range of different models.
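Ordinary lumpability, one of the literature concepts this abstract compares against, has a simple textbook test for a discrete-time chain: a partition is ordinarily lumpable when all states in a block have the same total transition probability into every block. The sketch below checks that condition and builds the aggregated matrix. It illustrates the standard definition only, not the paper's error bounds; the function names `is_ordinarily_lumpable` and `lump`, the tolerance `tol`, and the example matrix are assumptions made here.

```python
def is_ordinarily_lumpable(P, partition, tol=1e-12):
    """Check ordinary lumpability of a DTMC with respect to a partition.

    P         -- row-stochastic transition matrix as a list of lists
    partition -- list of blocks, each block a list of state indices
    tol       -- numerical tolerance (hypothetical default)
    """
    for block in partition:
        for target in partition:
            # Probability of jumping from each state of `block` into `target`.
            sums = [sum(P[i][j] for j in target) for i in block]
            if max(sums) - min(sums) > tol:
                return False
    return True

def lump(P, partition):
    """Aggregated transition matrix over macro states.

    Uses the first state of each block as representative, which is only
    meaningful when the partition is ordinarily lumpable.
    """
    return [[sum(P[block[0]][j] for j in target) for target in partition]
            for block in partition]

if __name__ == "__main__":
    # Illustrative 3-state chain (made-up numbers).
    P = [
        [0.5, 0.25, 0.25],
        [0.2, 0.4, 0.4],
        [0.2, 0.3, 0.5],
    ]
    good = [[0], [1, 2]]   # states 1 and 2 behave alike at block level
    bad = [[0, 1], [2]]    # states 0 and 1 differ in their jump rate to {2}
    print(is_ordinarily_lumpable(P, good), lump(P, good))
    print(is_ordinarily_lumpable(P, bad))
```

When the check fails, aggregating anyway introduces an error in the transient distribution, which is the kind of quantity formal error bounds are meant to control.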
Performance Evaluation, Volume 167, Article 102464.
Citations: 0
Editorial: Special issue on Performance Analysis and Evaluation of Systems for Artificial Intelligence
IF 1, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE, Pub Date: 2024-12-13, DOI: 10.1016/j.peva.2024.102465
Anshul Gandhi, Bo Jiang, Shaolei Ren
Performance Evaluation, Volume 167, Article 102465.
Citations: 0
Job assignment in machine learning inference systems with accuracy constraints
IF 1, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE, Pub Date: 2024-12-12, DOI: 10.1016/j.peva.2024.102463
Tuhinangshu Choudhury, Gauri Joshi, Weina Wang
Modern machine learning inference systems often host multiple models that can perform the same task with different levels of accuracy and latency. For example, a large model can be more accurate but slow, whereas a smaller, less accurate model can serve inference queries faster. Amidst the rapid advancements in Large Language Models (LLMs), it is paramount for such systems to strike the best trade-off between latency and accuracy. In this paper, we consider the problem of designing job assignment policies for a multi-server queueing system where servers have heterogeneous rates and accuracies, and our goal is to minimize the expected inference latency while meeting an average accuracy target. To the best of our knowledge, such queueing systems with constraints have been sparsely studied in prior literature. We first identify a lower bound on the minimum achievable latency under any policy that achieves the target accuracy a∗, using a linear programming (LP) formulation. Building on the LP solution, we introduce a Randomized-Join-the Idle Queue (R-JIQ) policy, which consistently meets the accuracy target and asymptotically (as system size increases) achieves the optimal latency T_LP-LB(λ). However, the R-JIQ policy relies on knowledge of the arrival rate λ to solve the LP. To address this limitation, we propose the Prioritize Ordered Pairs (POP) policy, which incorporates the concept of ordered pairs of servers into waterfilling to iteratively solve the LP. This allows the POP policy to function without relying on the arrival rate. Experiments suggest that POP performs robustly across different system sizes and load scenarios, achieving near-optimal performance.
Performance Evaluation, Volume 167, Article 102463.
Citations: 0
Dimensioning leaky buckets in stochastic environments
IF 1 Zone 4, Computer Science Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-12-04 DOI: 10.1016/j.peva.2024.102461
Peter Buchholz, András Mészáros, Miklós Telek
Leaky buckets are commonly used for access control in networks, where access control means traffic regulation at the network ingress. In network calculus, which is often applied to performance analysis or dimensioning of networks, leaky buckets are the model behind the piecewise linear arrival curves that specify an input bound to a network. In this paper, we analyze leaky-bucket-based access control under stochastic arrivals using fluid queues, where the access control may be implemented by more than one leaky bucket. The analysis yields methods for dimensioning access control parameters for different stochastic arrival processes, including correlated arrivals. The approach is one step toward bridging the gap between classical stochastic analysis using queuing networks and deterministic analysis using network calculus. Results are presented for stochastic arrival processes using numerical methods and for measured arrivals using trace-driven simulation.
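To make the link between leaky buckets and piecewise linear arrival curves concrete, here is a minimal Python sketch of a regulator built from several buckets. It is not the paper's fluid-queue analysis: it only illustrates the standard deterministic bound min_i(sigma_i + rho_i * t) and a simple conformance check in token-bucket form. The names `LeakyBucket`, `arrival_curve`, `conforming`, and all parameter values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucket:
    sigma: float  # bucket depth (burst tolerance), in data units
    rho: float    # sustained rate, in data units per time unit


def arrival_curve(buckets, t):
    """Piecewise linear bound on traffic admitted in any window of length t."""
    if t <= 0:
        return 0.0
    # Each bucket contributes one affine piece; the regulator enforces all of them.
    return min(b.sigma + b.rho * t for b in buckets)


def conforming(buckets, arrivals):
    """Check (time, size) arrivals, sorted by time, against all buckets.

    Returns one boolean per arrival. Non-conforming arrivals are dropped
    and consume no tokens (an assumption; a real regulator might delay them).
    """
    tokens = [b.sigma for b in buckets]  # every bucket starts full
    last = 0.0
    result = []
    for time, size in arrivals:
        dt = time - last
        last = time
        # Refill each bucket at its own rate, capped at its depth.
        tokens = [min(b.sigma, tok + b.rho * dt) for b, tok in zip(buckets, tokens)]
        ok = all(tok >= size for tok in tokens)
        if ok:
            tokens = [tok - size for tok in tokens]
        result.append(ok)
    return result


# Hypothetical dual-bucket setup: one bucket limits long-run rate, one limits peaks.
buckets = [LeakyBucket(sigma=10.0, rho=1.0), LeakyBucket(sigma=2.0, rho=5.0)]
print(arrival_curve(buckets, 1.0))
print(conforming(buckets, [(0.0, 2.0), (0.1, 2.0), (1.0, 2.0)]))
```

With two buckets the bound has two linear pieces: the peak-rate bucket is the tighter constraint over short windows and the sustained-rate bucket over long ones. Dimensioning, in the sense of the abstract, means choosing the `sigma` and `rho` values for a given arrival process.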
Performance Evaluation, vol. 167, Article 102461 (published 2024-12-04).
Citations: 0
Preface: Special issue on ITC 2023
IF 1 Zone 4, Computer Science Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-12-04 DOI: 10.1016/j.peva.2024.102462
Sara Alouf, Oliver Hohlfeld, Zhiyuan Jiang
Performance Evaluation, vol. 167, Article 102462 (published 2024-12-04).
Citations: 0