A comprehensive exploration of approximate DNN models with a novel floating-point simulation framework
Pub Date: 2024-05-25, DOI: 10.1016/j.peva.2024.102423
Myeongjin Kwak, Jeonggeun Kim, Yongtae Kim
This paper introduces TorchAxf, a framework for fast simulation of diverse approximate deep neural network (DNN) models, including spiking neural networks (SNNs). The proposed framework utilizes various approximate adders and multipliers, supports industry-standard reduced-precision floating-point formats, such as bfloat16, and accommodates user-customized precision representations. Leveraging GPU acceleration on the PyTorch framework, TorchAxf accelerates approximate DNN training and inference. In addition, it allows seamless integration of arbitrary approximate arithmetic algorithms with C/C++ behavioral models to emulate approximate DNN hardware accelerators.
We utilize the proposed TorchAxf framework to assess twelve popular DNN models under approximate multiply-and-accumulate (MAC) operations. Through comprehensive experiments, we determine the suitable degree of floating-point arithmetic approximation for these DNN models without significant accuracy loss and offer the optimal reduced-precision formats for each DNN model. Additionally, we demonstrate that approximate-aware re-training can rectify errors and enhance pre-trained DNN models under reduced-precision formats. Furthermore, TorchAxf, operating on GPU, remarkably reduces simulation time for complex DNN models using approximate arithmetic by up to 131.38× compared to the baseline optimized CPU implementation. Finally, we compare the proposed framework with state-of-the-art frameworks to highlight its superiority.
{"title":"A comprehensive exploration of approximate DNN models with a novel floating-point simulation framework","authors":"Myeongjin Kwak, Jeonggeun Kim, Yongtae Kim","doi":"10.1016/j.peva.2024.102423","DOIUrl":"https://doi.org/10.1016/j.peva.2024.102423","url":null,"abstract":"<div><p>This paper introduces <em>TorchAxf</em><span><sup>1</sup></span>, a framework for fast simulation of diverse approximate deep neural network (DNN) models, including spiking neural networks (SNNs). The proposed framework utilizes various approximate adders and multipliers, supports industrial standard reduced precision floating-point formats, such as <span>bfloat16</span>, and accommodates user-customized precision representations. Leveraging GPU acceleration on the PyTorch framework, <em>TorchAxf</em> accelerates approximate DNN training and inference. In addition, it allows seamless integration of arbitrary approximate arithmetic algorithms with C/C++ behavioral models to emulate approximate DNN hardware accelerators.</p><p>We utilize the proposed <em>TorchAxf</em> framework to assess twelve popular DNN models under approximate multiply-and-accumulate (MAC) operations. Through comprehensive experiments, we determine the suitable degree of floating-point arithmetic approximation for these DNN models without significant accuracy loss and offer the optimal reduced precision formats for each DNN model. Additionally, we demonstrate that approximate-aware re-training can rectify errors and enhance pre-trained DNN models under reduced precision formats. Furthermore, <em>TorchAxf</em>, operating on GPU, remarkably reduces simulation time for complex DNN models using approximate arithmetic by up to 131.38<span><math><mo>×</mo></math></span> compared to the baseline optimized CPU implementation. Finally, we compare the proposed framework with state-of-the-art frameworks to highlight its superiority.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"165 ","pages":"Article 102423"},"PeriodicalIF":2.2,"publicationDate":"2024-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141239841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance analysis of a collision channel with abandonments
Pub Date: 2024-05-25, DOI: 10.1016/j.peva.2024.102424
Dieter Fiems, Tuan Phung-Duc
We consider a Markovian retrial queueing system with customer collisions and abandonment in the context of carrier-sense multiple access systems. Using z-transform techniques, we find a set of first-order differential equations for the probability generating functions of the orbit size when the server is empty, busy, or in the collision phase. We then rely on series expansion techniques to extract approximations for relevant performance measures from this set of differential equations. More precisely, we construct a numerical algorithm to calculate the terms in the series expansions of various factorial moments of the orbit size. To improve the accuracy of our series expansion approach, we apply Wynn’s epsilon algorithm, which not only speeds up convergence but also extends the region of convergence. We illustrate the accuracy of our approach by means of some numerical examples and find that the method is both fast and accurate for a wide range of parameter values.
{"title":"Performance analysis of a collision channel with abandonments","authors":"Dieter Fiems , Tuan Phung-Duc","doi":"10.1016/j.peva.2024.102424","DOIUrl":"https://doi.org/10.1016/j.peva.2024.102424","url":null,"abstract":"<div><p>We consider a Markovian retrial queueing system with customer collisions and abandonment in the context of carrier-sense multiple access systems. Using <span><math><mi>z</mi></math></span>-transform techniques, we find a set of first-order differential equations for the probability generating functions of the orbit size when the server is empty, busy, or in the collision phase. We then rely on series expansion techniques to extract approximations for relevant performance measures from this set of differential equations. More precisely, we construct a numerical algorithm to calculate the terms in the series expansions of various factorial moments of the orbit size. To improve the accuracy of our series expansion approach, we apply Wynn’s epsilon algorithm which not only speeds up convergence, but also extends the region of convergence. We illustrate the accuracy of our approach by means of some numerical examples, and find that the method is both fast and accurate for a wide range of the parameter values.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"165 ","pages":"Article 102424"},"PeriodicalIF":2.2,"publicationDate":"2024-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141286199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic load balancing in energy packet networks
Pub Date: 2024-04-23, DOI: 10.1016/j.peva.2024.102414
A. Bušić, J. Doncel, J.M. Fourneau
Energy Packet Networks (EPNs) model the interaction between renewable sources that generate energy according to a random process and communication devices that consume energy. The network is formed by cells and, in each cell, there is a queue that handles energy packets and another queue that handles data packets. We assume Poisson arrivals of energy packets and of data packets to all the cells and exponential service times. We consider an EPN model with dynamic load balancing, where a cell without data packets can poll other cells to migrate jobs. This migration can only take place when there is enough energy in both interacting cells, in which case a batch of data packets is transferred and the required energy is consumed (i.e., it disappears). Data packets also consume energy to be routed to the next station. Our main result shows that the steady-state distribution of jobs in the queues admits a product-form solution provided that a stable solution of a fixed-point equation exists. We prove sufficient conditions for irreducibility; under these conditions, and when the fixed-point equation has a solution, the Markov chain is ergodic. We also provide sufficient conditions for the existence of a solution of the fixed-point equation. We then focus on layered networks and study the polling rates that must be set to achieve fair load balancing, i.e., such that, within the same layer, all queues handling data packets have the same load. Our numerical experiments illustrate that dynamic load balancing exhibits several desirable properties, such as improved performance and fair load balancing.
{"title":"Dynamic load balancing in energy packet networks","authors":"A. Bušić , J. Doncel , J.M. Fourneau","doi":"10.1016/j.peva.2024.102414","DOIUrl":"10.1016/j.peva.2024.102414","url":null,"abstract":"<div><p>Energy Packet Networks (EPNs) model the interaction between renewable sources generating energy following a random process and communication devices that consume energy. This network is formed by cells and, in each cell, there is a queue that handles energy packets and another queue that handles data packets. We assume Poisson arrivals of energy packets and of data packets to all the cells and exponential service times. We consider an EPN model with a dynamic load balancing where a cell without data packets can poll other cells to migrate jobs. This migration can only take place when there is enough energy in both interacting cells, in which case a batch of data packets is transferred and the required energy is consumed (i.e. it disappears). We consider that data packet also consume energy to be routed to the next station. Our main result shows that the steady-state distribution of jobs in the queues admits a product form solution provided that a stable solution of a fixed point equation exists. We prove sufficient conditions for irreducibility. Under these conditions and when the fixed point equation has a solution, the Markov chain is ergodic. We also provide sufficient conditions for the existence of a solution of the fixed point equation. We then focus on layered networks and we study the polling rates that must be set to achieve a fair load balancing, i.e., such that, in the same layer, the load of the queues handling data packets is the same. Our numerical experiments illustrate that dynamic load balancing satisfies several interesting properties such as performance improvement or fair load balancing.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"165 ","pages":"Article 102414"},"PeriodicalIF":2.2,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0166531624000191/pdfft?md5=fb96a067de593ae411502a62f32b10ab&pid=1-s2.0-S0166531624000191-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140777407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Network slicing: Is it worth regulating in a network neutrality context?
Pub Date: 2024-04-22, DOI: 10.1016/j.peva.2024.102422
Yassine Hadjadj-Aoul, Maël Le Treust, Patrick Maillé, Bruno Tuffin
Network slicing is a key component of 5G-and-beyond networks, but it raises many questions about the associated business model and the need for regulation, given its uneasy co-existence with the network neutrality debate. We propose in this paper a slicing model for heterogeneous users/applications where a service provider may purchase a slice in a wireless network and offer a “premium” service whose improved quality stems from higher prices, which lead to less demand and less congestion than the basic service offered by the network owner, a scheme known as Paris Metro Pricing. Using game theory, we obtain the economically optimal slice size and the prices charged by all actors. We also compare with the case of a unique “pipe” (no premium service), corresponding to a fully neutral scenario, and with the case of vertical integration, to evaluate the impact of slicing on all actors and identify the “best” economic scenario and the eventual need for regulation.
{"title":"Network slicing: Is it worth regulating in a network neutrality context?","authors":"Yassine Hadjadj-Aoul , Maël Le Treust , Patrick Maillé , Bruno Tuffin","doi":"10.1016/j.peva.2024.102422","DOIUrl":"https://doi.org/10.1016/j.peva.2024.102422","url":null,"abstract":"<div><p>Network slicing is a key component of 5G-and-beyond networks but induces many questions related to an associated business model and its need to be regulated due to its difficult co-existence with the network neutrality debate. We propose in this paper a slicing model in the case of heterogeneous users/applications where a service provider may purchase a slice in a wireless network and offer a “premium” service where the improved quality stems from higher prices leading to less demand and less congestion than the basic service offered by the network owner, a scheme known as Paris Metro Pricing. We obtain thanks to game theory the economically-optimal slice size and prices charged by all actors. We also compare with the case of a unique “pipe” (no premium service) corresponding to a fully-neutral scenario and with the case of vertical integration to evaluate the impact of slicing on all actors and identify the “best” economic scenario and the eventual need for regulation.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"165 ","pages":"Article 102422"},"PeriodicalIF":2.2,"publicationDate":"2024-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0166531624000270/pdfft?md5=5fc9a43434d102c218d3c6bee4bb9f76&pid=1-s2.0-S0166531624000270-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140645416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analyzing the age of information in prioritized status update systems under an interruption-based hybrid discipline
Pub Date: 2024-04-15, DOI: 10.1016/j.peva.2024.102415
Tamer E. Fahim, Sherif I. Rabia, Ahmed H. Abd El-Malek, Waheed K. Zahra
Motivated by real-life applications, research interest has recently been directed towards prioritized status update systems, which prioritize update streams according to their timeliness constraints. The preferential service treatment between priority classes is commonly based on the classical disciplines of preemption and non-preemption. However, both disciplines fail to satisfy all classes evenly. In this work, an interruption-based hybrid preemptive/non-preemptive discipline is proposed for a single-buffer system modeled as an M/M/1/2 priority queueing system. Each class being served (resp. buffered) can be preempted unless its recorded number of service preemptions reaches a predetermined in-service (resp. in-waiting) threshold. The thresholds between classes are the controlling parameters of the whole system’s performance. Using the stochastic hybrid system approach, the age of information (AoI) performance metric is analyzed in terms of its statistical average and higher-order moments, considering a general number of priority classes. Closed-form results are also obtained for some special cases, giving analytical insight into AoI stability under heavy loading conditions. The average AoI and its dispersion are numerically investigated for a three-class network. The significance of the proposed model lies in achieving a compromise satisfaction between all priority classes through a thorough adjustment of its threshold parameters; two approaches are proposed to clarify this adjustment. It turns out that the proposed hybrid discipline compensates for the limited buffer resource, achieving more promising performance with low design complexity and low cost. Moreover, the proposed scheme can operate under a wider span of the total offered load, through which overall network satisfaction can be optimized under legitimate constraints on the age-sensitive classes.
{"title":"Analyzing the age of information in prioritized status update systems under an interruption-based hybrid discipline","authors":"Tamer E. Fahim , Sherif I. Rabia , Ahmed H. Abd El-Malek , Waheed K. Zahra","doi":"10.1016/j.peva.2024.102415","DOIUrl":"https://doi.org/10.1016/j.peva.2024.102415","url":null,"abstract":"<div><p>Motivated by real-life applications, a special research-work interest has been recently directed towards the prioritized status update systems, which prioritize the update streams according to their timeliness constraints. The preferential service treatment between priority classes is commonly based on classical disciplines, preemption and non-preemption. However, both disciplines fail to give an even satisfaction between all classes. In our work, an interruption-based hybrid preemptive/non-preemptive discipline is proposed under a single-buffer system modeled as an M/M/1/2 priority queueing system. Each class being served (resp. buffered) can be preempted unless its recorded number of service preemptions reaches the predetermined in-service (resp. in-waiting) threshold. All thresholds between classes are the controlling parameters of the whole system’s performance. Using the stochastic hybrid system approach, the age of information (AoI) performance metric is analyzed in terms of its statistical average along with the higher-order moments, considering a general number of priority classes. Closed-form results are also obtained for some special cases, giving analytical insights about the AoI stability in heavy loading conditions. The average AoI and its dispersion are numerically investigated for the case of a three-class network. The significance of the proposed model is manifested in achieving a compromise satisfaction between all priority classes by a thorough adjustment of its threshold parameters. Two approaches are proposed to clarify the adjustment of these parameters. It turned out that the proposed hybrid discipline compensates for the limited buffer resource, achieving more promising performance with low design complexity and low cost. Moreover, the proposed scheme can operate under a wider span of the total offered load, through which the whole network satisfaction can be optimized under some legitimate constraints on the age-sensitive classes.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"165 ","pages":"Article 102415"},"PeriodicalIF":2.2,"publicationDate":"2024-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140640916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhanced performance prediction of ATL model transformations
Pub Date: 2024-04-05, DOI: 10.1016/j.peva.2024.102413
Raffaela Groner, Peter Bellmann, Stefan Höppner, Patrick Thiam, Friedhelm Schwenker, Hans A. Kestler, Matthias Tichy
Model transformation languages are domain-specific languages used to define transformations of models. These transformations translate one modeling formalism into another or simply update a given model. Such transformations are often described declaratively and are often implemented based on very small models that cover the language of the input model. As a result, transformation developers are often unable to assess the time required to transform a larger model.
Hence, we propose a machine-learning-based prediction approach that takes a set of model characteristics as input and predicts the execution time of a transformation defined in the Atlas Transformation Language (ATL). In our previous work (Groner et al., 2023), we showed that support vector regression, in combination with a model characterization based on the number of model elements, the number of references, and the number of attributes, is the best choice in terms of usability and prediction accuracy for the transformations considered in our experiments.
A major weakness of our previous approach is that it fails to predict the performance of transformations that also transform attribute values of arbitrary length, such as string values. Therefore, we investigate in this work whether an extension of our feature sets that describes the average size of string attributes can help to overcome this weakness.
Our results show that the random forest approach, in combination with model characterizations based on the number of model elements, the number of references, the number of attributes, and the average size of string attributes filtered by the 85th percentile of their variance, is the best choice in terms of the simplicity of describing a model and the quality of the obtained prediction. With this combination, we obtained a mean absolute percentage error (MAPE) of 5.07% over all modules and a MAPE of 4.82% over all modules excluding the transformation for which our previous approach failed. In comparison, our previous approach yielded a MAPE of 38.48% over all modules and a MAPE of 4.45% over all modules excluding the transformation for which it failed.
{"title":"Enhanced performance prediction of ATL model transformations","authors":"Raffaela Groner , Peter Bellmann , Stefan Höppner , Patrick Thiam , Friedhelm Schwenker , Hans A. Kestler , Matthias Tichy","doi":"10.1016/j.peva.2024.102413","DOIUrl":"https://doi.org/10.1016/j.peva.2024.102413","url":null,"abstract":"<div><p>Model transformation languages are domain-specific languages used to define transformations of models. These transformations consist of the translation from one modeling formalism into another or just the updating of a given model. Such transformations are often described declaratively and are often implemented based on very small models that cover the language of the input model. As a result, transformation developers are often unable to assess the time required to transform a larger model.</p><p>Hence, we propose a prediction approach based on machine learning which uses a set of model characteristics as input and provides a prediction of the execution time of a transformation defined in the Atlas Transformation Language (ATL). In our previous work (Groner et al., 2023), we already showed that support vector regression in combination with a model characterization based on the number of model elements, the number of references, and the number of attributes is the best choice in terms of usability and prediction accuracy for the transformations considered in our experiments.</p><p>A major weakness of our previous approach is that it fails to predict the performance of transformations that also transform attribute values of arbitrary length, such as string values. Therefore, we investigate in this work whether an extension of our feature sets that describes the average size of string attributes can help to overcome this weakness.</p><p>Our results show that the random forest approach in combination with model characterizations based on the number of model elements, the number of references, the number of attributes, and the average size of string attributes filtered by the 85th percentile of their variance is the best choice in terms of the simple way to describe a model and the quality of the obtained prediction. With this combination, we obtained a mean absolute percentage error (MAPE) of 5.07% over all modules and a MAPE of 4.82% over all modules excluding the transformation for which our previous approach failed. Whereas, we obtained previously a MAPE of 38.48% over all modules and a MAPE of 4.45% over all modules excluding the transformation for which our previous approach failed.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"164 ","pages":"Article 102413"},"PeriodicalIF":2.2,"publicationDate":"2024-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S016653162400018X/pdfft?md5=58a866ca1d0c949f2646b2162533ef3f&pid=1-s2.0-S016653162400018X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140555333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Age- and deviation-of-information of hybrid time- and event-triggered systems: What matters more, determinism or resource conservation?
Pub Date: 2024-03-19, DOI: 10.1016/j.peva.2024.102412
Mahsa Noroozi, Markus Fidler
Age-of-information is a metric that quantifies the freshness of information obtained by sampling a remote sensor. In signal-agnostic sampling, sensor updates are triggered at certain times without being conditioned on the actual sensor signal. Optimal update policies have been researched and it is accepted that periodic updates achieve smaller age-of-information than random updates. We contribute a study of a signal-aware policy, where updates are triggered randomly by a defined sensor event. By definition, this implies random updates and as a consequence inferior age-of-information. Considering a notion of deviation-of-information as a signal-aware metric, our results show, however, that event-triggered systems can perform equally well as time-triggered systems while causing smaller mean network utilization. We use the stochastic network calculus to derive bounds of age- and deviation-of-information that are exceeded at most with a small, defined probability. We include simulation results that confirm the tail decay of the bounds. We also evaluate a hybrid time- and event-triggered policy where the event-triggered system is complemented by a minimal and a maximal update interval.
{"title":"Age- and deviation-of-information of hybrid time- and event-triggered systems: What matters more, determinism or resource conservation?","authors":"Mahsa Noroozi, Markus Fidler","doi":"10.1016/j.peva.2024.102412","DOIUrl":"10.1016/j.peva.2024.102412","url":null,"abstract":"<div><p>Age-of-information is a metric that quantifies the freshness of information obtained by sampling a remote sensor. In signal-agnostic sampling, sensor updates are triggered at certain times without being conditioned on the actual sensor signal. Optimal update policies have been researched and it is accepted that periodic updates achieve smaller age-of-information than random updates. We contribute a study of a signal-aware policy, where updates are triggered randomly by a defined sensor event. By definition, this implies random updates and as a consequence inferior age-of-information. Considering a notion of deviation-of-information as a signal-aware metric, our results show, however, that event-triggered systems can perform equally well as time-triggered systems while causing smaller mean network utilization. We use the stochastic network calculus to derive bounds of age- and deviation-of-information that are exceeded at most with a small, defined probability. We include simulation results that confirm the tail decay of the bounds. We also evaluate a hybrid time- and event-triggered policy where the event-triggered system is complemented by a minimal and a maximal update interval.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"164 ","pages":"Article 102412"},"PeriodicalIF":2.2,"publicationDate":"2024-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0166531624000178/pdfft?md5=8cf8b229cde1a9343b74d531721bff2d&pid=1-s2.0-S0166531624000178-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140170536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stepwise migration of a monolith to a microservice architecture: Performance and migration effort evaluation
Pub Date: 2024-03-12, DOI: 10.1016/j.peva.2024.102411
Diogo Faustino, Nuno Gonçalves, Manuel Portela, António Rito Silva
Due to scalability requirements and the split of large software development projects into small agile teams, there is a current trend toward migrating monolith systems to the microservice architecture. However, splitting the monolith into microservices, encapsulating them behind well-defined interfaces, and introducing inter-microservice communication add a cost in terms of performance. In this paper, we describe a case study of the migration of a monolith to a microservice architecture, where a modular monolith architecture is used as an intermediate step. The impact on migration effort and performance is measured for both steps. The current state of the art analyses the migration of monolith systems to a microservice architecture, but we observed that migration effort and performance issues are already significant in the migration to a modular monolith. Therefore, a clear distinction is established between the two steps, which may inform software architects when planning the migration of monolith systems. In particular, we consider the trade-offs of doing the complete migration or just migrating to a modular monolith.
{"title":"Stepwise migration of a monolith to a microservice architecture: Performance and migration effort evaluation","authors":"Diogo Faustino , Nuno Gonçalves , Manuel Portela , António Rito Silva","doi":"10.1016/j.peva.2024.102411","DOIUrl":"https://doi.org/10.1016/j.peva.2024.102411","url":null,"abstract":"<div><p>Due to scalability requirements and the split of large software development projects into small agile teams, there is a current trend toward the migration of monolith systems to the microservice architecture. However, the split of the monolith into microservices, its encapsulation through well-defined interfaces, and the introduction of inter-microservice communication add a cost in terms of performance. In this paper, we describe a case study of the migration of a monolith to a microservice architecture, where a modular monolith architecture is used as an intermediate step. The impact on migration effort and performance is measured for both steps. Current state-of-the-art analyses the migration of monolith systems to a microservice architecture, but we observed that migration effort and performance issues are already significant in the migration to a modular monolith. Therefore, a clear distinction is established for each of the steps, which may inform software architects on the planning of the migration of monolith systems. In particular, we consider the trade-offs of doing all the migration process or just migrating to a modular monolith.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"164 ","pages":"Article 102411"},"PeriodicalIF":2.2,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140141848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The impact of load comparison errors on the power-of-d load balancing
Pub Date: 2024-02-28, DOI: 10.1016/j.peva.2024.102408
Sanidhay Bhambay, Arpan Mukhopadhyay, Thirupathaiah Vasantam
We consider a system with n unit-rate servers where jobs arrive according to a Poisson process with rate nλ (λ < 1). In the standard Power-of-d or Pod scheme with d ≥ 2, for each incoming job, a dispatcher samples d servers uniformly at random and sends the incoming job to the least loaded of the d sampled servers. However, in practice, load comparisons may not always be accurate. In this paper, we analyse the effects of noisy load comparisons on the performance of the Pod scheme. To test the robustness of the Pod scheme against load comparison errors, we assume an adversarial setting where, in the event of an error, the adversary assigns the incoming job to the worst possible server, i.e., the server with the maximum load among the d sampled servers. We consider two error models: load-dependent and load-independent errors. In the load-dependent error model, the adversary has limited power in that it is able to cause an error with probability ϵ ∈ [0,1] only when the difference between the minimum and the maximum queue lengths of the d sampled servers is bounded by a constant threshold g ≥ 0. For this type of error, we show that, in the large system limit, the benefits of the Pod scheme are retained even if g and ϵ are arbitrarily large, as long as the system is heavily loaded, i.e., λ is close to 1. In the load-independent error model, the adversary is assumed to be more powerful in that it can cause an error with probability ϵ independent of the loads of the sampled servers. For this model, we show that the performance benefits of the Pod scheme are retained only if ϵ ≤ 1/d; for ϵ > 1/d we show that the stability region of the system shrinks and the system performs poorly in comparison to the random scheme. Our mean-field analysis uses a new approach to characterise fixed points which neither have closed-form solutions nor admit any recursion. Furthermore, we develop a generic approach to prove tightness and stability for any state-dependent load balancing scheme.
{"title":"The impact of load comparison errors on the power-of-d load balancing","authors":"Sanidhay Bhambay , Arpan Mukhopadhyay , Thirupathaiah Vasantam","doi":"10.1016/j.peva.2024.102408","DOIUrl":"https://doi.org/10.1016/j.peva.2024.102408","url":null,"abstract":"<div><p>We consider a system with <span><math><mi>n</mi></math></span> unit-rate servers where jobs arrive according a Poisson process with rate <span><math><mrow><mi>n</mi><mi>λ</mi></mrow></math></span> (<span><math><mrow><mi>λ</mi><mo><</mo><mn>1</mn></mrow></math></span>). In the standard <em>Power-of-</em><span><math><mi>d</mi></math></span> or Pod scheme with <span><math><mrow><mi>d</mi><mo>≥</mo><mn>2</mn></mrow></math></span>, for each incoming job, a dispatcher samples <span><math><mi>d</mi></math></span> servers uniformly at random and sends the incoming job to the least loaded of the <span><math><mi>d</mi></math></span> sampled servers. However, in practice, load comparisons may not always be accurate. In this paper, we analyse the effects of noisy load comparisons on the performance of the Pod scheme. To test the robustness of the Pod scheme against load comparison errors, we assume an adversarial setting where, in the event of an error, the adversary assigns the incoming job to the worst possible server, i.e., the server with the maximum load among the <span><math><mi>d</mi></math></span> sampled servers. We consider two error models: <em>load-dependent</em> and <em>load-independent</em> errors. In the load-dependent error model, the adversary has limited power in that it is able to cause an error with probability <span><math><mrow><mi>ϵ</mi><mo>∈</mo><mrow><mo>[</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>]</mo></mrow></mrow></math></span> only when the difference in the minimum and the maximum queue lengths of the <span><math><mi>d</mi></math></span> sampled servers is bounded by a constant threshold <span><math><mrow><mi>g</mi><mo>≥</mo><mn>0</mn></mrow></math></span>. For this type of errors, we show that, in the large system limit, the benefits of the Pod scheme are retained even if <span><math><mi>g</mi></math></span> and <span><math><mi>ϵ</mi></math></span> are arbitrarily large as long as the system is heavily loaded, i.e., <span><math><mi>λ</mi></math></span> is close to 1. In the load-independent error model, the adversary is assumed to be more powerful in that it can cause an error with probability <span><math><mi>ϵ</mi></math></span> independent of the loads of the sampled servers. For this model, we show that the performance benefits of the Pod scheme are retained only if <span><math><mrow><mi>ϵ</mi><mo>≤</mo><mn>1</mn><mo>/</mo><mi>d</mi></mrow></math></span>; for <span><math><mrow><mi>ϵ</mi><mo>></mo><mn>1</mn><mo>/</mo><mi>d</mi></mrow></math></span> we show that the stability region of the system reduces and the system performs poorly in comparison to the <em>random scheme</em>. Our mean-field analysis uses a new approach to characterise fixed points which neither have closed form solutions nor admit any recursion. 
Furthermore, we develop a generic approach to prove tightness and stability for any state-dependent load balancing scheme.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"164 ","pages":"Article 102408"},"PeriodicalIF":2.2,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0166531624000130/pdfft?md5=e219034bb5ef6f93c589b57673e3885d&pid=1-s2.0-S0166531624000130-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139993482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
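The load-independent error model is straightforward to simulate for finite n as a sanity check on the mean-field predictions. A Gillespie-style sketch with invented parameters (the paper's results concern the n → ∞ limit, so finite-n numbers are only indicative):

```python
import random

def pod_with_errors(n=100, lam=0.9, d=2, eps=0.3, events=200_000, seed=7):
    """Gillespie simulation of Power-of-d under the load-independent error
    model: with probability eps the dispatcher sends an arrival to the MOST
    loaded of the d sampled servers (the adversarial choice), otherwise to
    the least loaded. Returns the time-averaged number of jobs in the system."""
    random.seed(seed)
    q = [0] * n
    jobs = 0
    t = area = 0.0
    for _ in range(events):
        busy = sum(1 for x in q if x > 0)
        rate = n * lam + busy                 # total event rate of the CTMC
        dt = random.expovariate(rate)
        area += jobs * dt
        t += dt
        if random.random() < n * lam / rate:  # arrival
            sample = random.sample(range(n), d)
            pick = max if random.random() < eps else min
            i = pick(sample, key=q.__getitem__)
            q[i] += 1
            jobs += 1
        else:                                 # departure from a busy server
            i = random.choice([j for j, x in enumerate(q) if x > 0])
            q[i] -= 1
            jobs -= 1
    return area / t

# With d = 2 the predicted threshold is eps = 1/d = 0.5:
print(pod_with_errors(eps=0.3), pod_with_errors(eps=0.7))
```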
A dependence graph pattern mining method for processor performance analysis
Pub Date: 2024-02-28, DOI: 10.1016/j.peva.2024.102409
Yawen Zheng, Chenji Han, Tingting Zhang, Fuxin Zhang, Jian Wang
As the complexity of processor microarchitectures and applications increases, obtaining performance-optimization knowledge, such as critical dependent chains, becomes more challenging. To tackle this issue, this paper employs pattern mining methods to analyze the critical path of processor micro-execution dependence graphs. We propose a high average-utility pattern mining algorithm called Dependence Graph Miner (DG-Miner) based on the characteristics of dependence graphs. DG-Miner overcomes the limitations of current pattern mining algorithms for dependence graph pattern mining by supporting variable utility, candidate generation via endpoint matching, an adjustable upper bound, and a concise pattern judgment mechanism. Experiments reveal that, compared with existing upper-bound candidate generation methods, the adjustable upper bound reduces the number of candidate patterns by 28.14% and the running time by 27% on average. The concise pattern judgment mechanism enhances the conciseness of mining results by 16.31% and reduces the running time by 39.82%. Furthermore, DG-Miner aids in identifying critical dependent chains, critical program regions, and performance exceptions.
{"title":"A dependence graph pattern mining method for processor performance analysis","authors":"Yawen Zheng , Chenji Han , Tingting Zhang , Fuxin Zhang , Jian Wang","doi":"10.1016/j.peva.2024.102409","DOIUrl":"https://doi.org/10.1016/j.peva.2024.102409","url":null,"abstract":"<div><p>As the complexity of processor microarchitecture and applications increases, obtaining performance optimization knowledge, such as critical dependent chains, becomes more challenging. To tackle this issue, this paper employs pattern mining methods to analyze the critical path of processor micro-execution dependence graphs. We propose a high average utility pattern mining algorithm called Dependence Graph Miner (DG-Miner) based on the characteristics of dependence graphs. DG-Miner overcomes the limitations of current pattern mining algorithms for dependence graph pattern mining by offering support for variable utility, candidate generation using endpoint matching, the adjustable upper bound, and the concise pattern judgment mechanism. Experiments reveal that, compared with existing upper bound candidate generation methods, the adjustable upper bound reduces the number of candidate patterns by 28.14% and the running time by 27% on average. The concise pattern judgment mechanism enhances the conciseness of mining results by 16.31% and reduces the running time by 39.82%. Furthermore, DG-Miner aids in identifying critical dependent chains, critical program regions, and performance exceptions.</p></div>","PeriodicalId":19964,"journal":{"name":"Performance Evaluation","volume":"164 ","pages":"Article 102409"},"PeriodicalIF":2.2,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140014628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}