ACM Transactions on Modeling and Performance Evaluation of Computing Systems: Latest Articles, Page 2

On the Cost of Near-Perfect Wear Leveling in Flash-Based SSDs

IF 0.6 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Modeling and Performance Evaluation of Computing Systems

Pub Date : 2023-01-01 DOI: 10.1145/3576855

B. V. Houdt

Citations: 0

Pulsed Power Load Coordination in Mission- and Time-critical Cyber-physical Systems

IF 0.6 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Modeling and Performance Evaluation of Computing Systems

Pub Date : 2022-12-01 DOI: 10.1145/3573197

Tianming Zhao, Wei Li, Bo Qin, Ling Wang, Albert Y. Zomaya

Many mission- and time-critical cyber-physical systems deploy an isolated power system for their power supply. Under extreme conditions, the power system must process critical missions by maximizing the Pulsed Power Load (PPL) utility while maintaining the normal loads in the cyber-physical system. Optimal operation requires careful coordination of PPL deployment and power supply processes. In this work, we formulate the coordination problem for maximizing PPL utility under available resources, capacity, and demand constraints. The coordination problem has two scenarios for different use cases, fixed and general normal loads. We develop an exact pseudo-polynomial time dynamic programming algorithm for each scenario with a proven guarantee to produce an optimal coordination schedule. The performance of the algorithms is also experimentally evaluated, and the results agree with our theoretical analysis, showing the practicality of the solutions.

Citations: 0

On the Analysis and Evaluation of Proximity-based Load-balancing Policies

IF 0.6 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Modeling and Performance Evaluation of Computing Systems

Pub Date : 2022-07-21 DOI: 10.1145/3549933

Nitish K. Panigrahy, Thirupathaiah Vasantam, P. Basu, D. Towsley, A. Swami, K. Leung

Distributed load balancing is the act of allocating jobs among a set of servers as evenly as possible. The static interpretation of distributed load balancing leads to formulating the load-balancing problem as a classical balls-and-bins problem with jobs (balls) never leaving the system and accumulating at the servers (bins). While most of the previous work in the static setting focuses on studying the maximum number of jobs allocated to a server, or maximum load, little importance has been given to the implementation cost of such policies, that is, the cost of moving a job or its data to and from its allocated server. This article designs and evaluates server proximity-aware static load-balancing policies with the goal of reducing the implementation cost. We consider a class of proximity-aware Power of Two (POT) choice-based assignment policies for allocating jobs to servers, where both jobs and servers are located on a two-dimensional Euclidean plane. In this framework, we investigate the tradeoff between the implementation cost and load-balancing performance of different allocation policies. To this end, we first design and evaluate a Spatial Power of two (sPOT) policy in which each job is allocated to the least loaded server among its two geographically nearest servers. We provide expressions for the lower bound on the asymptotic expected maximum load on the servers and prove that sPOT does not achieve classical POT load-balancing benefits. However, experimental results suggest the efficacy of sPOT with respect to expected implementation cost. We also propose two non-uniform server sampling-based POT policies that achieve the best of both implementation cost and load-balancing performance. We then extend our analysis to the case where servers are interconnected as an n-vertex graph G(S, E). We assume each job arrives at one of the servers, u, chosen uniformly at random from the vertex set S. We then assign each job to the server with minimum load among servers u and v, where v is chosen according to one of the following two policies: (i) Unif-POT(k): sample a server v uniformly at random from the k-hop neighborhood of u; (ii) InvSq-POT(k): sample a server v from the k-hop neighborhood of u with probability proportional to the inverse square of the distance between u and v. An extensive simulation over a wide range of topologies validates the efficacy of both policies. Our simulation results show that both policies consistently produce a load distribution that is very similar to that of a classical POT. Depending on topology, we observe the total variation distance to be of the order of 0.002–0.08 for both policies while achieving an 8%–99% decrease in implementation cost as compared to the classical POT.
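The two graph policies named in the abstract can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' code: the adjacency-list representation, the use of hop count as the "distance between u and v", the tie-breaking rule in favor of u, and the function names `k_hop_neighbors` and `assign_job` are all assumptions made here.

```python
import random
from collections import deque


def k_hop_neighbors(adj, u, k):
    """Return {v: hop distance} for every v != u within k hops of u (BFS)."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == k:
            continue  # do not expand beyond k hops
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    del dist[u]
    return dist


def assign_job(adj, load, k, policy, rng):
    """Place one job; return (chosen server, hops the job travelled)."""
    u = rng.choice(sorted(adj))  # arrival server, uniform over the vertex set
    hops = k_hop_neighbors(adj, u, k)
    if not hops:  # isolated server: nothing to compare against
        load[u] += 1
        return u, 0
    candidates = sorted(hops)
    if policy == "unif":  # Unif-POT(k)
        v = rng.choice(candidates)
    elif policy == "invsq":  # InvSq-POT(k), hop count used as distance (assumption)
        weights = [1.0 / hops[c] ** 2 for c in candidates]
        v = rng.choices(candidates, weights=weights, k=1)[0]
    else:
        raise ValueError(f"unknown policy: {policy}")
    target = u if load[u] <= load[v] else v  # ties go to u (assumption)
    load[target] += 1
    return target, (0 if target == u else hops[target])
```

Calling `assign_job` repeatedly with a shared `load` dictionary and summing the returned hop counts gives a crude stand-in for the implementation cost discussed in the abstract.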

Citations: 1

Focused Layered Performance Modelling by Aggregation

IF 0.6 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Modeling and Performance Evaluation of Computing Systems

Pub Date : 2022-07-20 DOI: 10.1145/3549539

Farhana Islam, D. Petriu, M. Woodside

Performance models of server systems, based on layered queues, may be very complex. This is particularly true for cloud-based systems based on microservices, which may have hundreds of distinct components, and for models derived by automated data analysis. Often only a few of these many components determine the system performance, and a smaller simplified model is all that is needed. To assist an analyst, this work describes a focused model that includes the important components (the focus) and aggregates the rest in groups, called dependency groups. The method Focus-based Simplification with Preservation of Tasks described here fills an important gap in a previous method by the same authors. The use of focused models for sensitivity predictions is evaluated empirically in the article on a large set of randomly generated models. It is found that the accuracy depends on a “saturation ratio” (SR) between the highest utilization value in the model and the highest value of a component excluded from the focus; evidence suggests that SR must be at least 2 and must be larger to evaluate larger model changes. This dependency was captured in an “Accurate Sensitivity Hypothesis” based on SR, which can be used to indicate trustable sensitivity results.
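As a small illustration of the saturation ratio described above, the sketch below computes SR from a table of component utilizations and a chosen focus set. The dictionary layout, the component names, and the function name `saturation_ratio` are hypothetical; only the definition of SR and the "at least 2" guideline come from the abstract.

```python
def saturation_ratio(utilization, focus):
    """SR = highest utilization in the model / highest utilization outside the focus."""
    excluded = [u for name, u in utilization.items() if name not in focus]
    if not excluded:
        raise ValueError("every component is in the focus; SR is undefined")
    highest_excluded = max(excluded)
    if highest_excluded <= 0:
        return float("inf")  # excluded components are idle
    return max(utilization.values()) / highest_excluded


# Hypothetical utilizations for four components.
utilization = {"web": 0.8, "db": 0.5, "cache": 0.2, "auth": 0.1}
sr = saturation_ratio(utilization, focus={"web", "db"})
trustable = sr >= 2  # guideline from the abstract: SR must be at least 2
```

Per the abstract, a larger SR is needed when larger model changes are evaluated, so the fixed threshold of 2 here is only the minimum.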

Citations: 0

Big Winners and Small Losers of Zero-rating

IF 0.6 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Modeling and Performance Evaluation of Computing Systems

Pub Date : 2022-05-30 DOI: 10.1145/3539731

Niloofar Bayat, Richard T. B. Ma, V. Misra, D. Rubenstein

An objective of network neutrality is to design regulations for the Internet and ensure that it remains a public, open platform where innovations can thrive. While there is broad agreement that preserving the content quality of service falls under the purview of net neutrality, the role of differential pricing, especially the practice of zero-rating, remains controversial. Zero-rating refers to the practice of providing free Internet access to some users under certain conditions, which usually concurs with differentiation among users or content providers. Even though some countries (India, Canada) have banned zero-rating, others have either taken no stance or explicitly allowed it (South Africa, Kenya, U.S.). In this article, we model zero-rating between Internet service providers and content providers (CPs) to better understand the conditions under which offering zero-rating is preferred, and who gains in utility. We develop a formulation in which providers’ incomes vary, from low-income startups to high-income incumbents, where their decisions to zero-rate are a variation of the traditional prisoner’s dilemma game. We find that if zero-rating is permitted, low-income CPs often lose utility, whereas high-income CPs often gain utility. We also study the competitiveness of the CP markets via the Herfindahl Index. Our findings suggest that in most cases the introduction of zero-rating reduces competitiveness.
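For readers unfamiliar with the Herfindahl Index mentioned above, it is the sum of squared market shares, so it ranges from 1/n for n equal players up to 1 for a monopoly. The sketch below uses hypothetical revenue figures; how the article derives market shares from its utility model is not stated in the abstract.

```python
def herfindahl_index(revenues):
    """Sum of squared market shares; higher values mean a more concentrated market."""
    total = sum(revenues)
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return sum((r / total) ** 2 for r in revenues)


# Hypothetical content-provider revenues before and after a market shift.
even_market = herfindahl_index([25, 25, 25, 25])
skewed_market = herfindahl_index([70, 10, 10, 10])
```

A rise in the index, as in the skewed example, corresponds to the reduced competitiveness the abstract reports.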

Citations: 0

PMU-Events-Driven DVFS Techniques for Improving Energy Efficiency of Modern Processors

IF 0.6 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Modeling and Performance Evaluation of Computing Systems

Pub Date : 2022-05-25 DOI: 10.1145/3538645

Ranjan Hebbar, A. Milenković

This paper describes the results of our measurement-based study, conducted on an Intel Core i7 processor running the SPEC CPU2017 benchmark suites, that evaluates the impact of dynamic voltage frequency scaling (DVFS) on performance (P), energy efficiency (EE), and their product (PxEE). The results indicate that the default DVFS-based power management techniques heavily favor performance, resulting in poor energy efficiency. To remedy this problem, we introduce, implement, and evaluate four DVFS-based power management techniques driven by the following metrics derived from the processor's performance monitoring unit: (i) the total pipeline slot stall ratio (FS-PS), (ii) the total cycle stall ratio (FS-TS), (iii) the total memory-related cycle stall ratio (FS-MS), and (iv) the number of last level cache misses per kilo instructions (FS-LLCM). The proposed techniques linearly map these metrics onto the available processor clock frequencies. The experimental evaluation results show that the proposed techniques significantly improve EE and PxEE metrics compared to the existing approaches. Specifically, EE improves from 44% to 92%, and PxEE improves from 31% to 48% when all the benchmarks are considered together. Furthermore, we find that the proposed techniques are particularly effective for a class of memory-intensive benchmarks – they improve EE from 121% to 183% and PxEE from 100% to 141%. Finally, we elucidate the advantages and disadvantages of each of the proposed techniques and offer recommendations on using them.
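The linear mapping from a PMU-derived metric onto the available clock frequencies might look something like the sketch below. The abstract does not give the mapping's direction or bounds, so this assumes that a higher stall ratio selects a lower frequency and that the metric is normalized between `metric_min` and `metric_max`; the frequency list and the function name `pick_frequency` are hypothetical.

```python
def pick_frequency(metric, freqs_mhz, metric_min=0.0, metric_max=1.0):
    """Linearly map a stall-style metric onto one of the available frequencies."""
    if metric_max <= metric_min:
        raise ValueError("metric_max must exceed metric_min")
    freqs = sorted(freqs_mhz)
    # Normalize to [0, 1] and clamp out-of-range samples.
    x = (metric - metric_min) / (metric_max - metric_min)
    x = min(1.0, max(0.0, x))
    # Assumed direction: more stalls -> lower frequency.
    index = round((1.0 - x) * (len(freqs) - 1))
    return freqs[index]


# Hypothetical set of available frequencies in MHz.
available = [800, 1600, 2400, 3200]
busy_core = pick_frequency(0.05, available)     # few stalls
stalled_core = pick_frequency(0.90, available)  # mostly stalled
```

For a metric such as last-level cache misses per kilo instructions, `metric_max` would have to be chosen empirically, since that metric has no natural upper bound.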

Citations: 2

PathTracer: Understanding Response Time of Signal Processing Applications on Heterogeneous MPSoCs

IF 0.6 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Modeling and Performance Evaluation of Computing Systems

Pub Date : 2022-02-14 DOI: 10.1145/3513003

Claudio Rubattu, F. Palumbo, S. Bhattacharyya, M. Pelcat

In embedded and cyber-physical systems, the design of a desired functionality under constraints increasingly requires parallel execution of a set of tasks on a heterogeneous architecture. The nature of such parallel systems complicates the process of understanding and predicting performance in terms of response time. Indeed, response time depends on many factors related to both the functionality and the target architecture. State-of-the-art strategies derive response time by examining the operations required by each task for both processing and accessing shared resources. This procedure is often followed by the addition or elimination of potential interference due to task concurrency. However, such approaches require advanced knowledge of the software and hardware details, which is rarely available in practice. This work presents an alternative “top-down” strategy, called PathTracer, aimed at understanding software response time and extending the cases in which it can be analyzed and estimated. PathTracer leverages dataflow-based application representation and response time estimation of signal processing applications mapped on heterogeneous Multiprocessor Systems-on-a-Chip (MPSoCs). Experimental results demonstrate that PathTracer provides (i) information on the nature of the application (work-dominated, span-dominated, or balanced parallel), and (ii) response time modeling that can reach high accuracy when performed post-execution, leading to prediction errors with mean and standard deviation under 5% and 3%, respectively.
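The work-dominated, span-dominated, and balanced categories can be illustrated with the standard work/span view of a task graph: work is the total execution time of all tasks, and span is the longest dependency chain. The sketch below is not PathTracer's procedure, which the abstract does not detail; the classification rule, the `tolerance` parameter, and the function names are assumptions, and the task graph is assumed to be acyclic.

```python
def work_and_span(durations, deps):
    """durations: {task: time}; deps: {task: [predecessors]}. Returns (work, span)."""
    finish = {}

    def longest(task):
        # Length of the longest chain ending at `task`, memoized in `finish`.
        if task not in finish:
            before = max((longest(p) for p in deps.get(task, [])), default=0.0)
            finish[task] = durations[task] + before
        return finish[task]

    span = max(longest(t) for t in durations)
    return sum(durations.values()), span


def classify(work, span, cores, tolerance=0.1):
    """Compare the per-core share of work with the span (assumed rule)."""
    per_core = work / cores
    if per_core > span * (1 + tolerance):
        return "work-dominated"
    if span > per_core * (1 + tolerance):
        return "span-dominated"
    return "balanced"


# Hypothetical four-task graph: a feeds b and c, which both feed d.
durations = {"a": 2, "b": 3, "c": 4, "d": 1}
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
work, span = work_and_span(durations, deps)
```

On an idealized machine with p identical cores, response time is bounded below by max(work / p, span), which is why comparing the two terms indicates which one limits performance.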

Citations: 0

MDP-based Network Friendly Recommendations

IF 0.6 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Modeling and Performance Evaluation of Computing Systems

Pub Date : 2022-02-11 DOI: 10.1145/3513131

Theodoros Giannakas, A. Giovanidis, T. Spyropoulos

Controlling the network cost by delivering popular content to users, as well as improving streaming quality and overall user experience, have been key goals for content providers (CP) in recent years. While proposals to improve performance, through caching or other mechanisms (DASH, multicasting, etc.) abound, recent works have proposed to turn the problem on its head and complement such efforts. Instead of trying to reduce the cost to deliver every possible content to a user, a potentially very expensive endeavour, one could leverage omnipresent recommendations systems to nudge users towards the content of low(er) network cost, regardless of where this cost is coming from. In this paper, we focus on this latter problem, namely optimal policies for “Network Friendly Recommendations” (NFR). A key contribution is the use of a Markov Decision Process (MDP) framework that offers significant advantages, compared to existing works, in terms of both modeling flexibility as well as computational efficiency. Specifically we show that this framework subsumes some state-of-the-art approaches, and can also optimally tackle additional, more sophisticated setups. We validate our findings with real traces that suggest up to almost 2X in cost performance, and 10X in computational speed-up compared to recent state-of-the-art works.

Citations: 1

Modeling Communication over Terrain for Realistic Simulation of Outdoor Sensor Network Deployments

IF 0.6 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Modeling and Performance Evaluation of Computing Systems

Pub Date : 2021-12-31 DOI: 10.1145/3510306

Sam Mansfield, K. Veenstra, K. Obraczka

Popular wireless network simulators offer few propagation models for outdoor Internet of Things (IoT) applications. Of the available models, only a handful use real terrain data, yet an inaccurate propagation model can skew simulation results. In this article, we present TerrainLOS, a low-overhead propagation model for outdoor IoT applications that uses real terrain data to determine whether two nodes can communicate. To the best of our knowledge, TerrainLOS is the first terrain-aware propagation model that specifically targets outdoor IoT deployments and uses height maps to represent terrain. In addition, we present a new terrain classification method based on terrain “roughness,” which allows us to select a variety of terrain samples to demonstrate how TerrainLOS captures the effects of terrain on communication. We also propose a technique to generate synthetic terrain samples based on “roughness.” Furthermore, we implemented TerrainLOS in the COOJA-Contiki network simulation/emulation platform, which targets IoT deployments, and used it to evaluate how often a network is fully connected as a function of terrain roughness, as well as how two popular power-aware routing protocols, RPL and ORPL, perform when terrain is considered.
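The idea of using a height map to decide whether two nodes can communicate can be illustrated with a small sketch. This is not the TerrainLOS implementation: the grid layout, the `antenna_height` parameter, the straight-line sampling, and both function names are assumptions made for illustration, and radio range is ignored entirely. The second function shows one way a "fully connected" check could be built on top of such a visibility test.

```python
from collections import deque


def has_line_of_sight(height_map, a, b, antenna_height=1.0):
    """Return True if no terrain cell rises above the straight sight line a -> b.

    height_map is a list of rows of elevations; a and b are (row, col) cells.
    Hypothetical sketch: samples one cell per grid step along the line.
    """
    (r0, c0), (r1, c1) = a, b
    z0 = height_map[r0][c0] + antenna_height
    z1 = height_map[r1][c1] + antenna_height
    steps = max(abs(r1 - r0), abs(c1 - c0))
    for i in range(1, steps):  # intermediate cells only
        t = i / steps
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        sight_z = z0 + t * (z1 - z0)  # elevation of the sight line at this cell
        if height_map[r][c] > sight_z:
            return False
    return True


def is_fully_connected(height_map, nodes, antenna_height=1.0):
    """Breadth-first search over the visibility graph induced by line of sight."""
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        current = queue.popleft()
        for other in nodes:
            if other not in seen and has_line_of_sight(
                height_map, current, other, antenna_height
            ):
                seen.add(other)
                queue.append(other)
    return len(seen) == len(nodes)
```

On a flat strip every pair of nodes sees each other, while a single tall cell between two groups of nodes splits the network unless the antennas are raised above it.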



Citations: 1

Adversarial Deep Learning for Online Resource Allocation

IF 0.6 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Modeling and Performance Evaluation of Computing Systems

Pub Date : 2021-11-19 DOI: 10.1145/3494526

Bingqian Du, Zhiyi Huang, Chuan Wu

Online algorithms are an important branch of algorithm design. Designing online algorithms with a bounded competitive ratio (in terms of worst-case performance) can be hard and usually relies on problem-specific assumptions. Inspired by adversarial training in Generative Adversarial Networks and by the fact that the competitive ratio of an online algorithm is defined on worst-case input, we adopt deep neural networks (NNs) to learn an online algorithm for a resource allocation and pricing problem from scratch, with the goal of minimizing the performance gap between the offline optimum and the learned online algorithm on worst-case input. Specifically, we use two NNs as the algorithm and the adversary, respectively, and let them play a zero-sum game: the adversary generates worst-case input, while the algorithm learns the best strategy for the input the adversary provides. To ensure better convergence of the algorithm network (to the desired online algorithm), we propose a novel per-round update method for sequential decision making that breaks the complex dependency among rounds, so that updates can be made for every possible action instead of only the sampled actions. To the best of our knowledge, our work is the first to use deep NNs to design an online algorithm from the perspective of worst-case performance guarantees. Empirical studies show that our updating methods ensure convergence to a Nash equilibrium and that the learned algorithm outperforms state-of-the-art online algorithms under various settings.
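The algorithm-versus-adversary view of the competitive ratio can be made concrete on a toy problem. The sketch below does not use neural networks and is not the resource allocation and pricing problem from the article; it assumes the classic ski-rental setting (rent for 1 per day or buy once for `buy_price`) purely for illustration, and all function names are hypothetical. The adversary picks the season length that maximizes the ratio of online to offline cost, and the algorithm picks the buying day that minimizes that worst case.

```python
from fractions import Fraction


def online_cost(threshold, season_length, buy_price):
    """Rent for 1 per day, then buy on day `threshold` if the season lasts that long."""
    if season_length < threshold:
        return season_length
    return (threshold - 1) + buy_price


def offline_cost(season_length, buy_price):
    """Optimal cost when the season length is known in advance."""
    return min(season_length, buy_price)


def worst_case_ratio(threshold, buy_price, max_season):
    """Adversary's move: the season length that maximizes online/offline cost."""
    return max(
        Fraction(online_cost(threshold, n, buy_price), offline_cost(n, buy_price))
        for n in range(1, max_season + 1)
    )


def best_threshold(buy_price, max_season):
    """Algorithm's move: the threshold with the smallest worst-case ratio."""
    candidates = (
        (k, worst_case_ratio(k, buy_price, max_season))
        for k in range(1, max_season + 1)
    )
    return min(candidates, key=lambda pair: pair[1])
```

With `buy_price=10` and `max_season=30`, the exhaustive search returns threshold 10 with a worst-case ratio of 19/10, the familiar 2 - 1/B bound for deterministic ski rental. Exhaustive enumeration like this is feasible only for tiny strategy spaces; learned algorithm and adversary networks are one way to approach the same game when the spaces are too large to enumerate.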



Citations: 4