Yubin Xia, Dong Du, Zhichao Hua, B. Zang, Haibo Chen, Haibing Guan
IPC (inter-process communication) is a critical mechanism for modern OSes, including not only microkernels such as seL4, QNX, and Fuchsia, where system functionalities are deployed in user-level processes, but also monolithic kernels like Android, where apps frequently communicate with many user-level services. However, existing IPC mechanisms still suffer from long latency. Previous software optimizations of IPC usually cannot bypass the kernel, which is responsible for domain switching and message copying/remapping across different address spaces; hardware solutions such as tagged memory or capabilities replace page tables for isolation, but usually require non-trivial modifications to the existing software stack to adapt to the new hardware primitives. In this article, we propose a hardware-assisted OS primitive, XPC (Cross Process Call), for efficient and secure synchronous IPC. XPC enables a direct switch between IPC caller and callee without trapping into the kernel and supports secure message passing across multiple processes without copying. We have implemented a prototype of XPC on the ARM AArch64 architecture with the Gem5 simulator and on the RISC-V architecture with FPGA boards. The evaluation shows that XPC reduces IPC call latency from 664 to 21 cycles, yields a 14×–123× improvement over Android Binder (ARM), and improves the performance of real-world applications on microkernels, e.g., by 1.6× for SQLite3.
{"title":"Boosting Inter-process Communication with Architectural Support","authors":"Yubin Xia, Dong Du, Zhichao Hua, B. Zang, Haibo Chen, Haibing Guan","doi":"10.1145/3532861","DOIUrl":"https://doi.org/10.1145/3532861","url":null,"abstract":"IPC (inter-process communication) is a critical mechanism for modern OSes, including not only microkernels such as seL4, QNX, and Fuchsia where system functionalities are deployed in user-level processes, but also monolithic kernels like Android where apps frequently communicate with plenty of user-level services. However, existing IPC mechanisms still suffer from long latency. Previous software optimizations of IPC usually cannot bypass the kernel that is responsible for domain switching and message copying/remapping across different address spaces; hardware solutions such as tagged memory or capability replace page tables for isolation, but usually require non-trivial modification to existing software stack to adapt to the new hardware primitives. In this article, we propose a hardware-assisted OS primitive, XPC (Cross Process Call), for efficient and secure synchronous IPC. XPC enables direct switch between IPC caller and callee without trapping into the kernel and supports secure message passing across multiple processes without copying. We have implemented a prototype of XPC based on the ARM AArch64 with Gem5 simulator and RISC-V architecture with FPGA boards. The evaluation shows that XPC can reduce IPC call latency from 664 to 21 cycles, 14×–123× improvement on Android Binder (ARM), and improve the performance of real-world applications on microkernels by 1.6× on Sqlite3.","PeriodicalId":318554,"journal":{"name":"ACM Transactions on Computer Systems (TOCS)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128802265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tong Xing, A. Barbalace, Pierre Olivier, M. L. Karaoui, Wen Wang, B. Ravindran
Edge computing is a recent computing paradigm that brings cloud services closer to the client. Among other features, edge computing offers extremely low client/server latencies. To consistently provide such low latencies, services should run on edge nodes that are physically as close as possible to their clients. Thus, when the physical location of a client changes, a service should migrate between edge nodes to maintain proximity. Differently from cloud nodes, edge nodes integrate CPUs of different Instruction Set Architectures (ISAs); hence, a program natively compiled for a given ISA cannot migrate to a server equipped with a CPU of a different ISA. This hinders migration to the closest node. We introduce H-Container, a system that migrates natively compiled containerized applications across compute nodes featuring CPUs of different ISAs. H-Container advances over existing heterogeneous-ISA migration systems by being (a) highly compatible – neither the user’s source code nor the compiler toolchain needs to be modified; (b) easily deployable – fully implemented in user space, thus without any OS or hypervisor dependency; and (c) largely Linux-compliant – it can migrate most Linux software, including server applications and dynamically linked binaries. H-Container targets Linux and its already-compiled executables, adopts LLVM, extends CRIU, and integrates with Docker. Experiments demonstrate that H-Container adds no overhead during program execution, while migration adds 10–100 ms. Furthermore, we show the benefits of H-Container in real-world scenarios, demonstrating, for example, up to a 94% increase in Redis throughput when client/server proximity is maintained through heterogeneous container migration.
{"title":"H-Container: Enabling Heterogeneous-ISA Container Migration in Edge Computing","authors":"Tong Xing, A. Barbalace, Pierre Olivier, M. L. Karaoui, Wen Wang, B. Ravindran","doi":"10.1145/3524452","DOIUrl":"https://doi.org/10.1145/3524452","url":null,"abstract":"Edge computing is a recent computing paradigm that brings cloud services closer to the client. Among other features, edge computing offers extremely low client/server latencies. To consistently provide such low latencies, services should run on edge nodes that are physically as close as possible to their clients. Thus, when the physical location of a client changes, a service should migrate between edge nodes to maintain proximity. Differently from cloud nodes, edge nodes integrate CPUs of different Instruction Set Architectures (ISAs), hence a program natively compiled for a given ISA cannot migrate to a server equipped with a CPU of a different ISA. This hinders migration to the closest node. We introduce H-Container, a system that migrates natively compiled containerized applications across compute nodes featuring CPUs of different ISAs. H-Container advances over existing heterogeneous-ISA migration systems by being (a) highly compatible – no user’s source-code nor compiler toolchain modifications are needed; (b) easily deployable – fully implemented in user space, thus without any OS or hypervisor dependency, and (c) largely Linux-compliant – it can migrate most Linux software, including server applications and dynamically linked binaries. H-Container targets Linux and its already-compiled executables, adopts LLVM, extends CRIU, and integrates with Docker. Experiments demonstrate that H-Container adds no overheads during program execution, while 10–100 ms are added during migration. Furthermore, we show the benefits of H-Container in real-world scenarios, demonstrating, for example, up to 94% increase in Redis throughput when client/server proximity is maintained through heterogeneous container migration.","PeriodicalId":318554,"journal":{"name":"ACM Transactions on Computer Systems (TOCS)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128995993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marcel Blöcher, Emilio Coppa, Pascal Kleber, P. Eugster, W. Culhane, Masoud Saeida Ardekani
Aggregation is common in data analytics and crucial to distilling information from large datasets, but current data analytics frameworks do not fully exploit the potential for optimization in such phases. The lack of optimization is particularly notable in current “online” approaches that store data in main memory across nodes, shifting the bottleneck away from disk I/O toward network and compute resources, thus increasing the relative performance impact of distributed aggregation phases. We present ROME, an aggregation system for use within data analytics frameworks or in isolation. ROME uses a set of novel heuristics based primarily on basic knowledge of aggregation functions combined with deployment constraints to efficiently aggregate results from computations performed on individual data subsets across nodes (e.g., merging sorted lists resulting from top-k). The user can either provide minimal information that allows our heuristics to be applied directly, or ROME can autodetect the relevant information at little cost. We integrated ROME as a subsystem into the Spark and Flink data analytics frameworks. We use real-world data to experimentally demonstrate speedups up to 3× over single-level aggregation overlays, up to 21% over other multi-level overlays, and 50% for iterative algorithms like gradient descent at 100 iterations.
{"title":"ROME: All Overlays Lead to Aggregation, but Some Are Faster than Others","authors":"Marcel Blöcher, Emilio Coppa, Pascal Kleber, P. Eugster, W. Culhane, Masoud Saeida Ardekani","doi":"10.1145/3516430","DOIUrl":"https://doi.org/10.1145/3516430","url":null,"abstract":"Aggregation is common in data analytics and crucial to distilling information from large datasets, but current data analytics frameworks do not fully exploit the potential for optimization in such phases. The lack of optimization is particularly notable in current “online” approaches that store data in main memory across nodes, shifting the bottleneck away from disk I/O toward network and compute resources, thus increasing the relative performance impact of distributed aggregation phases. We present ROME, an aggregation system for use within data analytics frameworks or in isolation. ROME uses a set of novel heuristics based primarily on basic knowledge of aggregation functions combined with deployment constraints to efficiently aggregate results from computations performed on individual data subsets across nodes (e.g., merging sorted lists resulting from top-k). The user can either provide minimal information that allows our heuristics to be applied directly, or ROME can autodetect the relevant information at little cost. We integrated ROME as a subsystem into the Spark and Flink data analytics frameworks. We use real-world data to experimentally demonstrate speedups up to 3× over single-level aggregation overlays, up to 21% over other multi-level overlays, and 50% for iterative algorithms like gradient descent at 100 iterations.","PeriodicalId":318554,"journal":{"name":"ACM Transactions on Computer Systems (TOCS)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115737223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Behzad Boroujerdian, Hasan Genç, Srivatsan Krishnan, B. P. Duisterhof, Brian Plancher, Kayvan Mansoorshahi, M. Almeida, Wenzhi Cui, Aleksandra Faust, V. Reddi
Autonomous and mobile cyber-physical machines are becoming an inevitable part of our future. In particular, Micro Aerial Vehicles (MAVs) have seen a resurgence in activity. With multiple use cases, such as surveillance, search and rescue, package delivery, and more, these unmanned aerial systems are on the cusp of demonstrating their full potential. Despite such promises, these systems face many challenges, one of the most prominent of which is their low endurance caused by their limited onboard energy. Since the success of a mission depends on whether the drone can finish it before running out of battery, improving both the time and the energy associated with the mission is of high importance. Such improvements have traditionally been arrived at through the use of better algorithms. But our premise is that more powerful and efficient onboard compute can also address the problem. In this article, we investigate how the compute subsystem, in a cyber-physical mobile machine such as a Micro Aerial Vehicle, can impact mission time (time to complete a mission) and energy. Specifically, we pose the question: what is the role of computing for cyber-physical mobile robots? We show that compute and motion are tightly intertwined, and as such a close examination of cyber and physical processes and their impact on one another is necessary. We show different “impact paths” through which compute impacts mission metrics and examine them using a combination of analytical models, simulation, and micro and end-to-end benchmarking. To enable similar studies, we open-sourced MAVBench, our toolset, which consists of (1) a closed-loop real-time feedback simulator and (2) an end-to-end benchmark suite composed of state-of-the-art kernels. By combining MAVBench, analytical modeling, and an understanding of various compute impacts, we show up to 2× and 1.8× improvements in mission time and mission energy, respectively, for two optimization case studies. Our investigations, as well as our optimizations, show that cyber-physical co-design, a methodology in which the cyber and physical processes/quantities of the robot are developed with consideration of one another (similar to hardware-software co-design), is necessary for arriving at the design of the optimal robot.
{"title":"The Role of Compute in Autonomous Micro Aerial Vehicles: Optimizing for Mission Time and Energy Efficiency","authors":"Behzad Boroujerdian, Hasan Genç, Srivatsan Krishnan, B. P. Duisterhof, Brian Plancher, Kayvan Mansoorshahi, M. Almeida, Wenzhi Cui, Aleksandra Faust, V. Reddi","doi":"10.1145/3511210","DOIUrl":"https://doi.org/10.1145/3511210","url":null,"abstract":"Autonomous and mobile cyber-physical machines are becoming an inevitable part of our future. In particular, Micro Aerial Vehicles (MAVs) have seen a resurgence in activity. With multiple use cases, such as surveillance, search and rescue, package delivery, and more, these unmanned aerial systems are on the cusp of demonstrating their full potential. Despite such promises, these systems face many challenges, one of the most prominent of which is their low endurance caused by their limited onboard energy. Since the success of a mission depends on whether the drone can finish it within such duration and before it runs out of battery, improving both the time and energy associated with the mission are of high importance. Such improvements have traditionally been arrived at through the use of better algorithms. But our premise is that more powerful and efficient onboard compute can also address the problem. In this article, we investigate how the compute subsystem, in a cyber-physical mobile machine such as a Micro Aerial Vehicle, can impact mission time (time to complete a mission) and energy. Specifically, we pose the question as what is the role of computing for cyber-physical mobile robots? We show that compute and motion are tightly intertwined, and as such a close examination of cyber and physical processes and their impact on one another is necessary. We show different “impact paths” through which compute impacts mission metrics and examine them using a combination of analytical models, simulation, and micro and end-to-end benchmarking. To enable similar studies, we open sourced MAVBench, our tool-set, which consists of (1) a closed-loop real-time feedback simulator and (2) an end-to-end benchmark suite composed of state-of-the-art kernels. By combining MAVBench, analytical modeling, and an understanding of various compute impacts, we show up to 2X and 1.8X improvements for mission time and mission energy for two optimization case studies, respectively. Our investigations, as well as our optimizations, show that cyber-physical co-design, a methodology with which both the cyber and physical processes/quantities of the robot are developed with consideration of one another, similar to hardware-software co-design, is necessary for arriving at the design of the optimal robot.","PeriodicalId":318554,"journal":{"name":"ACM Transactions on Computer Systems (TOCS)","volume":"5 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125614718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robert Lyerly, Carlos Bilbao, Changwoo Min, C. Rossbach, B. Ravindran
In this work, we present libHetMP, an OpenMP runtime for automatically and transparently distributing parallel computation across heterogeneous nodes. libHetMP targets platforms comprising CPUs with different instruction set architectures (ISAs) coupled by a high-speed memory interconnect, where cross-ISA binary incompatibility and non-coherent caches require application data to be marshaled in order to be shared across CPUs. Because of this, work distribution decisions must take into account both the relative compute performance of asymmetric CPUs and communication overheads. libHetMP drives workload distribution decisions without programmer intervention by measuring performance characteristics during cross-node execution. A novel HetProbe loop iteration scheduler decides whether cross-node execution is beneficial and either distributes work according to the relative performance of CPUs when it is, or places all work on the set of homogeneous CPUs providing the best performance when it is not. We evaluate libHetMP using compute kernels from several OpenMP benchmark suites and show a geometric mean 41% speedup in execution time across asymmetric CPUs. Because some workloads may exhibit irregular behavior across iterations, we extend libHetMP with a second scheduler called HetProbe-I. The evaluation of HetProbe-I shows that it can further improve speedup for irregular computation, in some cases by up to 24%, by triggering periodic distribution decisions.
{"title":"An OpenMP Runtime for Transparent Work Sharing across Cache-Incoherent Heterogeneous Nodes","authors":"Robert Lyerly, Carlos Bilbao, Changwoo Min, C. Rossbach, B. Ravindran","doi":"10.1145/3505224","DOIUrl":"https://doi.org/10.1145/3505224","url":null,"abstract":"In this work, we present libHetMP, an OpenMP runtime for automatically and transparently distributing parallel computation across heterogeneous nodes. libHetMP targets platforms comprising CPUs with different instruction set architectures (ISA) coupled by a high-speed memory interconnect, where cross-ISA binary incompatibility and non-coherent caches require application data be marshaled to be shared across CPUs. Because of this, work distribution decisions must take into account both relative compute performance of asymmetric CPUs and communication overheads. libHetMP drives workload distribution decisions without programmer intervention by measuring performance characteristics during cross-node execution. A novel HetProbe loop iteration scheduler decides if cross-node execution is beneficial and either distributes work according to the relative performance of CPUs when it is or places all work on the set of homogeneous CPUs providing the best performance when it is not. We evaluate libHetMP using compute kernels from several OpenMP benchmark suites and show a geometric mean 41% speedup in execution time across asymmetric CPUs. Because some workloads may showcase irregular behavior among iterations, we extend libHetMP with a second scheduler called HetProbe-I. The evaluation of HetProbe-I shows it can further improve speedup for irregular computation, in some cases up to a 24%, by triggering periodic distribution decisions.","PeriodicalId":318554,"journal":{"name":"ACM Transactions on Computer Systems (TOCS)","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124623553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lei Chen, Jiacheng Zhao, Chenxi Wang, Ting Cao, J. Zigman, Haris Volos, O. Mutlu, Fang Lv, Xiaobing Feng, G. Xu, Huimin Cui
To process real-world datasets, modern data-parallel systems often require extremely large amounts of memory, which are both costly and energy inefficient. Emerging non-volatile memory (NVM) technologies offer high capacity compared to DRAM and low energy compared to SSDs. Hence, NVMs have the potential to fundamentally change the dichotomy between DRAM and durable storage in Big Data processing. However, most Big Data applications are written in managed languages and executed on top of a managed runtime that already performs various dimensions of memory management. Supporting hybrid physical memories adds a new dimension, creating unique challenges in data replacement. This article proposes Panthera, a semantics-aware, fully automated memory management technique for Big Data processing over hybrid memories. Panthera analyzes user programs on a Big Data system to infer their coarse-grained access patterns, which are then passed to the Panthera runtime for efficient data placement and migration. For Big Data applications, the coarse-grained data division information is accurate enough to guide the GC for data layout, and it incurs little overhead for data monitoring and migration. We implemented Panthera in OpenJDK and Apache Spark. Based on Big Data applications’ memory access patterns, we also implemented a new profiling-guided optimization strategy, which is transparent to applications. With this optimization, our extensive evaluation demonstrates that Panthera reduces energy by 32–53% at less than 1% time overhead on average. To show Panthera’s applicability, we extend it to QuickCached, a pure Java implementation of Memcached. Our evaluation results show that Panthera reduces energy by 28.7% at 5.2% time overhead on average.
{"title":"Unified Holistic Memory Management Supporting Multiple Big Data Processing Frameworks over Hybrid Memories","authors":"Lei Chen, Jiacheng Zhao, Chenxi Wang, Ting Cao, J. Zigman, Haris Volos, O. Mutlu, Fang Lv, Xiaobing Feng, G. Xu, Huimin Cui","doi":"10.1145/3511211","DOIUrl":"https://doi.org/10.1145/3511211","url":null,"abstract":"To process real-world datasets, modern data-parallel systems often require extremely large amounts of memory, which are both costly and energy inefficient. Emerging non-volatile memory (NVM) technologies offer high capacity compared to DRAM and low energy compared to SSDs. Hence, NVMs have the potential to fundamentally change the dichotomy between DRAM and durable storage in Big Data processing. However, most Big Data applications are written in managed languages and executed on top of a managed runtime that already performs various dimensions of memory management. Supporting hybrid physical memories adds a new dimension, creating unique challenges in data replacement. This article proposes Panthera, a semantics-aware, fully automated memory management technique for Big Data processing over hybrid memories. Panthera analyzes user programs on a Big Data system to infer their coarse-grained access patterns, which are then passed to the Panthera runtime for efficient data placement and migration. For Big Data applications, the coarse-grained data division information is accurate enough to guide the GC for data layout, which hardly incurs overhead in data monitoring and moving. We implemented Panthera in OpenJDK and Apache Spark. Based on Big Data applications’ memory access pattern, we also implemented a new profiling-guided optimization strategy, which is transparent to applications. With this optimization, our extensive evaluation demonstrates that Panthera reduces energy by 32–53% at less than 1% time overhead on average. To show Panthera’s applicability, we extend it to QuickCached, a pure Java implementation of Memcached. Our evaluation results show that Panthera reduces energy by 28.7% at 5.2% time overhead on average.","PeriodicalId":318554,"journal":{"name":"ACM Transactions on Computer Systems (TOCS)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133318447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Georgios P. Katsikas, Tom Barbette, Dejan Kostic, Gerald Q. Maguire, Rebecca Steinert
Deployment of 100 Gigabit Ethernet (GbE) links challenges the packet processing limits of commodity hardware used for Network Functions Virtualization (NFV). Moreover, realizing chained network functions (i.e., service chains) necessitates the use of multiple CPU cores, or even multiple servers, to process packets from such high-speed links. Our system Metron jointly exploits the underlying network and commodity servers’ resources: (i) to offload part of the packet processing logic to the network, (ii) by using smart tagging to set up and exploit the affinity of traffic classes, and (iii) by using tag-based hardware dispatching to carry out the remaining packet processing at the speed of the servers’ cores, with zero inter-core communication. Moreover, Metron transparently integrates, manages, and load balances proprietary “blackboxes” together with Metron service chains. Metron realizes stateful network functions at the speed of 100GbE network cards on a single server, while elastically and rapidly adapting to changing workload volumes. Our experiments demonstrate that Metron service chains can coexist with heterogeneous blackboxes, while still leveraging Metron’s accurate dispatching and load balancing. In summary, Metron has (i) 2.75–8× better efficiency, (ii) up to 4.7× lower latency, and (iii) 7.8× higher throughput than OpenBox, a state-of-the-art NFV system.
{"title":"Metron","authors":"Georgios P. Katsikas, Tom Barbette, Dejan Kostic, Gerald Q. Maguire, Rebecca Steinert","doi":"10.1145/3465628","DOIUrl":"https://doi.org/10.1145/3465628","url":null,"abstract":"Deployment of 100Gigabit Ethernet (GbE) links challenges the packet processing limits of commodity hardware used for Network Functions Virtualization (NFV). Moreover, realizing chained network functions (i.e., service chains) necessitates the use of multiple CPU cores, or even multiple servers, to process packets from such high speed links. Our system Metron jointly exploits the underlying network and commodity servers’ resources: (i) to offload part of the packet processing logic to the network, (ii) by using smart tagging to setup and exploit the affinity of traffic classes, and (iii) by using tag-based hardware dispatching to carry out the remaining packet processing at the speed of the servers’ cores, with zero inter-core communication. Moreover, Metron transparently integrates, manages, and load balances proprietary “blackboxes” together with Metron service chains. Metron realizes stateful network functions at the speed of 100GbE network cards on a single server, while elastically and rapidly adapting to changing workload volumes. Our experiments demonstrate that Metron service chains can coexist with heterogeneous blackboxes, while still leveraging Metron’s accurate dispatching and load balancing. In summary, Metron has (i) 2.75–8× better efficiency, up to (ii) 4.7× lower latency, and (iii) 7.8× higher throughput than OpenBox, a state-of-the-art NFV system.","PeriodicalId":318554,"journal":{"name":"ACM Transactions on Computer Systems (TOCS)","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116053405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhiqiang Zuo, Kai Wang, Aftab Hussain, A. A. Sani, Yiyu Zhang, S. Lu, Wensheng Dou, Linzhang Wang, Xuandong Li, Chenxi Wang, G. Xu
There is more than a decade-long history of using static analysis to find bugs in systems such as Linux. Most of the existing static analyses developed for these systems are simple checkers that find bugs based on pattern matching. Despite the presence of many sophisticated interprocedural analyses, few of them have been employed to improve checkers for systems code due to their complex implementations and poor scalability. In this article, we revisit the scalability problem of interprocedural static analysis from a “Big Data” perspective. That is, we turn sophisticated code analysis into Big Data analytics and leverage novel data processing techniques to solve this traditional programming language problem. We propose Graspan, a disk-based parallel graph system that uses an edge-pair centric computation model to compute dynamic transitive closures on very large program graphs. We develop two backends for Graspan, namely, Graspan-C running on CPUs and Graspan-G on GPUs, and present their designs in the article. Graspan-C can analyze large-scale systems code on any commodity PC, while, if GPUs are available, Graspan-G can be readily used to achieve orders of magnitude speedup by harnessing a GPU’s massive parallelism. We have implemented fully context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases written in multiple languages such as Linux and Apache Hadoop demonstrates that their Graspan implementations are language-independent, scale to millions of lines of code, and are much simpler than their original implementations. Moreover, we show that these analyses can be used to uncover many real-world bugs in large-scale systems code.
{"title":"Systemizing Interprocedural Static Analysis of Large-scale Systems Code with Graspan","authors":"Zhiqiang Zuo, Kai Wang, Aftab Hussain, A. A. Sani, Yiyu Zhang, S. Lu, Wensheng Dou, Linzhang Wang, Xuandong Li, Chenxi Wang, G. Xu","doi":"10.1145/3466820","DOIUrl":"https://doi.org/10.1145/3466820","url":null,"abstract":"There is more than a decade-long history of using static analysis to find bugs in systems such as Linux. Most of the existing static analyses developed for these systems are simple checkers that find bugs based on pattern matching. Despite the presence of many sophisticated interprocedural analyses, few of them have been employed to improve checkers for systems code due to their complex implementations and poor scalability. In this article, we revisit the scalability problem of interprocedural static analysis from a “Big Data” perspective. That is, we turn sophisticated code analysis into Big Data analytics and leverage novel data processing techniques to solve this traditional programming language problem. We propose Graspan, a disk-based parallel graph system that uses an edge-pair centric computation model to compute dynamic transitive closures on very large program graphs. We develop two backends for Graspan, namely, Graspan-C running on CPUs and Graspan-G on GPUs, and present their designs in the article. Graspan-C can analyze large-scale systems code on any commodity PC, while, if GPUs are available, Graspan-G can be readily used to achieve orders of magnitude speedup by harnessing a GPU’s massive parallelism. We have implemented fully context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases written in multiple languages such as Linux and Apache Hadoop demonstrates that their Graspan implementations are language-independent, scale to millions of lines of code, and are much simpler than their original implementations. Moreover, we show that these analyses can be used to uncover many real-world bugs in large-scale systems code.","PeriodicalId":318554,"journal":{"name":"ACM Transactions on Computer Systems (TOCS)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126762663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marcelo Ruaro, A. Sant'Ana, A. Jantsch, F. Moraes
Many-Core Systems-on-Chip increasingly require Dynamic Multi-objective Management (DMOM) of resources. DMOM uses different management components for objectives and resources to implement comprehensive and self-adaptive system resource management. DMOMs are challenging because they require a scalable and well-organized framework to make each component modular, allowing it to be instantiated or redesigned with a limited impact on other components. This work evaluates two state-of-the-art distributed management paradigms and, motivated by their drawbacks, proposes a new one called Management Application (MA), along with a DMOM framework based on MA. MA is a distributed application, specific for management, where each task implements a management role. This paradigm favors scalability and modularity because the management design assumes different and parallel modules, decoupled from the OS. An experiment with a task mapping case study shows that MA reduces the overhead of management resources (-61.5%), latency (-66%), and communication volume (-96%) compared to state-of-the-art per-application management. Compared to cluster-based management (CBM) implemented directly as part of the OS, MA is similar in resources and communication volume, increasing only the mapping latency (+16%). Results targeting a complete DMOM control loop addressing up to three different objectives show the scalability regarding system size and adaptation frequency compared to CBM, presenting an overall management latency reduction of 17.2% and an overall monitoring messages’ latency reduction of 90.2%.
{"title":"Modular and Distributed Management of Many-Core SoCs","authors":"Marcelo Ruaro, A. Sant'Ana, A. Jantsch, F. Moraes","doi":"10.1145/3458511","DOIUrl":"https://doi.org/10.1145/3458511","url":null,"abstract":"Many-Core Systems-on-Chip increasingly require Dynamic Multi-objective Management (DMOM) of resources. DMOM uses different management components for objectives and resources to implement comprehensive and self-adaptive system resource management. DMOMs are challenging because they require a scalable and well-organized framework to make each component modular, allowing it to be instantiated or redesigned with a limited impact on other components. This work evaluates two state-of-the-art distributed management paradigms and, motivated by their drawbacks, proposes a new one called Management Application (MA), along with a DMOM framework based on MA. MA is a distributed application, specific for management, where each task implements a management role. This paradigm favors scalability and modularity because the management design assumes different and parallel modules, decoupled from the OS. An experiment with a task mapping case study shows that MA reduces the overhead of management resources (-61.5%), latency (-66%), and communication volume (-96%) compared to state-of-the-art per-application management. Compared to cluster-based management (CBM) implemented directly as part of the OS, MA is similar in resources and communication volume, increasing only the mapping latency (+16%). Results targeting a complete DMOM control loop addressing up to three different objectives show the scalability regarding system size and adaptation frequency compared to CBM, presenting an overall management latency reduction of 17.2% and an overall monitoring messages’ latency reduction of 90.2%.","PeriodicalId":318554,"journal":{"name":"ACM Transactions on Computer Systems (TOCS)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121072796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
V. Agate, A. D. Paola, G. Re, M. Morana
Multi-agent distributed systems are characterized by autonomous entities that interact with each other to provide, and/or request, different kinds of services. In several contexts, especially when a reward is offered according to the quality of service, individual agents (or coordinated groups) may act in a selfish way. To prevent such behaviours, distributed Reputation Management Systems (RMSs) provide every agent with the capability of computing the reputation of the others according to direct past interactions, as well as indirect opinions reported by their neighbourhood. This last point introduces a weakness in gossiped information that makes RMSs vulnerable to malicious agents intent on disseminating false reputation values. Given the variety of application scenarios in which RMSs can be adopted, as well as the multitude of behaviours that agents can implement, designers need RMS evaluation tools that allow them to predict the robustness of the system to security attacks before its actual deployment. To this aim, we present a simulation software for the vulnerability evaluation of RMSs and illustrate three case studies in which this tool was effectively used to model and assess state-of-the-art RMSs.
{"title":"A Simulation Software for the Evaluation of Vulnerabilities in Reputation Management Systems","authors":"V. Agate, A. D. Paola, G. Re, M. Morana","doi":"10.1145/3458510","DOIUrl":"https://doi.org/10.1145/3458510","url":null,"abstract":"Multi-agent distributed systems are characterized by autonomous entities that interact with each other to provide, and/or request, different kinds of services. In several contexts, especially when a reward is offered according to the quality of service, individual agents (or coordinated groups) may act in a selfish way. To prevent such behaviours, distributed Reputation Management Systems (RMSs) provide every agent with the capability of computing the reputation of the others according to direct past interactions, as well as indirect opinions reported by their neighbourhood. This last point introduces a weakness on gossiped information that makes RMSs vulnerable to malicious agents’ intent on disseminating false reputation values. Given the variety of application scenarios in which RMSs can be adopted, as well as the multitude of behaviours that agents can implement, designers need RMS evaluation tools that allow them to predict the robustness of the system to security attacks, before its actual deployment. To this aim, we present a simulation software for the vulnerability evaluation of RMSs and illustrate three case studies in which this tool was effectively used to model and assess state-of-the-art RMSs.","PeriodicalId":318554,"journal":{"name":"ACM Transactions on Computer Systems (TOCS)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116457150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}