Proceedings of the ACM on Measurement and Analysis of Computing Systems: Latest Publications

Free2Shard
Ranvir Rana, Sreeram Kannan, David N. C. Tse, P. Viswanath
In this paper, we study a canonical distributed resource allocation problem arising in blockchains. While distributed resource allocation is a well-studied problem in networking, the blockchain setting additionally requires the solution to be resilient to adversarial behavior from a fraction of nodes. Scaling blockchain performance is a basic research topic; a plethora of solutions (under the umbrella of sharding) have been proposed in recent years. Although the various sharding solutions share a common thread (they cryptographically stitch together multiple parallel chains), architectural differences lead to differing resource allocation problems. In this paper we make three main contributions: (a) we categorize the different sharding proposals under a common architectural framework, allowing for the emergence of a new, uniformly improved, uni-consensus sharding architecture. (b) We formulate and exactly solve a core resource allocation problem in the uni-consensus sharding architecture -- our solution, Free2Shard, is adversary-resistant and achieves optimal throughput. The key technical contribution is a mathematical connection to Blackwell's classical work on approachability in dynamic game theory. (c) We implement the sharding architecture atop a full-stack blockchain in 3,000 lines of Rust code -- we achieve a throughput of more than 250,000 transactions per second with 6 shards, a vast improvement over the state of the art.
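For context on the classical result the abstract refers to (our summary of Blackwell's 1956 theorem, not text from the paper): in a repeated game with vector-valued payoffs, approachability of a convex target set has a clean one-shot characterization.

```latex
% Blackwell approachability (closed convex case), stated informally:
% u(p, q) \in \mathbb{R}^k is the expected vector payoff under mixed
% actions p (player) and q (adversary).
S \subseteq \mathbb{R}^k \text{ closed and convex:}\qquad
S \text{ is approachable} \;\Longleftrightarrow\; \forall q \;\exists p :\; u(p,q) \in S
% i.e., against every adversary action the player can force the one-shot
% expected payoff into S; a suitable adaptive strategy then drives the
% long-run average payoff to S regardless of the adversary's play.
```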
DOI: 10.1145/3508031 (published 2022-02-24)
Citations: 5
Data-Driven Network Path Simulation with iBox
S. Ashok, Shubham Tiwari, Nagarajan Natarajan, V. Padmanabhan, Sundararajan Sellamanickam
While network simulation is widely used for evaluating network protocols and applications, ensuring realism remains a key challenge. There has been much work on simulating network mechanisms faithfully (e.g., links, buffers, etc.), but less attention to the critical task of configuring the simulator to reflect reality. We present iBox ("Internet in a Box"), which enables data-driven network path simulation, using input/output packet traces gathered at the sender/receiver in the target network to create a model of the end-to-end behaviour of a network path. Our work builds on recent work in this direction and makes three contributions: (1) estimation of a lightweight non-reactive cross-traffic model, (2) estimation of a more powerful reactive cross-traffic model based on Bayesian optimization, and (3) evaluation of iBox in the context of congestion control variants in an Internet research testbed and also controlled experiments with known ground truth.
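To make the notion of a "non-reactive" cross-traffic model concrete, here is a toy single-bottleneck sketch we wrote for illustration (this is not the iBox implementation; iBox estimates such models from real packet traces, and all names and formulas below are our assumptions):

```python
def simulate_bottleneck(arrivals, pkt_bytes, capacity_bps, cross_bps):
    """Toy FIFO bottleneck model with non-reactive cross traffic.

    arrivals: sorted packet arrival times (seconds) of the flow under study.
    cross_bps: constant cross-traffic rate; "non-reactive" means it does
    not respond to our flow, so it simply consumes a fixed share of the
    link. Returns per-packet queuing + transmission delays.
    """
    eff_bps = capacity_bps - cross_bps  # residual rate left for our flow
    assert eff_bps > 0, "bottleneck overloaded by cross traffic"
    delays, free_at = [], 0.0  # free_at: when the link can serve our next packet
    for t in arrivals:
        start = max(t, free_at)            # wait behind our earlier packets
        tx = pkt_bytes * 8 / eff_bps       # transmission time at residual rate
        free_at = start + tx
        delays.append(free_at - t)
    return delays
```

A back-to-back burst then shows the expected queue build-up: with a 12 Mbps link, no cross traffic, and 1500-byte packets, three simultaneous arrivals see delays of 1 ms, 2 ms, and 3 ms; adding 6 Mbps of cross traffic doubles each delay.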
DOI: 10.1145/3508026 (published 2022-02-24)
Citations: 2
Argus
Hem Regmi, Sanjib Sur
We propose Argus, a system that enables millimeter-wave (mmWave) deployers to quickly complete site surveys without sacrificing the accuracy and effectiveness of thorough network deployment surveys. Argus first models the mmWave reflection profile of an environment, considering dominant reflectors, and then uses this model to find locations that maximize the usability of the reflectors. The key component in Argus is an efficient machine learning model that can map the visual data to the mmWave signal reflections of an environment and can accurately predict the mmWave signal profile at unobserved locations. It allows Argus to find the best picocell locations to provide maximum coverage and also lets users self-localize accurately anywhere in the environment. Furthermore, Argus allows mmWave picocells to predict a device's orientation accurately and enables object tagging and retrieval for VR/AR applications. We implement and test Argus in two different buildings comprising multiple different indoor environments. The generalization capability of Argus makes it easy to update the model for unseen environments, and thus Argus can be deployed to any indoor environment with little or no model fine-tuning.
DOI: 10.1145/3508022 (published 2022-02-24)
Citations: 1
Automatic Inference of BGP Location Communities
B. A. D. Silva, Paulo Mol, O. Fonseca, Ítalo F. S. Cunha, R. Ferreira, Ethan Katz-Bassett
The Border Gateway Protocol (BGP) orchestrates Internet communications between and inside Autonomous Systems. BGP's flexibility allows operators to express complex policies and deploy advanced traffic engineering systems. A key mechanism to provide this flexibility is tagging route announcements with BGP communities, which have arbitrary, operator-defined semantics, to pass information or requests from router to router. Typical uses of BGP communities include attaching metadata to route announcements, such as where a route was learned or whether it was received from a customer, and controlling route propagation, for example to steer traffic to preferred paths or blackhole DDoS traffic. However, there is no standard for specifying the semantics nor a centralized repository that catalogs the meaning of BGP communities. The lack of standards and central repositories complicates the use of communities by the operator and research communities. In this paper, we present a set of techniques to infer the semantics of BGP communities from public BGP data. Our techniques infer communities related to the entities or locations traversed by a route by correlating communities with AS paths. We also propose a set of heuristics to filter incorrect inferences introduced by misbehaving networks, sharing of BGP communities among sibling autonomous systems, and inconsistent BGP dumps. We apply our techniques to billions of routing records from public BGP collectors and make available a public database with more than 15 thousand location communities. Our comparison with manually-built databases shows our techniques provide high precision (up to 93%), better coverage (up to 81% recall), and dynamic updates, complementing operators' and researchers' abilities to reason about BGP community semantics.
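The core idea of correlating communities with AS paths can be illustrated with a toy co-occurrence counter (a simplification we wrote for illustration; the paper's actual inference techniques and filtering heuristics are considerably more involved):

```python
from collections import defaultdict

def infer_community_owner(routes):
    """Toy correlation of BGP communities with AS paths.

    routes: list of (as_path, communities) tuples, where as_path is a list
    of ASNs and communities is a set of community strings. We infer, for
    each community, a candidate "owner" AS: the AS that appears on the
    largest fraction of paths carrying that community.
    """
    seen = defaultdict(int)                            # community -> path count
    cooccur = defaultdict(lambda: defaultdict(int))    # community -> AS -> count
    for as_path, communities in routes:
        for c in communities:
            seen[c] += 1
            for asn in set(as_path):                   # count each AS once per path
                cooccur[c][asn] += 1
    owners = {}
    for c, per_as in cooccur.items():
        asn, hits = max(per_as.items(), key=lambda kv: kv[1])
        owners[c] = (asn, hits / seen[c])              # candidate owner, support
    return owners
```

An AS that co-occurs with a community on every path carrying it (support 1.0) is a plausible candidate for having attached it; real data additionally requires the paper's heuristics to filter misbehaving networks and sibling-shared communities.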
DOI: 10.1145/3508023 (published 2022-02-24)
Citations: 5
Memory Space Recycling
Jihyun Ryoo, M. Kandemir, Mustafa Karaköy
Many program codes from different application domains process very large amounts of data, making their cache memory behavior critical for high performance. Most existing work targeting cache memory hierarchies focuses on improving data access patterns, e.g., maximizing sequential accesses to program data structures via code and/or data layout restructuring strategies. Prior work has addressed this data locality optimization problem in the context of both single-core and multi-core systems. Another dimension of optimization, which can be as important and beneficial as improving the data access pattern, is to reduce the data volume (total number of addresses) accessed by the program code. Compared to data access pattern restructuring, this volume minimization problem has received much less attention. In this work, we focus on this volume minimization problem and address it in both single-core and multi-core execution scenarios. Specifically, we explore the idea of rewriting an application program code to reduce its "memory space footprint". The main idea behind this approach is to reuse/recycle, for a given data element, a memory location originally assigned to another data element, provided that the lifetimes of the two data elements do not overlap. A unique aspect of our approach is that it is "distance aware", i.e., in identifying the memory/cache locations to recycle, it takes into account the physical distance between the location of the core and the memory/cache location to be recycled. We present a detailed experimental evaluation of our proposed memory space recycling strategy, using five different metrics: memory space consumption, network footprint, data access distance, cache miss rate, and execution time. The experimental results show that our proposed approach brings average improvements of 33.2%, 48.6%, 46.5%, 31.8%, and 27.9%, respectively, in these metrics for single-threaded applications. With the multi-threaded versions of the same applications, the achieved improvements are 39.5%, 55.5%, 53.4%, 26.2%, and 22.2%, in the same order.
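The lifetime-based reuse idea can be sketched as a greedy linear scan, familiar from register allocation (our toy illustration, not the paper's implementation; in particular it omits the paper's distance-aware choice among free locations):

```python
def assign_locations(lifetimes):
    """Toy memory-location recycling based on non-overlapping lifetimes.

    lifetimes: dict mapping data element -> (first_use, last_use) program
    points. Two elements may share a location iff their lifetimes do not
    overlap. Returns (element -> location) and the number of locations used.
    """
    order = sorted(lifetimes, key=lambda e: lifetimes[e][0])  # by first use
    free, n_locs = [], 0         # recycled locations / fresh-location counter
    loc_of, busy = {}, []        # busy: list of (last_use, location) pairs
    for elem in order:
        start, end = lifetimes[elem]
        # Recycle locations whose occupant's lifetime ended before `start`.
        for last, loc in list(busy):
            if last < start:
                busy.remove((last, loc))
                free.append(loc)
        loc = free.pop() if free else n_locs   # prefer a recycled location
        if loc == n_locs:
            n_locs += 1                        # otherwise allocate a new one
        loc_of[elem] = loc
        busy.append((end, loc))
    return loc_of, n_locs
```

For example, with lifetimes a:(0,2), c:(1,4), b:(3,5), the location of a is recycled for b, so two locations suffice for three elements; shrinking the footprint this way is what reduces the address volume the cache hierarchy must handle.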
DOI: 10.1145/3508034 (published 2022-02-24)
Citations: 0
Curvature-based Analysis of Network Connectivity in Private Backbone Infrastructures
Loqman Salamatian, Scott Anderson, Joshua Matthews, P. Barford, W. Willinger, M. Crovella
The main premise of this work is that since large cloud providers can and do manipulate probe packets that traverse their privately owned and operated backbones, standard traceroute-based measurement techniques are no longer a reliable means for assessing network connectivity in large cloud provider infrastructures. In response to these developments, we present a new empirical approach for elucidating private connectivity in today's Internet. Our approach relies on using only "light-weight" (i.e., simple, easily-interpretable, and readily available) measurements, but requires applying a "heavy-weight" or advanced mathematical analysis. In particular, we describe a new method for assessing the characteristics of network path connectivity that is based on concepts from Riemannian geometry (i.e., Ricci curvature) and also relies on an array of carefully crafted visualizations (e.g., a novel manifold view of a network's delay space). We demonstrate our method by utilizing latency measurements from RIPE Atlas anchors and virtual machines running in data centers of three large cloud providers to (i) study different aspects of connectivity in their private backbones and (ii) show how our manifold-based view enables us to expose and visualize critical aspects of this connectivity over different geographic scales.
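As background on the curvature notion involved (our gloss using the standard definition from the literature, not text from the paper): the Ollivier-Ricci curvature of an edge compares the transport cost between the random-walk neighborhoods of its endpoints with the graph distance.

```latex
% Ollivier-Ricci curvature of an edge (x, y):
%   \mu_x = one-step random-walk probability measure at node x,
%   W_1   = Wasserstein-1 (earth mover's) distance,
%   d(x,y) = graph (or delay-space) distance.
\kappa(x,y) \;=\; 1 \;-\; \frac{W_1(\mu_x,\,\mu_y)}{d(x,y)}
% \kappa > 0: the endpoints' neighborhoods overlap (well-connected region);
% \kappa < 0: transport is expensive, indicating tree-like or
% bottlenecked structure along the edge.
```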
DOI: 10.1145/3508025 (published 2022-02-24)
Citations: 2
POMACS V6, N1, March 2022 Editorial
Niklas Carlsson, Edith Cohen, Philippe Robert
The Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS) focuses on the measurement and performance evaluation of computer systems and operates in close collaboration with the ACM Special Interest Group SIGMETRICS. All papers in this issue of POMACS will be presented during the ACM SIGMETRICS/Performance 2022 conference. The issue contains papers selected by the editorial board via a rigorous review process that follows a hybrid conference and journal model, with reviews conducted by the 97 members of our POMACS editorial board. Each paper was either conditionally accepted (and shepherded), allowed a "one-shot" revision (to be resubmitted to one of the subsequent two deadlines), or rejected (with resubmission allowed after a year). For this issue, which represents the fall deadline, we accepted 25 papers out of 106 submissions (including 4 papers that had been given a "one-shot" revision opportunity). All submitted papers received at least 3 reviews and we held an online TPC meeting. Based on the indicated primary track, roughly 30% of the submissions were in the Measurement & Applied Modeling track, 28% were in the Systems track, 26% were in the Theory track, and 15% were in the Learning track. Many people contributed to the success of this issue of POMACS. First, we would like to thank the authors, who submitted their best work to SIGMETRICS/POMACS. Second, we would like to thank the TPC members who provided constructive feedback in their reviews to authors and participated in the online discussions and TPC meetings. We also thank several external reviewers who provided their expert opinion on specific submissions that required additional input. We are also grateful to the SIGMETRICS Board Chair, Giuliano Casale, and to past TPC Chairs.
Finally, we are grateful to the Organization Committee and to the SIGMETRICS Board for their ongoing efforts and initiatives for creating an exciting program for ACM SIGMETRICS/Performance 2022.
DOI: 10.1145/3508021 (published 2022-02-24)
Citations: 0
SparseP
Christina Giannoula, Ivan Fernandez, Juan Gómez-Luna, N. Koziris, G. Goumas, O. Mutlu
Several manufacturers have already started to commercialize near-bank Processing-In-Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures place simple cores close to DRAM banks. Recent research demonstrates that they can yield significant performance and energy improvements in parallel applications by alleviating data access costs. Real PIM systems can provide high levels of parallelism, large aggregate memory bandwidth and low memory access latency, thereby being a good fit to accelerate the Sparse Matrix Vector Multiplication (SpMV) kernel. SpMV has been characterized as one of the most significant and thoroughly studied scientific computation kernels. It is primarily a memory-bound kernel with intensive memory accesses due to its algorithmic nature, the compressed matrix format used, and the sparsity patterns of the input matrices given. This paper provides the first comprehensive analysis of SpMV on a real-world PIM architecture, and presents SparseP, the first SpMV library for real PIM architectures. We make three key contributions. First, we implement a wide variety of software strategies on SpMV for a multithreaded PIM core, including (1) various compressed matrix formats, (2) load balancing schemes across parallel threads and (3) synchronization approaches, and characterize the computational limits of a single multithreaded PIM core. Second, we design various load balancing schemes across multiple PIM cores, and two types of data partitioning techniques to execute SpMV on thousands of PIM cores: (1) 1D-partitioned kernels to perform the complete SpMV computation using only PIM cores, and (2) 2D-partitioned kernels to strike a balance between computation and data transfer costs to PIM-enabled memory. Third, we compare SpMV execution on a real-world PIM system with 2528 PIM cores to an Intel Xeon CPU and an NVIDIA Tesla V100 GPU to study the performance and energy efficiency of various devices, i.e., both memory-centric PIM systems and conventional processor-centric CPU/GPU systems, for the SpMV kernel. The SparseP software package provides 25 SpMV kernels for real PIM systems supporting the four most widely used compressed matrix formats, i.e., CSR, COO, BCSR and BCOO, and a wide range of data types. SparseP is publicly and freely available at https://github.com/CMU-SAFARI/SparseP. Our extensive evaluation using 26 matrices with various sparsity patterns provides new insights and recommendations for software designers and hardware architects to efficiently accelerate the SpMV kernel on real PIM systems.
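To make the kernel under study concrete: the abstract's CSR format stores only nonzeros (`values`), their column indices (`col_idx`), and per-row offsets (`row_ptr`). A minimal NumPy sketch of the y = A·x computation follows; this is an illustration of the standard CSR SpMV loop, not code from the SparseP library itself.

```python
import numpy as np

def spmv_csr(values, col_idx, row_ptr, x):
    """Multiply a CSR-format sparse matrix by a dense vector x."""
    n_rows = len(row_ptr) - 1
    y = np.zeros(n_rows)
    for i in range(n_rows):
        # Nonzeros of row i live in values[row_ptr[i]:row_ptr[i+1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# The 2x3 matrix [[1, 0, 2], [0, 3, 0]] in CSR form:
values = np.array([1.0, 2.0, 3.0])
col_idx = np.array([0, 2, 1])
row_ptr = np.array([0, 2, 3])
x = np.array([1.0, 1.0, 1.0])
print(spmv_csr(values, col_idx, row_ptr, x))  # [3. 3.]
```

The irregular, input-dependent accesses to `x[col_idx[k]]` are what make the kernel memory-bound, which is why the paper targets it at memory-centric PIM hardware.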
{"title":"SparseP","authors":"Christina Giannoula, Ivan Fernandez, Juan Gómez-Luna, N. Koziris, G. Goumas, O. Mutlu","doi":"10.1145/3508041","DOIUrl":"https://doi.org/10.1145/3508041","url":null,"abstract":"Several manufacturers have already started to commercialize near-bank Processing-In-Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures place simple cores close to DRAM banks. Recent research demonstrates that they can yield significant performance and energy improvements in parallel applications by alleviating data access costs. Real PIM systems can provide high levels of parallelism, large aggregate memory bandwidth and low memory access latency, thereby being a good fit to accelerate the Sparse Matrix Vector Multiplication (SpMV) kernel. SpMV has been characterized as one of the most significant and thoroughly studied scientific computation kernels. It is primarily a memory-bound kernel with intensive memory accesses due its algorithmic nature, the compressed matrix format used, and the sparsity patterns of the input matrices given. This paper provides the first comprehensive analysis of SpMV on a real-world PIM architecture, and presents SparseP, the first SpMV library for real PIM architectures. We make three key contributions. First, we implement a wide variety of software strategies on SpMV for a multithreaded PIM core, including (1) various compressed matrix formats, (2) load balancing schemes across parallel threads and (3) synchronization approaches, and characterize the computational limits of a single multithreaded PIM core. Second, we design various load balancing schemes across multiple PIM cores, and two types of data partitioning techniques to execute SpMV on thousands of PIM cores: (1) 1D-partitioned kernels to perform the complete SpMV computation only using PIM cores, and (2) 2D-partitioned kernels to strive a balance between computation and data transfer costs to PIM-enabled memory. 
Third, we compare SpMV execution on a real-world PIM system with 2528 PIM cores to an Intel Xeon CPU and an NVIDIA Tesla V100 GPU to study the performance and energy efficiency of various devices, i.e., both memory-centric PIM systems and conventional processor-centric CPU/GPU systems, for the SpMV kernel. SparseP software package provides 25 SpMV kernels for real PIM systems supporting the four most widely used compressed matrix formats, i.e., CSR, COO, BCSR and BCOO, and a wide range of data types. SparseP is publicly and freely available at https://github.com/CMU-SAFARI/SparseP. Our extensive evaluation using 26 matrices with various sparsity patterns provides new insights and recommendations for software designers and hardware architects to efficiently accelerate the SpMV kernel on real PIM systems.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133881748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 4
End-to-end Characterization of Game Streaming Applications on Mobile Platforms
Sandeepa Bhuyan, Shulin Zhao, Ziyu Ying, M. Kandemir, C. Das
With the advent of 5G, supporting high-quality game streaming applications on edge devices has become a reality. This is evidenced by a recent surge in cloud gaming applications on mobile devices. In contrast to video streaming applications, interactive games require much more compute power for supporting improved rendering (such as 4K streaming) within the stipulated frames-per-second (FPS) constraints. This in turn consumes more battery power in a power-constrained mobile device. Thus, the state-of-the-art gaming applications suffer from lower video quality (QoS) and/or energy efficiency. While there has been a plethora of recent works on optimizing game streaming applications, to our knowledge, there is no study that systematically investigates the design pairs on the end-to-end game streaming pipeline across the cloud, network, and edge devices to understand the individual contributions of the different stages of the pipeline for improving the overall QoS and energy efficiency. In this context, this paper presents a comprehensive performance and power analysis of the entire game streaming pipeline consisting of the server/cloud side, network, and edge. Through extensive measurements with a high-end workstation mimicking the cloud end, an open-source platform (Moonlight-GameStreaming) emulating the edge device/mobile platform, and two network settings (WiFi and 5G), we conduct a detailed measurement-based study with seven representative games with different characteristics. We characterize the performance in terms of frame latency, QoS, bitrate, and energy consumption for different stages of the gaming pipeline. Our study shows that the rendering stage and the encoding stage at the cloud end are the bottlenecks to support 4K streaming. While 5G is certainly more suitable for supporting enhanced video quality with 4K streaming, it is more expensive in terms of power consumption compared to WiFi. Further, fluctuations in 5G network quality can lead to huge frame drops, thus affecting QoS, which needs to be addressed by a coordinated design between the edge device and the server. Finally, the network interface and the decoder units in a mobile platform need more energy-efficient design to support high quality games at a lower cost. These observations should help in designing more cost-effective future cloud gaming platforms.
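The FPS constraint the abstract mentions translates directly into a per-frame latency budget that the whole render→encode→network→decode pipeline must fit inside. A back-of-envelope sketch (stage names and latency numbers below are hypothetical, purely for illustration):

```python
def frame_budget_ms(fps):
    """Per-frame latency budget implied by an FPS target."""
    return 1000.0 / fps

def within_budget(stage_latencies_ms, fps):
    """True if the summed per-frame stage latencies fit the FPS budget."""
    return sum(stage_latencies_ms.values()) <= frame_budget_ms(fps)

# Hypothetical per-stage latencies (ms) for one frame of a streamed game:
stages = {"render": 6.0, "encode": 4.0, "network": 5.0, "decode": 3.0}
print(round(frame_budget_ms(60), 2))  # 16.67
print(within_budget(stages, 60))      # False: 18.0 ms exceeds the budget
```

This is why the paper's finding that rendering and encoding dominate at the cloud end matters: at 60 FPS every stage competes for a slice of roughly 16.7 ms.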
{"title":"End-to-end Characterization of Game Streaming Applications on Mobile Platforms","authors":"Sandeepa Bhuyan, Shulin Zhao, Ziyu Ying, M. Kandemir, C. Das","doi":"10.1145/3508030","DOIUrl":"https://doi.org/10.1145/3508030","url":null,"abstract":"With the advent of 5G, supporting high-quality game streaming applications on edge devices has become a reality. This is evidenced by a recent surge in cloud gaming applications on mobile devices. In contrast to video streaming applications, interactive games require much more compute power for supporting improved rendering (such as 4K streaming) with the stipulated frames-per second (FPS) constraints. This in turn consumes more battery power in a power-constrained mobile device. Thus, the state-of-the-art gaming applications suffer from lower video quality (QoS) and/or energy efficiency. While there has been a plethora of recent works on optimizing game streaming applications, to our knowledge, there is no study that systematically investigates the design pairs on the end-to-end game streaming pipeline across the cloud, network, and edge devices to understand the individual contributions of the different stages of the pipeline for improving the overall QoS and energy efficiency. In this context, this paper presents a comprehensive performance and power analysis of the entire game streaming pipeline consisting of the server/cloud side, network, and edge. Through extensive measurements with a high-end workstation mimicking the cloud end, an open-source platform (Moonlight-GameStreaming) emulating the edge device/mobile platform, and two network settings (WiFi and 5G) we conduct a detailed measurement-based study with seven representative games with different characteristics. We characterize the performance in terms of frame latency, QoS, bitrate, and energy consumption for different stages of the gaming pipeline. 
Our study shows that the rendering stage and the encoding stage at the cloud end are the bottlenecks to support 4K streaming. While 5G is certainly more suitable for supporting enhanced video quality with 4K streaming, it is more expensive in terms of power consumption compared to WiFi. Further, fluctuations in 5G network quality can lead to huge frame drops thus affecting QoS, which needs to be addressed by a coordinated design between the edge device and the server. Finally, the network interface and the decoder units in a mobile platform need more energy-efficient design to support high quality games at a lower cost. These observations should help in designing more cost-effective future cloud gaming platforms.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"178 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121264983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 2
Fusing Speed Index during Web Page Loading
Wei Liu, Xinlei Yang, Hao Lin, Zhenhua Li, Feng Qian
With conventional web page load metrics (e.g., Page Load Time) being blamed for deviating from actual user experiences, in recent years a more sensible and complex metric called Speed Index (SI) has been widely adopted to measure the user's quality of experience (QoE). In brief, SI indicates how quickly a page is filled up with above-the-fold visible elements (or crucial elements for short). To date, however, SI has been used as a metric for performance evaluation, rather than as an explicit heuristic to improve page loading. To demystify this, we examine the entire loading process of various pages and ascribe such incapability to three-fold fundamental uncertainties in terms of network, browser execution, and viewport size. In this paper, we design SipLoader, an SI-oriented page load scheduler through a novel cumulative reactive scheduling framework. It does not attempt to deal with uncertainties in advance or in one shot, but schedules page loading by "repairing" the anticipated (nearly) SI-optimal scheduling when uncertainties actually occur. This is achieved with a suite of efficient designs that fully exploit the cumulative nature of SI calculation. Evaluations show that SipLoader improves the median SI by 41%, and provides 1.43 times to 1.99 times more benefits than state-of-the-art solutions.
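The cumulative nature of SI that SipLoader exploits is easy to see from its definition: SI integrates the area above the visual-completeness curve over time, so painting above-the-fold content earlier lowers (improves) the score. A minimal sketch, assuming completeness is sampled as a step function (the sample points below are made up for illustration):

```python
def speed_index(samples):
    """Compute Speed Index from (time_ms, completeness) samples.

    samples: list of (time in ms, visual completeness in [0, 1]),
    sorted by time and ending at full completeness (1.0). Completeness
    is treated as constant between consecutive samples.
    """
    si = 0.0
    for (t0, c0), (t1, _) in zip(samples, samples[1:]):
        si += (1.0 - c0) * (t1 - t0)  # area above the completeness curve
    return si

# Page half painted at 400 ms, fully painted at 1000 ms:
print(speed_index([(0, 0.0), (400, 0.5), (1000, 1.0)]))  # 700.0
```

Because each interval contributes its term independently, a scheduler can re-evaluate and "repair" its plan mid-load whenever network or execution uncertainty shifts the curve, which is the cumulative reactive idea the paper builds on.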
{"title":"Fusing Speed Index during Web Page Loading","authors":"Wei Liu, Xinlei Yang, Hao Lin, Zhenhua Li, Feng Qian","doi":"10.1145/3511214","DOIUrl":"https://doi.org/10.1145/3511214","url":null,"abstract":"With conventional web page load metrics (e.g., Page Load Time) being blamed for deviating from actual user experiences, in recent years a more sensible and complex metric called Speed Index (SI) has been widely adopted to measure the user's quality of experience (QoE). In brief, SI indicates how quickly a page is filled up with above-the-fold visible elements (or crucial elements for short). To date, however, SI has been used as a metric for performance evaluation, rather than as an explicit heuristic to improve page loading. To demystify this, we examine the entire loading process of various pages and ascribe such incapability to three-fold fundamental uncertainties in terms of network, browser execution, and viewport size. In this paper, we design SipLoader, an SI-oriented page load scheduler through a novel cumulative reactive scheduling framework. It does not attempt to deal with uncertainties in advance or in one shot, but schedules page loading by \"repairing\" the anticipated (nearly) SI-optimal scheduling when uncertainties actually occur. This is achieved with a suite of efficient designs that fully exploit the cumulative nature of SI calculation. 
Evaluations show that SipLoader improves the median SI by 41%, and provides 1.43 times to 1.99 times more benefits than state-of-the-art solutions.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124840839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 2