On the Bottleneck Structure of Congestion-Controlled Networks
Jordi Ros-Giralt, Atul Bohara, Sruthi Yellamraju, Harper Langston, R. Lethin, Yuang Jiang, L. Tassiulas, Josie Li, Yuanlong Tan, M. Veeraraghavan
DOI: 10.1145/3393691.3394204
In this paper, we introduce the Theory of Bottleneck Ordering, a mathematical framework that reveals the bottleneck structure of data networks. This theoretical framework provides insights into the inherent topological properties of a network in at least three areas: (1) it identifies the regions of influence of each bottleneck; (2) it reveals the order in which bottlenecks (and the flows traversing them) converge to their steady-state transmission rates under distributed congestion control algorithms; and (3) it provides key insights into the design of optimized traffic engineering policies. We demonstrate the efficacy of the proposed theory in TCP congestion-controlled networks for two broad classes of algorithms: congestion-based algorithms (TCP BBR) and loss-based additive-increase/multiplicative-decrease algorithms (TCP Cubic and Reno). Among other results, our network experiments show that: (1) qualitatively, both classes of congestion control algorithms behave as predicted by the bottleneck structure of the network; (2) flows compete for bandwidth only with other flows operating at the same bottleneck level; (3) BBR flows achieve higher performance and fairness than Cubic and Reno flows due to their ability to operate at the right bottleneck level; (4) the bottleneck structure of a network is continuously changing, and its levels can be folded due to variations in the flows' round-trip times; and (5) against conventional wisdom, low-hitter flows can have a large impact on the overall performance of a network.
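To make the notion of bottleneck levels concrete, here is a minimal water-filling sketch in Python. It is not the paper's algorithm; the link capacities, flow routes, and names below are illustrative assumptions. Each iteration finds the most constrained link, fixes the rates of the flows through it, and thereby produces one bottleneck level after another, mirroring the convergence order the theory describes.

```python
def bottleneck_levels(capacity, routes):
    """capacity: {link: Mbps}, routes: {flow: [links]} -> (rates, levels)."""
    remaining = dict(capacity)                         # unallocated capacity per link
    active = {f: set(r) for f, r in routes.items()}    # flows not yet bottlenecked
    rates, levels = {}, []
    while active:
        # Fair share each link could offer its remaining flows.
        share = {l: remaining[l] / sum(1 for r in active.values() if l in r)
                 for l in remaining
                 if any(l in r for r in active.values())}
        # The most constrained link saturates first: it is the next bottleneck.
        link = min(share, key=share.get)
        level = [f for f, r in active.items() if link in r]
        levels.append((link, level))
        for f in level:                                # flows at this level converge now
            rates[f] = share[link]
            for l in active[f]:
                remaining[l] -= share[link]
            del active[f]
        del remaining[link]
    return rates, levels

# Example: two links, three flows; f1 and f2 share the slower link.
rates, levels = bottleneck_levels(
    capacity={"l1": 10.0, "l2": 40.0},
    routes={"f1": ["l1"], "f2": ["l1", "l2"], "f3": ["l2"]})
print(levels)   # [('l1', ['f1', 'f2']), ('l2', ['f3'])]
print(rates)    # {'f1': 5.0, 'f2': 5.0, 'f3': 35.0}
```

In this toy example, f1 and f2 sit at the first bottleneck level (the slower link), while f3 is only constrained at the second level, so it converges after them.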
{"title":"On the Bottleneck Structure of Congestion-Controlled Networks","authors":"Jordi Ros-Giralt, Atul Bohara, Sruthi Yellamraju, Harper Langston, R. Lethin, Yuang Jiang, L. Tassiulas, Josie Li, Yuanlong Tan, M. Veeraraghavan","doi":"10.1145/3393691.3394204","DOIUrl":"https://doi.org/10.1145/3393691.3394204","url":null,"abstract":"In this paper, we introduce the Theory of Bottleneck Ordering, a mathematical framework that reveals the bottleneck structure of data networks. This theoretical framework provides insights into the inherent topological properties of a network in at least three areas: (1) It identifies the regions of influence of each bottleneck; (2) it reveals the order in which bottlenecks (and flows traversing them) converge to their steady state transmission rates in distributed congestion control algorithms; and (3) it provides key insights into the design of optimized traffic engineering policies. We demonstrate the efficacy of the proposed theory in TCP congestion-controlled networks for two broad classes of algorithms: Congestion-based algorithms (TCP BBR) and loss-based additive-increase/multiplicative-decrease algorithms (TCP Cubic and Reno). Among other results, our network experiments show that: (1) Qualitatively, both classes of congestion control algorithms behave as predicted by the bottleneck structure of the network; (2) flows compete for bandwidth only with other flows operating at the same bottleneck level; (3) BBR flows achieve higher performance and fairness than Cubic and Reno flows due to their ability to operate at the right bottleneck level; (4) the bottleneck structure of a network is continuously changing and its levels can be folded due to variations in the flows' round trip times; and (5) against conventional wisdom, low-hitter flows can have a large impact to the overall performance of a network.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124851450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Understanding (Mis)Behavior on the EOSIO Blockchain
Yuheng Huang, Haoyu Wang, Lei Wu, Gareth Tyson, Xiapu Luo, Run Zhang, Xuanzhe Liu, Gang Huang, Xuxian Jiang
DOI: 10.1145/3393691.3394223
EOSIO has become one of the most popular blockchain platforms since its mainnet launch in June 2018. In contrast to traditional PoW-based systems (e.g., Bitcoin and Ethereum), which are limited by low throughput, EOSIO is the first high-throughput Delegated Proof of Stake system to be widely adopted by decentralized applications. Although EOSIO has millions of accounts and billions of transactions, little is known about its ecosystem, especially with regard to security and fraud. In this paper, we perform a large-scale measurement study of the EOSIO blockchain and its associated DApps. We gather a large-scale dataset of EOSIO and characterize activities including money transfers, account creation, and contract invocation. Using our insights, we then develop techniques to automatically detect bots and fraudulent activity. We discover thousands of bot accounts (over 30% of the accounts on the platform) and a number of real-world attacks (301 attack accounts). By the time of our study, 80 of the attack accounts we identified had been confirmed by DApp teams, with total losses of 828,824 EOS tokens (roughly $2.6 million).
{"title":"Understanding (Mis)Behavior on the EOSIO Blockchain","authors":"Yuheng Huang, Haoyu Wang, Lei Wu, Gareth Tyson, Xiapu Luo, Run Zhang, Xuanzhe Liu, Gang Huang, Xuxian Jiang","doi":"10.1145/3393691.3394223","DOIUrl":"https://doi.org/10.1145/3393691.3394223","url":null,"abstract":"EOSIO has become one of the most popular blockchain platforms since its mainnet launch in June 2018. In contrast to the traditional PoW-based systems (e.g., Bitcoin and Ethereum), which are limited by low throughput, EOSIO is the first high throughput Delegated Proof of Stake system that has been widely adopted by many decentralized applications. Although EOSIO has millions of accounts and billions of transactions, little is known about its ecosystem, especially related to security and fraud. In this paper, we perform a large-scale measurement study of the EOSIO blockchain and its associated DApps. We gather a large-scale dataset of EOSIO and characterize activities including money transfers, account creation and contract invocation. Using our insights, we then develop techniques to automatically detect bots and fraudulent activity. We discover thousands of bot accounts (over 30% of the accounts in the platform) and a number of real-world attacks (301 attack accounts). By the time of our study, 80 attack accounts we identified have been confirmed by DApp teams, causing 828,824 EOS tokens losses (roughly $2.6 million) in total.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"21 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116643562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Set the Configuration for the Heart of the OS: On the Practicality of Operating System Kernel Debloating
H. Kuo, Jianyan Chen, Sibin Mohan, Tianyin Xu
DOI: 10.1145/3393691.3394215
This paper presents a study on the practicality of operating system (OS) kernel debloating---reducing kernel code that is not needed by the target applications---in real-world systems. Despite significant benefits regarding security (attack surface reduction) and performance (fast boot times and reduced memory footprints), state-of-the-art OS kernel debloating techniques are seldom adopted in practice, especially in production systems. We identify the limitations of existing kernel debloating techniques that hinder their practical adoption, including both accidental and essential limitations. To understand these limitations, we build an advanced debloating framework that enables us to conduct a number of experiments on different types of OS kernels (including Linux and the L4 microkernel) with a wide variety of applications (including HTTPD, Memcached, MySQL, NGINX, PHP, and Redis). Our experimental results reveal the challenges and opportunities in making kernel debloating techniques practical for real-world systems. The main goal of this paper is to share these insights and our experiences to shed light on addressing the limitations of kernel debloating in future research and development efforts.
{"title":"Set the Configuration for the Heart of the OS: On the Practicality of Operating System Kernel Debloating","authors":"H. Kuo, Jianyan Chen, Sibin Mohan, Tianyin Xu","doi":"10.1145/3393691.3394215","DOIUrl":"https://doi.org/10.1145/3393691.3394215","url":null,"abstract":"This paper presents a study on the practicality of operating system (OS) kernel debloating---reducing kernel code that is not needed by the target applications---in real-world systems. Despite their significant benefits regarding security (attack surface reduction) and performance (fast boot times and reduced memory footprints), the state-of-the-art OS kernel debloating techniques are seldom adopted in practice, especially in production systems. We identify the limitations of existing kernel debloating techniques that hinder their practical adoption, including both accidental and essential limitations. To understand these limitations, we build an advanced debloating framework named tool which enables us to conduct a number of experiments on different types of OS kernels (including Linux and the L4 microkernel) with a wide variety of applications (including HTTPD, Memcached, MySQL, NGINX, PHP and Redis). Our experimental results reveal the challenges and opportunities towards making kernel debloating techniques practical for real-world systems. The main goal of this paper is to share these insights and our experiences to shed light on addressing the limitations of kernel debloating in future research and development efforts.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129711077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Centaur: A Novel Architecture for Reliable, Low-Wear, High-Density 3D NAND Storage
Chun-Yi Liu, Jagadish B. Kotra, Myoungsoo Jung, M. Kandemir
DOI: 10.1145/3393691.3394177
Due to the high-density storage demands of applications from different domains, 3D NAND flash is becoming a promising candidate to replace 2D NAND flash as the dominant non-volatile memory. However, denser 3D NAND presents various performance and reliability issues, which can be addressed by the 3D NAND-specific full-sequence program (FSP) operation. FSP programs multiple pages simultaneously to mitigate the performance degradation caused by the long latency of the 3D NAND baseline program operation. However, FSP-enabled 3D NAND-based SSDs suffer lifetime degradation due to the larger write granularities imposed by FSP. To address this lifetime issue, in this paper we propose and experimentally evaluate Centaur, a heterogeneous 2D/3D NAND SSD, as a solution. Centaur has three main components: a lifetime-aware inter-NAND request dispatcher, a lifetime-aware inter-NAND work stealer, and a data migration strategy from 2D NAND to 3D NAND. We used twelve SSD workloads to compare Centaur against a state-of-the-art 3D NAND-based SSD with the same capacity. Our experimental results indicate that SSD lifetime and performance improve by 3.7x and 1.11x, respectively, when using our 2D/3D heterogeneous SSD.
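The abstract does not spell out the dispatcher's policy, so the following is only a rough sketch of one plausible lifetime-aware dispatch rule, not Centaur's actual design. The write-size threshold, endurance budgets, and field names are illustrative assumptions: small writes go to 2D NAND (cheap page-granularity programs), large writes go to FSP-friendly 3D NAND, unless one medium is much closer to its wear budget than the other.

```python
from collections import deque

SMALL_WRITE_PAGES = 4                    # hypothetical threshold for "small" writes
ENDURANCE = {"2d": 3000, "3d": 1000}     # illustrative program/erase budgets

class Dispatcher:
    """Toy lifetime-aware inter-NAND dispatcher (illustrative, not Centaur's policy)."""
    def __init__(self):
        self.queues = {"2d": deque(), "3d": deque()}
        self.wear = {"2d": 0, "3d": 0}

    def dispatch(self, request):
        # Small writes fit 2D NAND's page-granularity programs; large writes
        # amortize 3D NAND's full-sequence program (FSP).
        target = "2d" if request["pages"] <= SMALL_WRITE_PAGES else "3d"
        other = "3d" if target == "2d" else "2d"
        # Lifetime-aware override: steer work away from the more worn medium.
        if self.wear[target] / ENDURANCE[target] > self.wear[other] / ENDURANCE[other] + 0.1:
            target = other
        self.queues[target].append(request)
        self.wear[target] += request["pages"]
        return target

d = Dispatcher()
print(d.dispatch({"lba": 0, "pages": 2}))     # -> 2d
print(d.dispatch({"lba": 64, "pages": 128}))  # -> 3d
```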
{"title":"Centaur: A Novel Architecture for Reliable, Low-Wear, High-Density 3D NAND Storage","authors":"Chun-Yi Liu, Jagadish B. Kotra, Myoungsoo Jung, M. Kandemir","doi":"10.1145/3393691.3394177","DOIUrl":"https://doi.org/10.1145/3393691.3394177","url":null,"abstract":"Due to the high density storage demand coming from applications from different domains, 3D NAND flash is becoming a promising candidate to replace 2D NAND flash as the dominant non-volatile memory. However, denser 3D NAND presents various performance and reliability issues, which can be addressed by the 3D NAND specific full-sequence program (FSP) operation. The FSP programs multiple pages simultaneously to mitigate the performance degradation caused by the long latency 3D NAND baseline program operations. However, the FSP-enabled 3D NAND-based SSDs introduce lifetime degradation due to the larger write granularities accessed by the FSP. To address the lifetime issue, in this paper, we propose and experimentally evaluate Centaur, a heterogeneous 2D/3D NAND heterogeneous SSD, as a solution. Centaur has three main components: a lifetime-aware inter-NAND request dispatcher, a lifetime-aware inter-NAND work stealer, and a data migration strategy from 2D NAND to 3D NAND. We used twelve SSD workloads to compare Centaur against a state-of-the-art 3D NAND-based SSD with the same capacity. Our experimental results indicate that the SSD lifetime and performance are improved by 3.7x and 1.11x, respectively, when using our 2D/3D heterogeneous SSD.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133407915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fundamental Limits on the Regret of Online Network-Caching
Rajarshi Bhattacharjee, Subhankar Banerjee, Abhishek Sinha
DOI: 10.1145/3393691.3394189
Optimal caching of files in a content distribution network (CDN) is a problem of fundamental and growing commercial interest. Although many different caching algorithms are in use today, the fundamental performance limits of network caching algorithms from an online-learning point of view remain poorly understood to date. In this paper, we resolve this question in the following two settings: (1) a single user connected to a single cache, and (2) a set of users and a set of caches interconnected through a bipartite network. Recently, an online gradient-based coded caching policy was shown to enjoy sub-linear regret. However, due to the lack of known regret lower bounds, the question of the optimality of the proposed policy was left open. In this paper, we settle this question by deriving tight non-asymptotic regret lower bounds in the above settings. In addition, we propose a new Follow-the-Perturbed-Leader-based uncoded caching policy with near-optimal regret. Technically, the lower bounds are obtained by relating the online caching problem to the classic probabilistic paradigm of balls-into-bins. Our proofs make extensive use of a new result on the expected load in the most populated half of the bins, which might also be of independent interest. We evaluate the performance of the caching policies by experimenting with the popular MovieLens dataset and conclude the paper with design recommendations and a list of open problems.
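As a rough sketch of the uncoded Follow-the-Perturbed-Leader idea (the abstract does not give the paper's exact perturbation or its scaling, so the Gaussian noise and the scale parameter eta below are assumptions), the cache keeps the k files whose perturbed cumulative request counts are largest:

```python
import numpy as np

def ftpl_cache(requests, num_files, cache_size, eta=1.0, seed=0):
    """Follow-the-Perturbed-Leader uncoded caching (illustrative sketch).
    Before serving each request, cache the `cache_size` files with the largest
    perturbed cumulative counts; eta scales a one-time Gaussian perturbation."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(num_files)
    noise = eta * rng.standard_normal(num_files)   # sampled once, reused each round
    hits = 0
    for f in requests:
        cache = set(np.argsort(counts + noise)[-cache_size:])
        hits += f in cache
        counts[f] += 1
    return hits / len(requests)

# Toy trace: Zipf-like popularity over 50 files, cache of 10.
rng = np.random.default_rng(1)
trace = rng.zipf(1.3, size=5000) % 50
print(f"hit rate: {ftpl_cache(trace, 50, 10):.2f}")
```

The perturbation of cumulative counts keeps the cached set close to the empirically most popular files while hedging against adversarial request orders; the paper's regret analysis compares such a policy against the best static cache in hindsight.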
{"title":"Fundamental Limits on the Regret of Online Network-Caching","authors":"Rajarshi Bhattacharjee, Subhankar Banerjee, Abhishek Sinha","doi":"10.1145/3393691.3394189","DOIUrl":"https://doi.org/10.1145/3393691.3394189","url":null,"abstract":"Optimal caching of files in a content distribution network (CDN) is a problem of fundamental and growing commercial interest. Although many different caching algorithms are in use today, the fundamental performance limits of the network caching algorithms from an online learning point-of-view remain poorly understood to date. In this paper, we resolve this question in the following two settings: (1) a single user connected to a single cache, and (2) a set of users and a set of caches interconnected through a bipartite network. Recently, an online gradient-based coded caching policy was shown to enjoy sub-linear regret. However, due to the lack of known regret lower bounds, the question of the optimality of the proposed policy was left open. In this paper, we settle this question by deriving tight non-asymptotic regret lower bounds in the above settings. In addition to that, we propose a new Follow-the-Perturbed-Leader-based uncoded caching policy with near-optimal regret. Technically, the lower-bounds are obtained by relating the online caching problem to the classic probabilistic paradigm of balls-into-bins. Our proofs make extensive use of a new result on the expected load in the most populated half of the bins, which might also be of independent interest. We evaluate the performance of the caching policies by experimenting with the popular MovieLens dataset and conclude the paper with design recommendations and a list of open problems.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128546959","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DSM: A Case for Hardware-Assisted Merging of DRAM Rows with Same Content
Seyed Armin Vakil-Ghahani, M. Kandemir, Jagadish B. Kotra
DOI: 10.1145/3393691.3394182
The number of cores and the capacity of main memory in modern systems have been growing significantly. Specifically, memory scaling, although at a slower pace than computation scaling, has provided opportunities for very large DRAMs with terabyte (TB) capacities. Consequently, addressing the performance and energy consumption bottlenecks of DRAM is more important than ever. The DRAM refresh operation is one of the main contributors to memory overheads, especially for the large-capacity DRAMs used in modern servers and emerging large-scale data centers. This paper addresses the memory refresh problem by leveraging the fact that most cloud servers host virtualized systems that use similar kernels, libraries, etc. We propose and experimentally evaluate a novel approach that exploits this observation to address the DRAM refresh overhead in such systems. More specifically, in this work we present DSM, a lightweight hardware extension in the memory controller that detects pages with the same content in memory, refreshes only one of them, and redirects requests for the others to that page. Our detailed experimental analysis shows that the proposed DSM design can reduce 99th-percentile memory access latency by up to 2.01x, and it also reduces overall memory energy consumption by up to 8.5%.
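DSM itself is a memory-controller extension that the abstract does not detail; the following software-level sketch only illustrates the underlying idea of grouping rows by content, refreshing one representative per group, and redirecting accesses to it. The row data and the use of SHA-256 as the content hash are illustrative assumptions.

```python
import hashlib
from collections import defaultdict

def build_merge_map(rows):
    """rows: {row_id: bytes}. Group rows by content hash and pick one
    representative per group; only representatives need to be refreshed."""
    groups = defaultdict(list)
    for row_id, data in rows.items():
        groups[hashlib.sha256(data).hexdigest()].append(row_id)
    redirect = {}
    for members in groups.values():
        rep = members[0]
        for r in members:
            redirect[r] = rep          # accesses/refreshes of r are served by rep
    return redirect

rows = {0: b"\x00" * 8192, 1: b"\x00" * 8192, 2: b"kernel page"}
redirect = build_merge_map(rows)
to_refresh = set(redirect.values())
print(redirect)                              # {0: 0, 1: 0, 2: 2}
print(len(to_refresh), "of", len(rows), "rows refreshed")
```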
{"title":"DSM: A Case for Hardware-Assisted Merging of DRAM Rows with Same Content","authors":"Seyed Armin Vakil-Ghahani, M. Kandemir, Jagadish B. Kotra","doi":"10.1145/3393691.3394182","DOIUrl":"https://doi.org/10.1145/3393691.3394182","url":null,"abstract":"The number of cores and the capacities of main memory in modern systems have been growing significantly. Specifically, memory scaling, although at a slower pace than computation scaling, provided opportunities for very large DRAMs with Terabytes (TBs) capacity. Consequently, addressing the performance and energy consumption bottlenecks of DRAMs is more important than ever. DRAM memory refresh operation is one of the main contributing factors to the memory overheads, especially for large capacity DRAMs used in modern servers and emerging large-scale data centers. This paper addresses the memory refresh problem by leveraging the fact that most cloud servers host virtualized systems that use similar kernels, libraries, etc. We propose and experimentally evaluate a novel approach that exploits this observation to address the DRAM refresh overhead in such systems. More specifically, in this work, we present DSM, a light-weight hardware extension in memory controller to detect the pages with same content in memory and refresh only one of them and redirect the requests to the others to this page. Our detailed experimental analysis shows that the proposed DSM design can reduce 99th percentile memory access latency by up to 2.01x, and it also reduces the overall memory energy consumption by up to 8.5%.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127922745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Forecasting with Alternative Data
Michael Fleder, D. Shah
DOI: 10.1145/3393691.3394187
We consider the problem of forecasting fine-grained company financials, such as daily revenue, from two input types: noisy proxy signals a la alternative data (e.g., credit card transactions) and sparse ground-truth observations (e.g., quarterly earnings reports). We utilize a classical linear systems model to capture both the evolution of the hidden or latent state (e.g., daily revenue) and the proxy signal (e.g., credit card transactions). The linear system model is particularly well suited here as the ground-truth data is extremely sparse (4 quarterly reports per year). In classical system identification, where the central theme is to learn parameters for such linear systems, unbiased and consistent estimation of parameters is not feasible: the likelihood is non-convex, and, worse, the global optimum for maximum likelihood estimation is often non-unique. As the main contribution of this work, we provide a simple, consistent estimator of all parameters for the linear system model of interest; in addition, the estimation is unbiased for some of the parameters. In effect, the additional sparse observations of the aggregate hidden state (e.g., quarterly reports) enable system identification in our setup that is not feasible in general. For estimating and forecasting the hidden state (actual earnings) using the noisy observations (daily credit card transactions), we utilize the learned linear model along with a natural adaptation of classical Kalman filtering (or Belief Propagation). This leads to optimal inference with respect to mean-squared error. Analytically, we argue that even though the underlying linear system may be "unstable," "uncontrollable," or "undetectable" in the classical setting, our setup and inference algorithm allow for estimation of the hidden state with bounded error. Further, the estimation error of the algorithm monotonically decreases as the frequency of the sparse observations increases. This seemingly intuitive insight contradicts the word on the Street. Finally, we utilize our framework to estimate quarterly earnings of 34 public companies using credit card transaction data. Our data-driven method convincingly outperforms the Wall Street consensus (analyst) estimates even though our method uses only credit card data as input, while the Wall Street consensus is based on various data sources including experts' input.
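The paper's system-identification estimator is its main contribution and is not reproduced here. As a minimal sketch of the inference step only, a scalar Kalman filter can track latent daily revenue x_t from a noisy proxy y_t = c*x_t + d + noise, assuming the system parameters have already been identified; all numbers below are synthetic.

```python
import numpy as np

def kalman_filter(y, a, b, c, d, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x_{t+1} = a*x_t + b + w_t (Var w = q),
    observed through y_t = c*x_t + d + v_t (Var v = r). Returns filtered means."""
    x, p = x0, p0
    means = []
    for obs in y:
        # Predict one step ahead.
        x, p = a * x + b, a * a * p + q
        # Update with the daily proxy observation.
        k = p * c / (c * c * p + r)          # Kalman gain
        x = x + k * (obs - (c * x + d))
        p = (1 - k * c) * p
        means.append(x)
    return np.array(means)

# Synthetic example with assumed, already-identified parameters.
rng = np.random.default_rng(0)
true_x = 100 + np.cumsum(rng.normal(0, 1, 90))     # latent daily revenue
proxy = 0.3 * true_x + 5 + rng.normal(0, 2, 90)    # credit-card-style proxy signal
est = kalman_filter(proxy, a=1.0, b=0.0, c=0.3, d=5.0, q=1.0, r=4.0, x0=100.0)
print(f"quarterly estimate: {est.sum():.1f} vs actual {true_x.sum():.1f}")
```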
{"title":"Forecasting with Alternative Data","authors":"Michael Fleder, D. Shah","doi":"10.1145/3393691.3394187","DOIUrl":"https://doi.org/10.1145/3393691.3394187","url":null,"abstract":"We consider the problem of forecasting fine-grained company financials, such as daily revenue, from two input types: noisy proxy signals a la alternative data (e.g. credit card transactions) and sparse ground-truth observations (e.g. quarterly earnings reports). We utilize a classical linear systems model to capture both the evolution of the hidden or latent state (e.g. daily revenue), as well as the proxy signal (e.g. credit cards transactions). The linear system model is particularly well suited here as data is extremely sparse (4 quarterly reports per year). In classical system identification, where the central theme is to learn parameters for such linear systems, unbiased and consistent estimation of parameters is not feasible: the likelihood is non-convex; and worse, the global optimum for maximum likelihood estimation is often non-unique. As the main contribution of this work, we provide a simple, consistent estimator of all parameters for the linear system model of interest; in addition the estimation is unbiased for some of the parameters. In effect, the additional sparse observations of aggregate hidden state (e.g. quarterly reports) enable system identification in our setup that is not feasible in general. For estimating and forecasting hidden state (actual earnings) using the noisy observations (daily credit card transactions), we utilize the learned linear model along with a natural adaptation of classical Kalman filtering (or Belief Propagation). This leads to optimal inference with respect to mean-squared error. Analytically, we argue that even though the underlying linear system may be \"unstable,'' \"uncontrollable,'' or \"undetectable'' in the classical setting, our setup and inference algorithm allow for estimation of hidden state with bounded error. Further, the estimation error of the algorithm monotonically decreases as the frequency of the sparse observations increases. This, seemingly intuitive insight contradicts the word on the Street. Finally, we utilize our framework to estimate quarterly earnings of 34 public companies using credit card transaction data. Our data-driven method convincingly outperforms the Wall Street consensus (analyst) estimates even though our method uses only credit card data as input, while the Wall Street consensus is based on various data sources including experts' input.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132711472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces
Sean R. Sinclair, Siddhartha Banerjee, C. Yu
DOI: 10.1145/3393691.3394176
We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel Q-learning policy with adaptive data-driven discretization. The central idea is to maintain a finer partition of the state-action space in regions that are frequently visited in historical trajectories and have higher payoff estimates. We demonstrate how our adaptive partitions take advantage of the shape of the optimal Q-function and the joint space without sacrificing worst-case performance. In particular, we recover the regret guarantees of prior algorithms for continuous state-action spaces, which additionally require either an optimal discretization as input or access to a simulation oracle (or both). Moreover, experiments demonstrate how our algorithm automatically adapts to the underlying structure of the problem, resulting in much better performance compared both to heuristics and to Q-learning with uniform discretization.
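A minimal sketch of the adaptive-discretization idea (not the paper's algorithm or its exploration bonuses): cells of a one-dimensional state space keep their own optimistic Q-estimates and split once they have been visited more often than the inverse of their diameter, so frequently visited regions end up with a finer partition. The toy environment, splitting rule, and constants below are illustrative assumptions.

```python
import random

class Cell:
    """An interval of the state space with its own visit count and Q-estimates."""
    def __init__(self, lo, hi, q_init=1.0):
        self.lo, self.hi = lo, hi
        self.n = 0
        self.q = {0: q_init, 1: q_init}   # optimistic value per discrete action

class AdaptiveQ:
    """Toy Q-learning over states in [0, 1] with an adaptively refined partition."""
    def __init__(self):
        self.cells = [Cell(0.0, 1.0)]

    def cell_of(self, s):
        return next(c for c in self.cells if c.lo <= s <= c.hi)

    def act(self, s):
        q = self.cell_of(s).q
        return max(q, key=q.get)

    def update(self, s, a, reward, s_next, gamma=0.9):
        c = self.cell_of(s)
        c.n += 1
        target = reward + gamma * max(self.cell_of(s_next).q.values())
        c.q[a] += (target - c.q[a]) / c.n
        if c.n > 1.0 / (c.hi - c.lo):     # split frequently visited cells
            mid = (c.lo + c.hi) / 2
            for lo, hi in ((c.lo, mid), (mid, c.hi)):
                child = Cell(lo, hi)
                child.q = dict(c.q)       # children inherit the parent's estimates
                self.cells.append(child)
            self.cells.remove(c)

agent = AdaptiveQ()
s = 0.5
for _ in range(2000):
    a = agent.act(s) if random.random() > 0.1 else random.randint(0, 1)
    s_next = min(1.0, max(0.0, s + (0.05 if a == 1 else -0.05) + random.gauss(0, 0.02)))
    agent.update(s, a, reward=1.0 - abs(s_next - 0.8), s_next=s_next)
    s = s_next
print(f"{len(agent.cells)} cells; smallest diameter "
      f"{min(c.hi - c.lo for c in agent.cells):.4f}")
```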
{"title":"Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces","authors":"Sean R. Sinclair, Siddhartha Banerjee, C. Yu","doi":"10.1145/3393691.3394176","DOIUrl":"https://doi.org/10.1145/3393691.3394176","url":null,"abstract":"We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel Q-learning policy with adaptive data-driven discretization. The central idea is to maintain a finer partition of the state-action space in regions which are frequently visited in historical trajectories, and have higher payoff estimates. We demonstrate how our adaptive partitions take advantage of the shape of the optimal Q-function and the joint space, without sacrificing the worst-case performance. In particular, we recover the regret guarantees of prior algorithms for continuous state-action spaces, which additionally require either an optimal discretization as input, and/or access to a simulation oracle. Moreover, experiments demonstrate how our algorithm automatically adapts to the underlying structure of the problem, resulting in much better performance compared both to heuristics and Q-learning with uniform discretization.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133070268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Latency Imbalance Among Internet Load-Balanced Paths: A Cloud-Centric View
Yibo Pi, S. Jamin, P. Danzig, Feng Qian
DOI: 10.1145/3393691.3394196
Load balancers choose among load-balanced paths to distribute traffic as if it makes no difference which path is used. This work shows that the latency difference between load-balanced paths (called latency imbalance), previously deemed insignificant, is now prevalent from the perspective of the cloud and affects various latency-sensitive applications. In this work, we present the first large-scale measurement study of latency imbalance from a cloud-centric view. Using public clouds around the globe, we measure latency imbalance both between data centers (DCs) in the cloud and from the cloud to the public Internet. Our key findings include that 1) Amazon's and Alibaba's clouds together have a latency difference between load-balanced paths larger than 20 ms for paths to 21% of public IPv4 addresses; 2) Google's secret to having lower latency imbalance than other clouds is to use its own well-balanced private WANs to carry traffic close to the destinations; and 3) latency imbalance is also prevalent between DCs in the cloud, where 8 pairs of DCs are found to have load-balanced paths with a latency difference larger than 40 ms. We further evaluate the impact of latency imbalance on three applications (i.e., NTP, delay-based geolocation, and VoIP) and propose potential solutions to improve application performance. Our experiments show that all three applications can benefit from considering latency imbalance, where the accuracy of delay-based geolocation can be greatly improved by simply changing how ping measures the minimum path latency.
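The abstract's last point is that the per-path minimum latency matters, not the minimum across all probes. The sketch below is only an illustration of that idea, not the paper's methodology: it times TCP handshakes from different source ports (which ECMP load balancers may hash onto different paths), takes the minimum RTT per port, and reports the spread. The target host and port range are placeholders, and running it requires live network access.

```python
import socket
import struct
import time

def connect_rtt(host, src_port, dst_port=443, timeout=2.0):
    """Time one TCP handshake from a fixed source port. Under ECMP, different
    source ports can hash onto different load-balanced paths."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Close with RST so the port does not linger in TIME_WAIT between probes.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    s.bind(("", src_port))
    start = time.perf_counter()
    try:
        s.connect((host, dst_port))
        return time.perf_counter() - start
    except OSError:
        return None
    finally:
        s.close()

def latency_imbalance(host, src_ports, probes=5):
    """Per-port minimum approximates per-path latency; imbalance is the spread."""
    per_path = []
    for port in src_ports:
        samples = [r for r in (connect_rtt(host, port) for _ in range(probes))
                   if r is not None]
        if samples:
            per_path.append(min(samples))   # min filters per-path queueing noise
    return max(per_path) - min(per_path) if per_path else None

# Example; the host and source-port range are placeholders.
print(latency_imbalance("example.com", src_ports=range(50000, 50005)))
```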
{"title":"Latency Imbalance Among Internet Load-Balanced Paths: A Cloud-Centric View","authors":"Yibo Pi, S. Jamin, P. Danzig, Feng Qian","doi":"10.1145/3393691.3394196","DOIUrl":"https://doi.org/10.1145/3393691.3394196","url":null,"abstract":"Load balancers choose among load-balanced paths to distribute traffic as if it makes no difference using one path or another. This work shows that the latency difference between load-balanced paths (called latency imbalance), previously deemed insignificant, is now prevalent from the perspective of the cloud and affects various latency-sensitive applications. In this work, we present the first large-scale measurement study of latency imbalance from a cloud-centric view. Using public cloud around the globe, we measure latency imbalance both between data centers (DCs) in the cloud and from the cloud to the public Internet. Our key findings include that 1) Amazon's and Alibaba's clouds together have latency difference between load-balanced paths larger than 20ms to 21% of public IPv4 addresses; 2) Google's secret in having lower latency imbalance than other clouds is to use its own well-balanced private WANs to transit traffic close to the destinations and that 3) latency imbalance is also prevalent between DCs in the cloud, where 8 pairs of DCs are found to have load-balanced paths with latency difference larger than 40ms. We further evaluate the impact of latency imbalance on three applications (i.e., NTP, delay-based geolocation and VoIP) and propose potential solutions to improve application performance. Our experiments show that all three applications can benefit from considering latency imbalance, where the accuracy of delay-based geolocation can be greatly improved by simply changing how ping measures the minimum path latency.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114686175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","authors":"","doi":"10.1145/3393691","DOIUrl":"https://doi.org/10.1145/3393691","url":null,"abstract":"","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123534108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}