
Latest publications in ACM SIGOPS Oper. Syst. Rev.

Performance Implications of Extended Page Tables on Virtualized x86 Processors
Pub Date : 2017-09-11 DOI: 10.1145/3139645.3139652
Timothy Merrifield, H. Taheri
Managing virtual memory is an expensive operation, and becomes even more expensive on virtualized servers. Processing TLB misses on a virtualized x86 server requires a two-dimensional page walk that can have 6x more page table lookups, hence 6x more memory references, than a native page table walk. Thus much of the recent research on the subject starts from the assumption that TLB miss processing in virtual environments is significantly more expensive than on native servers. However, we will show that with the latest software stack on modern x86 processors, most of these page table lookups are satisfied by internal paging structure caches and the L1/L2 data caches, and the actual virtualization overhead of TLB miss processing is a modest fraction of the overall time spent processing TLB misses. We show that even for the heaviest workloads, a well-tuned application that uses large pages on a recent OS release with a modern hypervisor running on the latest x86 processors sees only minimal degradation from the additional overhead of the two-dimensional page walks in a virtualized server.
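To make the 6x figure concrete, here is a back-of-the-envelope sketch (not from the paper) of the memory references in a radix page walk: with 4-level guest and host page tables, each of the 4 guest page-table reads is itself at a guest-physical address that must be translated by a full host walk, and the final guest-physical data address needs one more host walk.

```python
def native_walk_refs(levels: int = 4) -> int:
    """Memory references for a native page walk: one read per page-table level."""
    return levels

def nested_walk_refs(guest_levels: int = 4, host_levels: int = 4) -> int:
    """Memory references for a two-dimensional (nested) page walk.

    Each guest page-table entry read requires a host walk to translate its
    guest-physical address, plus the read itself; the final guest-physical
    data address needs one more host walk.
    """
    per_guest_level = host_levels + 1        # host walk + the guest PTE read itself
    return guest_levels * per_guest_level + host_levels

print(native_walk_refs())   # 4
print(nested_walk_refs())   # 24, i.e. 6x the native walk, consistent with the abstract
```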
Citations: 1
Ingens: Huge Page Support for the OS and Hypervisor
Pub Date : 2017-09-11 DOI: 10.1145/3139645.3139659
Youngjin Kwon, Hangchen Yu, Simon Peter, C. Rossbach, E. Witchel
Memory capacity and demand have grown hand in hand in recent years. However, overheads for memory virtualization, in particular for address translation, grow with memory capacity as well, motivating hardware manufacturers to provide TLBs with thousands of entries for larger pages, or huge pages. Current OSes and hypervisors support huge pages with a hodge-podge of best-effort algorithms and spot fixes that make less and less sense as architectural support for huge pages matures. The time has come for a more fundamental redesign. Ingens is a framework for providing transparent huge page support in a coordinated way. Ingens manages contiguity as a first-class resource, and tracks utilization and access frequency of memory pages, enabling it to eliminate pathologies that plague current systems. Experiments with a Linux/KVM-based prototype show improved fairness and performance, and reduced tail latency and memory bloat for important applications such as Web services and Redis. We report early experiences with our in-progress port of Ingens to the ESX Hypervisor.
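As an illustration only (not the actual Ingens policy), the kind of utilization and access-frequency tracking the abstract describes can be sketched as a promotion decision over the 4 KB pages inside a 2 MB region; the threshold names and values below are hypothetical.

```python
from dataclasses import dataclass, field

PAGES_PER_HUGE_REGION = 512  # 2 MB huge page / 4 KB base pages on x86-64

@dataclass
class Region:
    """Per-2MB-region bookkeeping: populated base pages and recent access count."""
    populated: set = field(default_factory=set)  # offsets of touched 4 KB pages
    accesses: int = 0                            # hits seen in recent access-bit scans

def should_promote(region: Region,
                   util_threshold: float = 0.90,   # hypothetical utilization cutoff
                   hot_threshold: int = 64) -> bool:  # hypothetical frequency cutoff
    """Promote to a huge page only if the region is both well utilized and hot,
    so contiguity is spent where it reduces TLB misses without causing bloat."""
    utilization = len(region.populated) / PAGES_PER_HUGE_REGION
    return utilization >= util_threshold and region.accesses >= hot_threshold

# A sparsely touched region is left alone; a dense, hot one is promoted.
sparse = Region(populated=set(range(40)), accesses=200)
dense = Region(populated=set(range(500)), accesses=200)
print(should_promote(sparse), should_promote(dense))  # False True
```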
Citations: 10
Exploring Machine Learning for Thread Characterization on Heterogeneous Multiprocessors
Pub Date : 2017-09-11 DOI: 10.1145/3139645.3139664
Cha V. Li, V. Petrucci, D. Mossé
We introduce a thread characterization method that explores hardware performance counters and machine learning techniques to automate estimating workload execution on heterogeneous processors. We show that our characterization scheme achieves higher accuracy when predicting performance indicators, such as instructions per cycle and last-level cache misses, commonly used to determine the mapping of threads to processor types at runtime. We also show that support vector regression achieves higher accuracy when compared to linear regression, and has very low (1%) overhead. The results presented in this paper can provide a foundation for advanced investigations and interesting new directions in intelligent thread scheduling and power management on multiprocessors.
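A minimal sketch of the kind of comparison described, using scikit-learn on synthetic performance-counter features; the feature set, data, and model settings are assumptions for illustration, not the authors' setup.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)

# Synthetic per-thread samples of hardware counters (assumed features):
# [cache misses / kilo-instr, branch misses / kilo-instr, memory-bandwidth utilization]
X = rng.uniform(0, 1, size=(500, 3))
# Synthetic target: instructions per cycle with a non-linear dependence on the counters.
ipc = 2.0 - 1.2 * X[:, 0] ** 2 - 0.5 * X[:, 1] + 0.3 * np.sin(4 * X[:, 2])
ipc += rng.normal(0, 0.05, size=ipc.shape)

X_tr, X_te, y_tr, y_te = train_test_split(X, ipc, test_size=0.3, random_state=0)

linear = LinearRegression().fit(X_tr, y_tr)              # baseline linear model
svr = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X_tr, y_tr)  # support vector regression

print("linear MAE:", mean_absolute_error(y_te, linear.predict(X_te)))
print("SVR MAE:   ", mean_absolute_error(y_te, svr.predict(X_te)))
```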
Citations: 5
Revisiting the Paxos Foundations: A Look at Summer Internship Work at VMware Research
Pub Date : 2017-09-11 DOI: 10.1145/3139645.3139656
H. Howard, D. Malkhi, A. Spiegelman
The summer of 2016 was buzzing with intern activity at the VMware Research Group (VRG), working with all the research team and with David Tennenhouse, Chief Research Officer of VMware. In this paper, we give a brief introduction to Flexible Paxos [4], one of the internship results. There were several other exciting outcomes; internships are a great way to participate in driving innovation at VMware! Flexible Paxos introduces a surprising observation concerning the foundations of distributed computing. The observation revisits the basic requisites of Paxos [7, 8], Lamport's widely adopted algorithmic foundation for fault tolerance and replication, and a pinnacle of his Turing award [1]. Since its publication, Paxos has been widely built upon in teaching, research and production systems. Paxos implements a fault-tolerant state machine among a group of nodes. At its core, Paxos uses two phases, each of which requires agreement from a subset of nodes (known as a quorum) to proceed. Throughout this manuscript, we will refer to the first phase as the leader election phase, and the second as the replication phase. The safety and liveness of Paxos are based on the guarantee that any two quorums will intersect. To satisfy this requirement, quorums are typically composed of any majority from a fixed set of nodes, although other quorum schemes have been proposed. In practice, we usually wish to reach agreement over a sequence of commands, not one. This is often referred to as the Multi-Paxos problem [3]. In Multi-Paxos, we use the leader election phase of Paxos to establish one node as a leader for all future commands, until it is replaced by another leader. We use the replication phase of Paxos to agree on a series of commands, one at a time. To commit a command, the leader must always communicate with at least a quorum of nodes and wait for them to accept the value. In the Flexible Paxos work, we observe that Paxos is conservative:
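The majority-quorum guarantee the abstract relies on can be checked mechanically; the sketch below is illustrative, not from the paper. It verifies that any two majorities of N nodes must share a node, and then sketches the Flexible Paxos relaxation described in [4], where only leader-election quorums and replication quorums need to intersect.

```python
from itertools import combinations

def majorities(n: int):
    """All minimal majority quorums over nodes 0..n-1."""
    size = n // 2 + 1
    return [set(q) for q in combinations(range(n), size)]

def all_pairs_intersect(quorums) -> bool:
    """True if every pair of quorums shares at least one node."""
    return all(a & b for a in quorums for b in quorums)

# Classic Paxos: every pair of majority quorums intersects.
print(all_pairs_intersect(majorities(5)))          # True

# Flexible Paxos observation (per [4]): only leader-election quorums (Q1) and
# replication quorums (Q2) need to intersect, e.g. |Q1| + |Q2| > N.
n = 5
q1 = [set(q) for q in combinations(range(n), 4)]   # larger, rare leader-election quorums
q2 = [set(q) for q in combinations(range(n), 2)]   # smaller, common replication quorums
print(all(a & b for a in q1 for b in q2))          # True, since 4 + 2 > 5
```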
Citations: 0
Building an Extensible Open vSwitch Datapath
Pub Date : 2017-09-11 DOI: 10.1145/3139645.3139657
Cheng-Chun Tu, Joe Stringer, J. Pettit
The virtual switch is the cornerstone of today's virtualized data center. As all traffic to and from virtual machines or containers must pass through a vSwitch, it is the ideal location for network configuration and policy enforcement. The bulk of Open vSwitch functionality is platform-agnostic and portable. However, the datapath, which touches every packet, is unique to each supported platform. Maintaining each datapath requires duplicated effort and the result has been inconsistent support of features across platforms. Even on a single platform, the features supported by a particular kernel version can vary. Further, datapath functionality must be broadly useful, which prevents having application-specific features in the fast path. eBPF, the extended Berkeley Packet Filter, enables userspace applications to customize and extend the Linux kernel's functionality. It provides flexible platform abstractions for network functions, and is being ported to a variety of platforms. This paper describes the design, implementation, and evaluation of an eBPF-based extensible OVS datapath. The eBPF OVS datapath delivers the equivalent functionality of the existing OVS kernel datapath, while significantly reducing development pain points around maintainability and extensibility. We demonstrate that these benefits don't necessarily have a trade-off in regards to performance, with the eBPF-based datapath showing negligible overhead compared to the existing kernel datapath.
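For readers unfamiliar with what a datapath does per packet, a toy flow-table sketch (conceptual Python, not the OVS kernel or eBPF code) shows the fast-path idea: match a flow key extracted from the packet and apply cached actions, punting misses to a slower userspace path that installs a new flow entry.

```python
from collections import namedtuple

FlowKey = namedtuple("FlowKey", ["in_port", "eth_dst", "ip_proto", "dst_port"])

flow_table = {}  # populated by the slow path after a miss

def slow_path(key: FlowKey):
    """Userspace consults the full pipeline/policy and installs a cached action."""
    action = ("output", 2) if key.dst_port == 80 else ("drop",)
    flow_table[key] = action
    return action

def fast_path(packet: dict):
    """Per-packet work in the datapath: exact-match lookup, else upcall."""
    key = FlowKey(packet["in_port"], packet["eth_dst"],
                  packet["ip_proto"], packet["dst_port"])
    return flow_table.get(key) or slow_path(key)

pkt = {"in_port": 1, "eth_dst": "aa:bb:cc:dd:ee:ff", "ip_proto": 6, "dst_port": 80}
print(fast_path(pkt))  # miss -> slow path installs the flow
print(fast_path(pkt))  # hit  -> served from the cached flow table
```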
Citations: 26
The P4_16 Programming Language
Pub Date : 2017-09-11 DOI: 10.1145/3139645.3139648
M. Budiu, C. Dodd
P4 is a language for expressing how packets are processed by the data-plane of a programmable network element such as a hardware or software switch, network interface card, router or network function appliance. This document describes the most recent version of the language, P4_16, and the reference implementation of the P4_16 compiler.
Citations: 51
Hybrid Cloud Storage: Bridging the Gap between Compute Clusters and Cloud Storage
Pub Date : 2017-09-11 DOI: 10.1145/3139645.3139653
Abhishek K. Gupta, Richard P. Spillane, Wenguang Wang, Maxime Austruy, Vahid Fereydouny, C. Karamanolis
Thanks to the compelling economics of public cloud storage, the trend in the IT industry is to move the bulk of analytics and application data to services such as AWS S3 and Google Cloud Storage. At the same time, customers want to continue accessing and analyzing much of that data using applications that run on compute clusters that may reside either on public clouds or on-premise. For VMware customers, those clusters run vSphere (sometimes with vSAN) on-premise and in the future may utilize SDDCaaS. Cloud storage exhibits high latencies and it is not appropriate for direct use by applications. A key challenge for these use cases is determining the subset of the typically huge data sets that need to be moved into the primary storage tier of the compute clusters. This paper introduces a novel approach for creating a hybrid cloud storage that allows customers to utilize the fast primary storage of their compute clusters as a caching tier in front of a slow secondary storage tier. This approach can be completely transparent requiring no changes to the application. To achieve this, we extended VDFS [16], a POSIX-compliant scale-out filesystem, with the concept of caching-tier volumes. VDFS caching-tier volumes resemble regular file system volumes, but they fault-in data from a cloud storage back-end on first access. Cached data are persisted on fast primary storage, close to the compute cluster, like VMware's vSAN. Caching-tier volumes use a write-back approach. The enterprise features of the primary storage ensure the persistence and fault tolerance of new or updated data. Write-back from the primary to cloud storage is managed using an efficient change-tracking mechanism built into VDFS called exo-clones [18]. This paper outlines the architecture and implementation of caching tier volumes on VDFS and reports on an initial evaluation of the current prototype.
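A minimal write-back cache sketch (illustrative, not the VDFS implementation): reads fault data in from the cloud back-end on first access, writes land on fast primary storage and are marked dirty, and a later write-back step pushes dirty blocks to the back-end.

```python
class CachingTierVolume:
    """Toy write-back caching tier in front of a slow cloud back-end."""

    def __init__(self, backend: dict):
        self.backend = backend   # slow secondary storage (e.g. object store), keyed by block id
        self.cache = {}          # fast primary storage (e.g. vSAN) holding cached blocks
        self.dirty = set()       # blocks written locally but not yet pushed back

    def read(self, block: str) -> bytes:
        if block not in self.cache:                 # first access: fault in from the back-end
            self.cache[block] = self.backend[block]
        return self.cache[block]

    def write(self, block: str, data: bytes) -> None:
        self.cache[block] = data                    # persisted on primary storage immediately
        self.dirty.add(block)                       # write-back to the back-end happens later

    def write_back(self) -> None:
        """Push accumulated changes to cloud storage (VDFS tracks these via exo-clones)."""
        for block in sorted(self.dirty):
            self.backend[block] = self.cache[block]
        self.dirty.clear()

vol = CachingTierVolume(backend={"b0": b"cold data"})
print(vol.read("b0"))         # faults in from the back-end
vol.write("b0", b"new data")  # fast local write, marked dirty
vol.write_back()              # later, pushed to cloud storage
print(vol.backend["b0"])
```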
Citations: 2
OS Support for Adaptive Components in Self-aware Systems
Pub Date : 2017-09-11 DOI: 10.1145/3139645.3139663
João Gabriel Reis, A. A. Fröhlich
The current pace of innovation in computing makes it difficult to assume a fixed set of requirements for the whole life span of a system. Aggressive technology scaling also imposes additional constraints on modern hardware platforms. An answer to this problem is self-aware systems, which are capable of autonomously sensing and actuating upon themselves to cope with varying requirements. In this paper, we discuss the design and implementation of adaptive components in this scenario from the perspective of the OS. Components can exist in multiple flavors that can be dynamically chosen according to current demands. The proposed framework supports this variability for components while preserving their interface contracts, even if flavors exist in different domains (software, hardware, remote). The synthesis process delivers tailored wrappers for components according to their flavors. Besides reconfiguration, we also support adaptations through dynamic power management and task remapping. The framework also supports component designers in terms of sensing via an event-based mechanism. The framework is validated through a case with three adaptive components in a telecommunication switch (AES, ADPCM, and DTMF) with little overhead both in terms of execution time and memory/silicon consumption.
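The flavor idea can be sketched as one interface contract with interchangeable implementations selected at runtime; the class names and the selection rule below are hypothetical, not the paper's framework.

```python
from abc import ABC, abstractmethod

class AESComponent(ABC):
    """Single interface contract; flavors differ only in where and how they run."""
    @abstractmethod
    def encrypt(self, block: bytes) -> bytes: ...

class SoftwareAES(AESComponent):
    def encrypt(self, block: bytes) -> bytes:
        return bytes(b ^ 0x5A for b in block)   # placeholder for a software cipher

class HardwareAES(AESComponent):
    def encrypt(self, block: bytes) -> bytes:
        return bytes(b ^ 0x5A for b in block)   # placeholder: would offload to an accelerator

def select_flavor(load: float, accelerator_free: bool) -> AESComponent:
    """Hypothetical selection rule: offload when the CPU is busy and hardware is idle."""
    return HardwareAES() if (load > 0.8 and accelerator_free) else SoftwareAES()

component = select_flavor(load=0.9, accelerator_free=True)
print(type(component).__name__, component.encrypt(b"data"))
```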
Citations: 0
Research at VMware
Pub Date : 2017-09-11 DOI: 10.1145/3139645.3139647
D. Tennenhouse
VMware has its roots in the academic research community, starting with the commercialization of the work on x86 virtualization of Prof. Mendel Rosenblum and his team at Stanford University [1]. Developers embraced VMware's original workstation product and the ensuing work on server virtualization led to today's vSphere platform, which has enabled significant server consolidation, numerous operational benefits, and isolation-based security. In addition, the vast improvements in server utilization provide VMware's customers with significant cost savings and is a key contributor to the environmental sustainability of modern data centers [2]. VMware has remained true to its research roots, with a strong engineering culture that emphasizes grassroots innovation through hackathons, incubation projects, open source activities, seminars and RADIO, an annual R&D innovation offsite that brings together a substantial fraction of the company's developers. Just a few examples of current activities are Open vSwitch (OVS), the virtualization and exploration of non-volatile memory (NVM), securing and managing the Internet of Things (IoT), and support for Containers. Over time, there has been a dramatic increase in the scope for innovation at VMware. This paper provides an overview of how that scope has grown and how it has expanded the range of relevant research opportunities along with a description of VMware's recently formed research group, including its mission, composition and significant research thrusts.
Citations: 0
A Hypervisor Approach to Enable Live Migration with Passthrough SR-IOV Network Devices
Pub Date : 2017-09-11 DOI: 10.1145/3139645.3139649
Xin Xu, Bhavesh Davda
Single-Root I/O Virtualization (SR-IOV) is a specification that allows a single PCI Express (PCIe) device (physical function or PF) to be used as multiple PCIe devices (virtual functions or VFs). In a virtualization system, each VF can be directly assigned to a virtual machine (VM) in passthrough mode to significantly improve network performance. However, VF passthrough mode is not compatible with live migration, which is an essential capability that enables many advanced virtualization features such as high availability and resource provisioning. To solve this problem, we design SRVM, which provides hypervisor support to ensure the VF device can be correctly used by the migrated VM and the applications. SRVM is implemented in the hypervisor without modification in guest operating systems or guest VM drivers. SRVM does not increase VM downtime. It only costs limited resources (an extra CPU core, used only during the live migration pre-copy phase), and there is no significant runtime overhead in VM network performance.
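The pre-copy phase mentioned here is the standard iterative memory copy used in live migration; a generic sketch (not SRVM's implementation) is below: copy all guest pages once, then repeatedly copy only the pages dirtied in the meantime until the remainder is small enough for a brief stop-and-copy.

```python
def precopy_migrate(guest_pages: dict, dirtied_per_round: list, stop_copy_limit: int = 32):
    """Generic pre-copy loop: returns the page sets sent each round plus the final stop-and-copy set."""
    rounds = [set(guest_pages)]          # round 0: copy every guest page while the VM keeps running
    for dirtied in dirtied_per_round:    # pages the still-running VM dirties during each round
        if len(dirtied) <= stop_copy_limit:
            return rounds, set(dirtied)  # small enough: pause the VM briefly and copy the rest
        rounds.append(set(dirtied))      # otherwise, send the dirty pages and iterate again
    return rounds, set()

pages = {f"p{i}": b"\x00" for i in range(1000)}
dirty_trace = [{f"p{i}" for i in range(200)}, {f"p{i}" for i in range(40)}, {"p1", "p2"}]
sent, final = precopy_migrate(pages, dirty_trace)
print([len(r) for r in sent], len(final))   # [1000, 200, 40] then 2 pages at stop-and-copy
```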
Citations: 4