
Proceedings of the -- USENIX Symposium on Operating Systems Design and Implementation (OSDI). USENIX Symposium on Operating Systems Design and Implementation: Latest Publications

A comparison of Windows driver model latency performance on Windows NT and Windows 98
Erik Cota-Robles, J. P. Held
Windows 98 and NT share a common driver model known as WDM (Windows Driver Model) and carefully designed drivers can be binary portable. We compare the performance of Windows 98 and Windows NT 4.0 under load from office, multimedia and engineering applications on a personal computer (PC) of modest power that is free of legacy hardware. We report our observations using a complementary pair of system performance measures, interrupt and thread latency, that capture the ability of the OS to support multimedia and real-time workloads in a way that traditional throughput-based performance measures miss. We use the measured latency distributions to evaluate the quality of service that a WDM driver can expect to receive on both OSs, irrespective of whether the driver uses thread-based or interrupt-based processing. We conclude that for real-time applications a driver on Windows NT 4.0 that uses high, real-time priority threads receives an order of magnitude better service than a similar WDM driver on Windows 98 that uses Deferred Procedure Calls, a form of interrupt processing. With the increase in multimedia and other real-time processing on PCs the interrupt and thread latency metrics have become as important as the throughput metrics traditionally used to measure performance.
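A rough, hedged illustration of the thread-latency half of such a measurement, not the authors' instrumentation: the C sketch below raises a thread to real-time priority on Windows and records how late each requested wakeup arrives. The period, sample count, and use of Sleep() as the blocking primitive are assumptions.

```c
/* Hedged sketch: measure thread wakeup lateness on Windows at real-time
 * priority.  Not the paper's instrumentation; period, sample count, and
 * the use of Sleep() as the blocking primitive are assumptions. */
#include <windows.h>
#include <stdio.h>

#define SAMPLES   1000
#define PERIOD_MS 10

int main(void)
{
    LARGE_INTEGER freq, t0, t1;
    double worst_ms = 0.0, sum_ms = 0.0;
    int i;

    QueryPerformanceFrequency(&freq);
    /* Ask for real-time scheduling for this process and thread. */
    SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS);
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);

    for (i = 0; i < SAMPLES; i++) {
        QueryPerformanceCounter(&t0);
        Sleep(PERIOD_MS);                         /* request a wakeup */
        QueryPerformanceCounter(&t1);

        double elapsed_ms = 1000.0 * (double)(t1.QuadPart - t0.QuadPart)
                                   / (double)freq.QuadPart;
        double late_ms = elapsed_ms - PERIOD_MS;  /* wakeup lateness */
        if (late_ms > worst_ms)
            worst_ms = late_ms;
        sum_ms += late_ms;
    }
    printf("avg lateness %.3f ms, worst %.3f ms\n", sum_ms / SAMPLES, worst_ms);
    return 0;
}
```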
DOI: 10.1145/296806.296823 | Pages: 159-172 | Published: 1999-02-22
Citations: 39
Interface and execution models in the Fluke kernel
B. Ford, Mike Hibler, Jay Lepreau, R. McGrath, Patrick Tullmann
We have defined and implemented a kernel API that makes every exported operation fully interruptible and restartable, thereby appearing atomic to the user. To achieve interruptibility, all possible kernel states in which a thread may become blocked for a long time are represented as kernel system calls, without requiring the kernel to retain any unexposable internal state. Since all kernel operations appear atomic, services such as transparent checkpointing and process migration that need access to the complete and consistent state of a process can be implemented by ordinary user-mode processes. Atomic operations also enable applications to provide reliability in a more straightforward manner. This API also allows us to explore novel kernel implementation techniques and to evaluate existing techniques. The Fluke kernel's single source implements either the process or the interrupt execution model on both uniprocessors and multiprocessors, depending on a configuration option affecting a small amount of code. We report preliminary measurements comparing fully, partially and non-preemptible configurations of both process and interrupt model implementations. We find that the interrupt model has a modest performance advantage in some benchmarks, maximum preemption latency varies nearly three orders of magnitude, average preemption latency varies by a factor of six, and memory use favors the interrupt model as expected, but not by a large amount. We find that the overhead for restarting the most costly kernel operation ranges from 2-8% of the cost of the operation.
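The restartability property can be pictured with the familiar user-level retry idiom below. This is only an analogy in portable C (POSIX read() and EINTR, not the Fluke API), but it shows the invariant the kernel exports: an interrupted operation leaves no hidden internal state, so it can simply be reissued.

```c
/* Analogy only: retrying an interrupted operation that exposes no partial
 * internal state.  Uses POSIX read()/EINTR, not the Fluke kernel API. */
#include <errno.h>
#include <unistd.h>

ssize_t read_restartable(int fd, void *buf, size_t len)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;              /* completed (possibly partially) */
        if (errno != EINTR)
            return -1;             /* real error */
        /* Interrupted before any data was transferred: all visible state
         * lives in the arguments, so the call can simply be restarted. */
    }
}
```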
DOI: 10.1145/296806.296815 | Pages: 101-115 | Published: 1999-02-22
Citations: 77
ETI resource distributor: guaranteed resource allocation and scheduling in multimedia systems
M. Baker-Harvey
Multimedia processors offer a programmable, cost-effective way to provide multimedia functionality in environments previously serviced by fixed-function hardware and digital signal processors. Achieving acceptable performance requires that the multimedia processor's software emulate hardware devices. There are stringent requirements on the operating system scheduler of a multimedia processor. First, when a user starts a task believing it to be backed by hardware, the system cannot terminate that task. The task must continue to run as if the hardware were present. Second, optimizing the Quality of Service (QOS) requires that tasks use all available system resources. Third, QOS decisions must be made globally, and in the interests of the user, if system overload occurs. No previously existing scheduler meets all these requirements. The Equator Technologies, Inc. (ETI) Resource Distributor guarantees scheduling for admitted tasks: the delivery of resources is not interrupted even if the system is overloaded. The Scheduler delivers resources to applications in units known to be useful for achieving a specific level of service quality. This promotes better utilization of system resources and a higher perceived QOS. When QOS degradations are required, the Resource Distributor never makes inadvertent or implicit policy decisions: policy must be explicitly specified by the user. While providing superior services for periodic real-time applications, the Resource Distributor also guarantees liveness for applications that are not real-time. Support for real-time applications that do not require continuous resource use is integrated: it neither interferes with the scheduling guarantees of other applications nor ties up resources that could be used by other applications.
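Guaranteed delivery of this kind normally rests on an admission test: a task is admitted only if its reservation still fits within capacity, and an admitted reservation is never revoked. The C sketch below is a generic utilization-based admission check, not ETI's actual algorithm; the fractional-share model and the 1.0 capacity bound are assumptions.

```c
/* Hypothetical admission test for a reservation-based scheduler.
 * Not the ETI algorithm: the fractional-share model is an assumption. */
#include <stdbool.h>
#include <stddef.h>

struct reservation {
    double share;   /* fraction of the resource reserved, 0.0 .. 1.0 */
};

/* Admit the candidate only if the sum of all reserved shares, including
 * the candidate's, still fits in the machine.  Admitted shares are never
 * revoked, so overload can only be refused at admission time. */
bool admit(const struct reservation *admitted, size_t n,
           const struct reservation *candidate)
{
    double used = candidate->share;
    size_t i;
    for (i = 0; i < n; i++)
        used += admitted[i].share;
    return used <= 1.0;
}
```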
DOI: 10.1145/296806.296819 | Pages: 131-144 | Published: 1999-02-22
Citations: 9
Logical vs. physical file system backup
N. Hutchinson, S. Manley, Mike Federwisch, Guy Harris, D. Hitz, S. Kleiman, S. O'Malley
As file systems grow in size, ensuring that data is safely stored becomes more and more difficult. Historically, file system backup strategies have focused on logical backup where files are written in their entirety to the backup media. An alternative is physical backup where the disk blocks that make up the file system are written to the backup media. This paper compares logical and physical backup strategies in large file systems. We discuss the advantages and disadvantages of the two approaches, and conclude by showing that while both can achieve good performance, physical backup and restore can achieve much higher throughput while consuming less CPU. In addition, physical backup and restore is much more capable of scaling its performance as more devices are added to a system.
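The distinction can be made concrete: a logical dump walks the namespace and writes each file, while a physical dump streams the raw blocks of the underlying device. The C sketch below shows only the physical side, copying a device to a backup image with pread(); the device path, block size, and lack of consistency handling are simplifying assumptions.

```c
/* Minimal sketch of a physical (block-level) backup pass.
 * Device path, block size, and error handling are simplified assumptions. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK_SIZE 65536

int main(void)
{
    int in = open("/dev/sda1", O_RDONLY);           /* hypothetical device */
    int out = open("backup.img", O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (in < 0 || out < 0) { perror("open"); return 1; }

    char *buf = malloc(BLOCK_SIZE);
    off_t offset = 0;
    ssize_t n;
    /* Stream the file system's blocks sequentially, ignoring its structure. */
    while ((n = pread(in, buf, BLOCK_SIZE, offset)) > 0) {
        if (write(out, buf, (size_t)n) != n) { perror("write"); return 1; }
        offset += n;
    }
    free(buf);
    close(in);
    close(out);
    return 0;
}
```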
DOI: 10.1145/296806.296835 | Pages: 239-249 | Published: 1999-02-22
Citations: 68
Defending against denial of service attacks in Scout
O. Spatscheck, L. Peterson
We describe a two-dimensional architecture for defending against denial of service attacks. In one dimension, the architecture accounts for all resources consumed by each I/O path in the system; this accounting mechanism is implemented as an extension to the path object in the Scout operating system. In the second dimension, the various modules that define each path can be configured in separate protection domains; we implement hardware enforced protection domains, although other implementations are possible. The resulting system, which we call Escort, is the first example of a system that simultaneously does end-to-end resource accounting (thereby protecting against resource-based denial of service attacks where principals can be identified) and supports multiple protection domains (thereby allowing untrusted modules to be isolated from each other). The paper describes the Escort architecture and its implementation in Scout, and reports a collection of experiments that measure the costs and benefits of using Escort to protect a web server from denial of service attacks.
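The first dimension, per-path accounting, amounts to charging every resource a request consumes to the I/O path that consumed it and refusing further service once the path exceeds its limits. The C sketch below is a generic illustration of that bookkeeping, not Scout/Escort code; the resource categories and limits are assumptions.

```c
/* Illustrative per-path resource accounting (not Scout/Escort code).
 * The resource categories and limits are assumptions. */
#include <stdbool.h>

struct path_account {
    long cpu_us_used, cpu_us_limit;    /* CPU time charged to this path  */
    long bytes_used,  bytes_limit;     /* buffer memory held by the path */
};

/* Charge a cost to the path; refuse further service once over budget. */
bool path_charge(struct path_account *p, long cpu_us, long bytes)
{
    if (p->cpu_us_used + cpu_us > p->cpu_us_limit ||
        p->bytes_used  + bytes  > p->bytes_limit)
        return false;                  /* deny: path exhausted its budget */
    p->cpu_us_used += cpu_us;
    p->bytes_used  += bytes;
    return true;
}
```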
DOI: 10.1145/296806.296811 | Pages: 59-72 | Published: 1999-02-22
Citations: 162
Integrating content-based access mechanisms with hierarchical file systems
B. Gopal, U. Manber
We describe a new file system that provides, at the same time, both name and content based access to files. To make this possible, we introduce the concept of a semantic directory. Every semantic directory has a query associated with it. When a user creates a semantic directory, the file system automatically creates a set of pointers to the files in the file system that satisfy the query associated with the directory. This set of pointers is called the query-result of the directory. To access the files that satisfy the query, users just need to de-reference the appropriate pointers. Users can also create files and sub-directories within semantic directories in the usual way. Hence, users can organize files in a hierarchy and access them by specifying path names, and at the same time, retrieve files by asking queries that describe their content. Our file system also provides facilities for query-refinement and customization. When a user creates a new semantic sub-directory within a semantic directory, the file system ensures that the query-result of the sub-directory is a subset of the query-result of its parent. Hence, users can create a hierarchy of semantic directories to refine their queries. Users can also edit the set of pointers in a semantic directory, and thereby modify its query-result without modifying its query or the files in the file system. In this way, users can customize the results of queries according to their personal tastes, and use customized results to refine queries in the future. That is, users do not have to depend solely on the query language to achieve these objectives. Our file system has many other features, including semantic mount-points that allow users to access information in other file systems by content. The file system does not depend on the query language used for content-based access. Hence, it is possible to integrate any content-based access mechanism into our file system.
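A semantic directory pairs a query with a materialized query-result, and a semantic sub-directory's result must be a subset of its parent's. The C sketch below models that refinement step with a predicate over file names; the types and the predicate interface are hypothetical, not the paper's implementation.

```c
/* Hypothetical model of semantic-directory refinement: a sub-directory's
 * query-result is built as a subset of its parent's query-result. */
#include <stdbool.h>
#include <stddef.h>

typedef bool (*query_fn)(const char *path);   /* content-based predicate */

/* Keep only the parent's entries that also satisfy the child's query, so
 * the child's result is a subset of the parent's by construction. */
size_t refine(const char **parent_result, size_t n,
              query_fn child_query, const char **child_result)
{
    size_t i, kept = 0;
    for (i = 0; i < n; i++)
        if (child_query(parent_result[i]))
            child_result[kept++] = parent_result[i];
    return kept;
}
```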
DOI: 10.1145/296806.296838 | Pages: 265-278 | Published: 1999-02-22
Citations: 131
Tornado: maximizing locality and concurrency in a shared memory multiprocessor operating system
Benjamin Gamsa, O. Krieger, J. Appavoo, M. Stumm
We describe the design and implementation of Tornado, a new operating system designed from the ground up specifically for today's shared memory multiprocessors. The need for improved locality in the operating system is growing as multiprocessor hardware evolves, increasing the costs for cache misses and sharing, and adding complications due to NUMAness. Tornado is optimized so that locality and independence in application requests for operating system services, whether from multiple sequential applications or a single parallel application, are mapped onto locality and independence in the servicing of these requests in the kernel and system servers. By contrast, previous shared memory multiprocessor operating systems all evolved from designs constructed at a time when sharing costs were low, memory latency was low and uniform, and caches were small; for these systems, concurrency was the main performance concern and locality was not an important issue. Tornado achieves this locality by starting with an object-oriented structure, where every virtual and physical resource is represented by an independent object. Locality, as well as concurrency, is further enhanced with the introduction of three key innovations: (i) clustered objects that support the partitioning of contended objects across processors, (ii) a protected procedure call facility that preserves the locality and concurrency of IPCs, and (iii) a new locking strategy that allows all locking to be encapsulated within the objects being protected and greatly simplifies the overall locking protocols. As a result of these techniques, Tornado has far better performance characteristics, particularly for multithreaded applications, than existing commercial operating systems. Tornado has been fully implemented and runs both on Toronto's NUMAchine hardware and on the SimOS simulator.
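The clustered-object idea is that one object reference fans out to a per-processor representative, so common operations touch only local memory and need no shared lock. The C sketch below is an assumed, much simplified rendering of that indirection (per-CPU counter representatives behind one handle), not Tornado's object system.

```c
/* Assumed sketch of a clustered object: one logical counter whose
 * representatives are per-processor, so updates stay CPU-local. */
#define MAX_CPUS 64

struct counter_rep { long value; };           /* one representative per CPU */
struct clustered_counter { struct counter_rep rep[MAX_CPUS]; };

/* Fast path: each CPU updates only its own representative (local memory,
 * no shared lock).  cpu_id would come from the kernel's "current CPU". */
static void counter_inc(struct clustered_counter *c, int cpu_id)
{
    c->rep[cpu_id].value++;
}

/* Slow path: a rare global read combines all representatives. */
static long counter_read(const struct clustered_counter *c, int ncpus)
{
    long total = 0;
    int i;
    for (i = 0; i < ncpus; i++)
        total += c->rep[i].value;
    return total;
}
```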
DOI: 10.1145/296806.296814 | Pages: 87-100 | Published: 1999-02-22
Citations: 217
Optimizing the idle task and other MMU tricks
C. Dougan, P. Mackerras, Victor Yodaiken
In highly cached and pipelined machines, operating system performance, and aggregate user/system performance, is enormously sensitive to small changes in cache and TLB hit rates. We have implemented a variety of changes in the memory management of a native port of the Linux operating system to the PowerPC architecture in an effort to improve performance. Our results show that careful design to minimize the OS caching footprint, to shorten critical code paths in page fault handling, and to otherwise take full advantage of the memory management hardware can have dramatic effects on performance. Our results also show that the operating system can intelligently manage MMU resources as well or better than hardware can and suggest that complex hardware MMU assistance may not be the most appropriate use of scarce chip area. Comparative benchmarks show that our optimizations result in kernel performance that is significantly better than other monolithic kernels for the same architecture and highlight the distance that micro-kernel designs will have to travel to approach the performance of a reasonably efficient monolithic kernel.
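One concrete way an idle task can shorten the page-fault path, consistent with the title but with details that are assumptions rather than the paper's code, is to pre-zero free pages while the CPU would otherwise be idle, so the fault handler can hand out a ready page without clearing it on the critical path.

```c
/* Assumed illustration of idle-time page pre-zeroing; not the paper's code. */
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

struct page { struct page *next; unsigned char data[PAGE_SIZE]; };

static struct page *free_list;     /* pages of unknown content */
static struct page *zeroed_list;   /* pages already cleared    */

/* Run from the idle loop: convert free pages into zeroed pages so the
 * page-fault handler's critical path can skip the memset. */
void idle_prezero_one(void)
{
    struct page *p = free_list;
    if (!p) return;
    free_list = p->next;
    memset(p->data, 0, PAGE_SIZE);
    p->next = zeroed_list;
    zeroed_list = p;
}

/* Fault path: prefer an already-zeroed page, falling back to clearing one. */
struct page *alloc_zeroed_page(void)
{
    struct page *p = zeroed_list;
    if (p) { zeroed_list = p->next; return p; }
    p = free_list;
    if (!p) return NULL;
    free_list = p->next;
    memset(p->data, 0, PAGE_SIZE);  /* slow path */
    return p;
}
```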
DOI: 10.1145/296806.296833 | Pages: 229-237 | Published: 1999-02-22
Citations: 21
The design of a multicast-based distributed file system
B. Grönvall, A. Westerlund, S. Pink
JetFile is a distributed file system designed to support shared file access in a heterogeneous environment such as the Internet. It uses multicast communication and optimistic strategies for synchronization and distribution. JetFile relies on peer-to-peer communication over multicast channels. Most of the traditional file server responsibilities have been decentralized. In particular, the more heavyweight operations such as serving file data and attributes are, in our system, the responsibility of the clients. Some functions such as serializing file updates are still centralized in JetFile. Since serialization is a relatively lightweight operation in our system, serialization is expected to have only minor impact on scalability. We have implemented parts of the JetFile design and have measured its performance over a local-area network and an emulated wide-area network. Our measurements indicate that, using a standard benchmark, JetFile performance is comparable to that of local-disk based file systems. This means it is considerably faster than commonly used distributed file systems such as NFS and AFS.
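Peer-to-peer distribution over multicast channels means a client asks for a file on a group address and whichever peer holds current data can answer. The C sketch below only shows joining such a group and multicasting a request with standard IPv4 sockets; the group address, port, and message format are assumptions, not JetFile's protocol.

```c
/* Assumed sketch: join an IPv4 multicast group and send a file request.
 * Group address, port, and message format are not JetFile's protocol. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0) { perror("socket"); return 1; }

    /* Join the (hypothetical) file-request group so replies can be heard. */
    struct ip_mreq mreq;
    mreq.imr_multiaddr.s_addr = inet_addr("239.1.2.3");
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

    /* Multicast a request; any peer with a current copy may respond. */
    struct sockaddr_in group;
    memset(&group, 0, sizeof(group));
    group.sin_family = AF_INET;
    group.sin_addr.s_addr = inet_addr("239.1.2.3");
    group.sin_port = htons(4321);
    const char *req = "GET /home/user/file.txt";
    sendto(s, req, strlen(req), 0, (struct sockaddr *)&group, sizeof(group));

    close(s);
    return 0;
}
```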
DOI: 10.1145/296806.296837 | Pages: 251-264 | Published: 1999-02-22
Citations: 63
IO-lite: a unified I/O buffering and caching system
Vivek S. Pai, P. Druschel, W. Zwaenepoel
DOI: 10.1145/296806.296808 | Pages: 15-28 | Published: 1999-01-01
Citations: 5