BASE: using abstraction to improve fault tolerance. R. Rodrigues, M. Castro, B. Liskov. Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles (SOSP 2001), October 21, 2001. doi:10.1145/502034.502037

Software errors are a major cause of outages, and they are increasingly exploited in malicious attacks. Byzantine fault tolerance allows replicated systems to mask some software errors, but it is expensive to deploy. This paper describes a replication technique, BASE, which uses abstraction to reduce the cost of Byzantine fault tolerance and to improve its ability to mask software errors. BASE reduces cost because it enables reuse of off-the-shelf service implementations. It improves availability because each replica can be repaired periodically using an abstract view of the state stored by correct replicas, and because each replica can run a distinct or non-deterministic service implementation, which reduces the probability of common-mode failures. We built an NFS service where each replica can run a different off-the-shelf file system implementation, and an object-oriented database where the replicas run the same non-deterministic implementation. These examples suggest that our technique can be used in practice: in both cases the implementation required only a modest amount of new code, and our performance results indicate that the replicated services perform comparably to the implementations they reuse.
SEDA: an architecture for well-conditioned, scalable internet services. M. Welsh, D. Culler, E. Brewer. Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles (SOSP 2001), October 21, 2001. doi:10.1145/502034.502057

We propose a new design for highly concurrent Internet services, which we call the staged event-driven architecture (SEDA). SEDA is intended to support massive concurrency demands and to simplify the construction of well-conditioned services. In SEDA, applications consist of a network of event-driven stages connected by explicit queues. This architecture allows services to be well conditioned to load, preventing resources from being overcommitted when demand exceeds service capacity. SEDA uses a set of dynamic resource controllers to keep stages within their operating regime despite large fluctuations in load. We describe several control mechanisms for automatic tuning and load conditioning, including thread pool sizing, event batching, and adaptive load shedding. We present the SEDA design and an implementation of an Internet services platform based on this architecture, and we evaluate SEDA through two applications: a high-performance HTTP server and a packet router for the Gnutella peer-to-peer file-sharing network. The results show that SEDA applications exhibit higher performance than traditional service designs and are robust to huge variations in load.
Anticipatory scheduling: a disk scheduling framework to overcome deceptive idleness in synchronous I/O. Sitaram Iyer, P. Druschel. Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles (SOSP 2001), October 21, 2001. doi:10.1145/502034.502046

Disk schedulers in current operating systems are generally work-conserving: they schedule a request as soon as the previous request has finished. Such schedulers often require multiple outstanding requests from each process to meet system-level goals of performance and quality of service. Unfortunately, many common applications issue disk read requests synchronously, interspersing successive requests with short periods of computation. The scheduler then chooses the next request too early, inducing deceptive idleness: a condition in which the scheduler incorrectly assumes that the process that issued the last request has no further requests, and is forced to switch to a request from another process. We propose the anticipatory disk scheduling framework to solve this problem in a simple, general, and transparent way, based on a non-work-conserving scheduling discipline. Our FreeBSD implementation yields large benefits on a range of microbenchmarks and real workloads. The Apache web server delivers between 29% and 71% more throughput on a disk-intensive workload. The Andrew file system benchmark runs 8% faster, due to a 54% speedup in its read-intensive phase. Variants of the TPC-B database benchmark improve by between 2% and 60%. Proportional-share schedulers achieve their contracts accurately and efficiently.
Real-time dynamic voltage scaling for low-power embedded operating systems. P. Pillai, K. Shin. Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles (SOSP 2001), October 21, 2001. doi:10.1145/502034.502044

In recent years, non-traditional computing platforms, especially mobile and portable devices, have spread rapidly and widely. As applications become increasingly sophisticated and processing power increases, the most serious limitation on these devices is available battery life. Dynamic voltage scaling (DVS) has been a key technique for exploiting the hardware characteristics of processors to reduce energy dissipation by lowering the supply voltage and operating frequency. DVS algorithms have been shown to achieve dramatic energy savings while providing the necessary peak computation power in general-purpose systems. However, for a large class of applications in embedded real-time systems, such as cellular phones and camcorders, the variable operating frequency interferes with deadline-guarantee mechanisms, and DVS in this context, despite its growing importance, remains largely overlooked and under-developed. To provide real-time guarantees, DVS must take account of the deadlines and periodicity of real-time tasks, which requires integration with the real-time scheduler. In this paper, we present a class of novel algorithms, called real-time DVS (RT-DVS), that modify the OS's real-time scheduler and task management service to provide significant energy savings while maintaining real-time deadline guarantees. We show through simulations and a working prototype implementation that these RT-DVS algorithms closely approach the theoretical lower bound on energy consumption, and can easily reduce energy consumption by 20% to 40% in an embedded real-time system.
Wide-area cooperative storage with CFS. F. Dabek, M. Kaashoek, David R. Karger, R. Morris, I. Stoica. Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles (SOSP 2001), October 21, 2001. doi:10.1145/502034.502054
The Cooperative File System (CFS) is a new peer-to-peer read-only storage system that provides provable guarantees for the efficiency, robustness, and load balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers provide a distributed hash table (DHash) for block storage, and CFS clients interpret DHash blocks as a file system. DHash distributes and caches blocks at a fine granularity to achieve load balance, uses replication for robustness, and decreases latency with server selection. DHash finds blocks using the Chord location protocol, which operates in time logarithmic in the number of servers. CFS is implemented using the SFS file system toolkit and runs on Linux, OpenBSD, and FreeBSD. Experience with a globally deployed prototype shows that CFS delivers data to clients as fast as FTP. Controlled tests show that CFS is scalable: with 4,096 servers, looking up a block of data involves contacting only seven servers. The tests also demonstrate nearly perfect robustness and unimpaired performance even when as many as half the servers fail.
{"title":"Wide-area cooperative storage with CFS","authors":"F. Dabek, M. Kaashoek, David R Karger, R. Morris, I. Stoica","doi":"10.1145/502034.502054","DOIUrl":"https://doi.org/10.1145/502034.502054","url":null,"abstract":"The Cooperative File System (CFS) is a new peer-to-peer read-only storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers provide a distributed hash table (DHash) for block storage. CFS clients interpret DHash blocks as a file system. DHash distributes and caches blocks at a fine granularity to achieve load balance, uses replication for robustness, and decreases latency with server selection. DHash finds blocks using the Chord location protocol, which operates in time logarithmic in the number of servers.CFS is implemented using the SFS file system toolkit and runs on Linux, OpenBSD, and FreeBSD. Experience on a globally deployed prototype shows that CFS delivers data to clients as fast as FTP. Controlled tests show that CFS is scalable: with 4,096 servers, looking up a block of data involves contacting only seven servers. The tests also demonstrate nearly perfect robustness and unimpaired performance even when as many as half the servers fail.","PeriodicalId":263344,"journal":{"name":"Proceedings of the eighteenth ACM symposium on Operating systems principles","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126779214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}