ACM Transactions on Computer Systems: Latest Articles, Page 6

The S2E Platform: Design, Implementation, and Applications

IF 1.5, Zone 4 Computer Science, Q2 COMPUTER SCIENCE, THEORY & METHODS

ACM Transactions on Computer Systems

Pub Date : 2012-02-01 DOI: 10.1145/2110356.2110358

Vitaly Chipounov, Volodymyr Kuznetsov, George Candea

This article presents S2E, a platform for analyzing the properties and behavior of software systems, along with its use in developing tools for comprehensive performance profiling, reverse engineering of proprietary software, and automated testing of kernel-mode and user-mode binaries. Conceptually, S2E is an automated path explorer with modular path analyzers: the explorer uses a symbolic execution engine to drive the target system down all execution paths of interest, while analyzers measure and/or check properties of each such path. S2E users can either combine existing analyzers to build custom analysis tools, or they can directly use S2E’s APIs. S2E’s strength is the ability to scale to large systems, such as a full Windows stack, using two new ideas: selective symbolic execution, a way to automatically minimize the amount of code that has to be executed symbolically given a target analysis, and execution consistency models, a way to make principled performance/accuracy trade-offs during analysis. These techniques give S2E three key abilities: to simultaneously analyze entire families of execution paths instead of just one execution at a time; to perform the analyses in-vivo within a real software stack (user programs, libraries, kernel, drivers, etc.) instead of using abstract models of these layers; and to operate directly on binaries, thus being able to analyze even proprietary software.
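
The "path explorer with modular path analyzers" structure described above can be illustrated with a toy sketch. This is not S2E's API and it performs no symbolic execution: the explorer below simply re-runs a small program once per sequence of branch decisions, and every name in it (`Chooser`, `explore`, `toy_driver`, `bug_finder`) is hypothetical.

```python
from typing import Callable, List, Tuple

Path = Tuple[bool, ...]


class Chooser:
    """Replays a fixed prefix of branch decisions, then defaults to True."""

    def __init__(self, prefix: Path) -> None:
        self.prefix = prefix
        self.taken: List[bool] = []

    def branch(self) -> bool:
        i = len(self.taken)
        decision = self.prefix[i] if i < len(self.prefix) else True
        self.taken.append(decision)
        return decision


def explore(program: Callable, analyzers: List[Callable]) -> List[Tuple[Path, str]]:
    """Run `program` once per distinct decision sequence (toy explorer)."""
    results = []
    worklist: List[Path] = [()]
    while worklist:
        prefix = worklist.pop()
        chooser = Chooser(prefix)
        outcome = program(chooser.branch)
        path = tuple(chooser.taken)
        # Fork: for every decision made beyond the replayed prefix,
        # schedule the path that takes the other side of that branch.
        for i in range(len(prefix), len(path)):
            worklist.append(path[:i] + (False,))
        results.append((path, outcome))
        for analyze in analyzers:  # analyzers inspect each explored path
            analyze(path, outcome)
    return results


def toy_driver(branch: Callable[[], bool]) -> str:
    """Hypothetical target with two input-dependent branches."""
    if branch():          # e.g. "device present?"
        if branch():      # e.g. "buffer large enough?"
            return "ok"
        return "overflow"
    return "no-device"


bugs: List[Path] = []


def bug_finder(path: Path, outcome: str) -> None:
    """A minimal analyzer: record every path that ends in an overflow."""
    if outcome == "overflow":
        bugs.append(path)


results = explore(toy_driver, [bug_finder])
```

A real symbolic execution engine would replace the brute-force re-execution with constraint solving over symbolic inputs; the sketch only shows how exploration and analysis can be kept separate.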

Citations: 184

Efficient Testing of Recovery Code Using Fault Injection

Pub Date : 2011-12-01 DOI: 10.1145/2063509.2063511

P. Marinescu, George Candea

A critical part of developing a reliable software system is testing its recovery code. This code is traditionally difficult to test in the lab, and, in the field, it rarely gets to run; yet, when it does run, it must execute flawlessly in order to recover the system from failure. In this article, we present a library-level fault injection engine that enables the productive use of fault injection for software testing. We describe automated techniques for reliably identifying errors that applications may encounter when interacting with their environment, for automatically identifying high-value injection targets in program binaries, and for producing efficient injection test scenarios. We present a framework for writing precise triggers that inject desired faults, in the form of error return codes and corresponding side effects, at the boundary between applications and libraries. These techniques are embodied in LFI, a new fault injection engine we are distributing at http://lfi.epfl.ch. This article includes a report of our initial experience using LFI. Most notably, LFI found 12 serious, previously unreported bugs in the MySQL database server, Git version control system, BIND name server, Pidgin IM client, and PBFT replication system with no developer assistance and no access to source code. LFI also increased recovery-code coverage from virtually zero up to 60% entirely automatically without requiring new tests or human involvement.
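
The idea of trigger-driven fault injection at the application/library boundary can be sketched in a few lines. This is an illustration only and not LFI's interface: LFI works on binaries and injects error return codes, whereas the sketch below wraps a Python function and raises an exception. The `storage` library, `inject_fault`, and `read_with_retry` are all hypothetical names.

```python
import contextlib
import errno
import types


@contextlib.contextmanager
def inject_fault(library, func_name, trigger, error):
    """Temporarily wrap library.func_name; raise `error` when trigger fires."""
    original = getattr(library, func_name)
    calls = {"n": 0}

    def wrapper(*args, **kwargs):
        calls["n"] += 1
        if trigger(calls["n"], args):   # the trigger decides when to inject
            raise error
        return original(*args, **kwargs)

    setattr(library, func_name, wrapper)
    try:
        yield calls
    finally:
        setattr(library, func_name, original)   # always restore the library


# Stand-in for a real library (hypothetical).
storage = types.SimpleNamespace(read_block=lambda n: b"x" * n)


def read_with_retry(n, attempts=3):
    """Application code whose recovery path (the retry) we want to exercise."""
    for _ in range(attempts):
        try:
            return storage.read_block(n)
        except OSError:
            continue
    return None


# Inject an I/O error on the first call only, then let calls succeed.
with inject_fault(storage, "read_block",
                  trigger=lambda count, args: count == 1,
                  error=OSError(errno.EIO, "injected")) as calls:
    recovered = read_with_retry(4)
    calls_made = calls["n"]
```

Keeping the trigger as a separate predicate mirrors the abstract's separation between deciding when to inject and what fault to inject.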

Citations: 54

EventGuard: A System Architecture for Securing Publish-Subscribe Networks

Pub Date : 2011-12-01 DOI: 10.1145/2063509.2063510

M. Srivatsa, Ling Liu, A. Iyengar

Publish-subscribe (pub-sub) is an emerging paradigm for building a large number of distributed systems. A wide area pub-sub system is usually implemented on an overlay network infrastructure to enable information dissemination from publishers to subscribers. Using an open overlay network raises several security concerns, such as confidentiality and integrity, authentication, authorization, and Denial-of-Service (DoS) attacks. In this article, we present EventGuard, a framework for building secure wide-area pub-sub systems. The EventGuard architecture is comprised of three key components: (1) a suite of security guards that can be seamlessly plugged into a content-based pub-sub system, (2) a scalable key management algorithm to enforce access control on subscribers, and (3) a resilient pub-sub network design that is capable of scalable routing, handling message dropping-based DoS attacks, and node failures. The design of EventGuard mechanisms aims at providing security guarantees while maintaining the system’s overall simplicity, scalability, and performance metrics. We describe an implementation of the EventGuard pub-sub system to show that EventGuard is easily stackable on any content-based pub-sub core. We present detailed experimental results that quantify the overhead of the EventGuard pub-sub system and demonstrate its resilience against various attacks.

Citations: 72

On the design of perturbation-resilient atomic commit protocols for mobile transactions

Pub Date : 2011-08-01 DOI: 10.1145/2003690.2003691

Brahim Ayari, Abdelmajid Khelil, N. Suri

Distributed mobile transactions utilize commit protocols to achieve atomicity and consistent decisions. This is challenging, as mobile environments are typically characterized by frequent perturbations such as network disconnections and node failures. On the one hand, environmental constraints on mobile participants and wireless links may increase the resource blocking time of fixed participants. On the other hand, frequent node and link failures complicate the design of atomic commit protocols by increasing both the transaction abort rate and resource blocking time. Hence, the deployment of classical commit protocols (such as two-phase commit) does not reasonably extend to distributed infrastructure-based mobile environments, driving the need for perturbation-resilient commit protocols. In this article, we comprehensively consider and classify the perturbations of the wireless infrastructure-based mobile environment according to their impact on the outcome of commit protocols and on the resource blocking times. For each identified perturbation class, a commit solution is provided. Consolidating these subsolutions, we develop a family of fault-tolerant atomic commit protocols that are tunable to meet the desired perturbation needs and provide minimized resource blocking times and optimized transaction commit rates. The framework is also evaluated using simulations and an actual testbed deployment.
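
To make the baseline concrete, here is a minimal sketch of classical two-phase commit, the protocol the abstract says does not extend well to mobile settings. It is not one of the article's perturbation-resilient protocols. The `Participant` class and its `reachable` flag are hypothetical simplifications that model a disconnected mobile node.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Participant:
    name: str
    vote_yes: bool = True
    reachable: bool = True   # False models a disconnected mobile node
    state: str = "init"

    def prepare(self) -> bool:
        """Phase 1: vote, holding resources if the vote is yes."""
        if not self.reachable:
            raise ConnectionError(self.name)
        self.state = "prepared" if self.vote_yes else "aborted"
        return self.vote_yes

    def finish(self, decision: str) -> None:
        """Phase 2: apply the coordinator's decision if it can be delivered."""
        if self.reachable:
            self.state = decision


def two_phase_commit(participants: List[Participant]) -> str:
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except ConnectionError:
            votes.append(False)   # an unreachable participant counts as "no"
    decision = "committed" if all(votes) else "aborted"
    for p in participants:
        p.finish(decision)
    return decision


stable = [Participant("server"), Participant("phone")]
stable_decision = two_phase_commit(stable)

perturbed = [Participant("server"), Participant("phone", reachable=False)]
perturbed_decision = two_phase_commit(perturbed)
```

In the perturbed run, the fixed server prepares (and would hold its resources) before the coordinator learns that the phone is unreachable, and the transaction then aborts. This is the combination of blocking time and abort rate that the abstract identifies as the problem.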

Citations: 10

Mobile processors for energy-efficient web search

Pub Date : 2011-08-01 DOI: 10.1145/2003690.2003693

V. Reddi, Benjamin C. Lee, Trishul M. Chilimbi, Kushagra Vaid

As cloud and utility computing spreads, computer architects must ensure continued capability growth for the data centers that comprise the cloud. Given megawatt scale power budgets, increasing data center capability requires increasing computing hardware energy efficiency. To increase the data center's capability for work, the work done per Joule must increase. We pursue this efficiency even as the nature of data center applications evolves. Unlike traditional enterprise workloads, which are typically memory or I/O bound, big data computation and analytics exhibit greater compute intensity. This article examines the efficiency of mobile processors as a means for data center capability. In particular, we compare and contrast the performance and efficiency of the Microsoft Bing search engine executing on the mobile-class Atom processor and the server-class Xeon processor. Bing implements statistical machine learning to dynamically rank pages, producing sophisticated search results but also increasing computational intensity. While mobile processors are energy-efficient, they exact a price for that efficiency. The Atom is 5× more energy-efficient than the Xeon when comparing queries per Joule. However, search queries on Atom encounter higher latencies, different page results, and diminished robustness for complex queries. Despite these challenges, quality-of-service is maintained for most, common queries. Moreover, as different computational phases of the search engine encounter different bottlenecks, we describe implications for future architectural enhancements, application tuning, and system architectures. After optimizing the Atom server platform, a large share of power and cost go toward processor capability. With optimized Atoms, more servers can fit in a given data center power budget. For a data center with 15MW critical load, Atom-based servers increase capability by 3.2× for Bing.
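
The link between work per Joule and capability under a fixed power budget can be written out explicitly. The symbols below are introduced here for illustration and do not come from the article: $q$ is throughput in queries per second, $P$ is the power drawn to deliver it in Watts (Joules per second), $B$ is the data center power budget, $N$ is the number of servers that fit in the budget, and $C$ is the resulting capability in queries per second.

```latex
\eta = \frac{q}{P} \quad \text{(queries per Joule)}, \qquad
N = \frac{B}{P}, \qquad
C = N \, q = B \, \eta
```

Under a fixed budget $B$, capability therefore scales directly with $\eta$. The abstract reports a 5× gain in queries per Joule for the Atom processor and a 3.2× capability gain for Atom-based servers at a 15MW critical load. One plausible reading is that the two ratios are measured at different levels (processor versus whole server), but the abstract does not state this explicitly.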

Citations: 19

Scheduling real-time garbage collection on uniprocessors

Pub Date : 2011-08-01 DOI: 10.1145/2003690.2003692

T. Kalibera, F. Pizlo, Antony Lloyd Hosking, J. Vitek

Managed languages such as Java and C# are increasingly being considered for hard real-time applications because of their productivity and software engineering advantages. Automatic memory management, or garbage collection, is a key enabler for robust, reusable libraries, yet remains a challenge for analysis and implementation of real-time execution environments. This article comprehensively compares leading approaches to hard real-time garbage collection. There are many design decisions involved in selecting a real-time garbage collection algorithm. For time-based garbage collectors on uniprocessors one must choose whether to use periodic, slack-based or hybrid scheduling. A significant impediment to valid experimental comparison of such choices is that commercial implementations use completely different proprietary infrastructures. We present Minuteman, a framework for experimenting with real-time collection algorithms in the context of a high-performance execution environment for real-time Java. We provide the first comparison of the approaches, both experimentally using realistic workloads, and analytically in terms of schedulability.

Citations: 15

Management of Multilevel, Multiclient Cache Hierarchies with Application Hints

Pub Date : 2011-05-01 DOI: 10.1145/1963559.1963561

G. Yadgar, M. Factor, Kai Li, A. Schuster

Multilevel caching, common in many storage configurations, introduces new challenges to traditional cache management: data must be kept in the appropriate cache and replication avoided across the various cache levels. Additional challenges are introduced when the lower levels of the hierarchy are shared by multiple clients. Sharing can have both positive and negative effects. While data fetched by one client can be used by another client without incurring additional delays, clients competing for cache buffers can evict each other’s blocks and interfere with exclusive caching schemes. We present a global, noncentralized, dynamic, and informed management policy for multiple levels of cache, accessed by multiple clients. Our algorithm, MC2, combines local, per-client management with a global, system-wide scheme, to emphasize the positive effects of sharing and reduce the negative ones. Our local management scheme, Karma, uses readily available information about the client’s future access profile to save the most valuable blocks, and to choose the best replacement policy for them. The global scheme uses the same information to divide the shared cache space between clients, and to manage this space. Exclusive caching is maintained for nonshared data and is disabled when sharing is identified. Previous studies have partially addressed these challenges through minor changes to the storage interface. We show that all these challenges can in fact be addressed by combining minor interface changes with smart allocation and replacement policies. We show the superiority of our approach through comparison to existing solutions, including LRU, ARC, MultiQ, LRU-SP, and Demote, as well as a lower bound on optimal I/O response times. Our simulation results demonstrate better cache performance than all other solutions and up to 87% better performance than LRU on representative workloads.
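
Why replication across levels matters can be seen in a small single-client simulation. The sketch compares a two-level hierarchy in which both levels run plain LRU (so the lower level duplicates the upper one) against a demote-style exclusive scheme, in which a block lives in only one level at a time. It is not an implementation of MC2 or Karma, and the class, function names, and trace are hypothetical.

```python
from collections import OrderedDict


class LRU:
    """Fixed-capacity LRU set of block ids."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def __contains__(self, block):
        return block in self.blocks

    def touch(self, block):
        self.blocks.move_to_end(block)

    def remove(self, block):
        self.blocks.pop(block, None)

    def insert(self, block):
        """Insert as most recently used; return the evicted block, if any."""
        self.blocks[block] = True
        self.blocks.move_to_end(block)
        if len(self.blocks) > self.capacity:
            victim, _ = self.blocks.popitem(last=False)
            return victim
        return None


def simulate(trace, l1_size, l2_size, exclusive):
    """Return the number of disk reads for a two-level cache hierarchy."""
    l1, l2 = LRU(l1_size), LRU(l2_size)
    disk_reads = 0
    for block in trace:
        if block in l1:
            l1.touch(block)
            continue
        if block in l2:
            if exclusive:
                l2.remove(block)      # promote: the block leaves the lower level
            else:
                l2.touch(block)
        else:
            disk_reads += 1
            if not exclusive:
                l2.insert(block)      # inclusive: lower level keeps a copy
        victim = l1.insert(block)
        if exclusive and victim is not None:
            l2.insert(victim)         # demote the evicted block to the lower level
    return disk_reads


trace = [0, 1, 2, 3] * 3              # loop over 4 blocks, 2 + 2 cache slots
inclusive_reads = simulate(trace, 2, 2, exclusive=False)
exclusive_reads = simulate(trace, 2, 2, exclusive=True)
```

On this looping trace the inclusive hierarchy behaves like a single 2-block cache and misses on every access, while the exclusive one uses all 4 slots and only pays the 4 cold misses.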

Citations: 33

Application-Tailored I/O with Streamline

IF 1.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, THEORY & METHODS

ACM Transactions on Computer Systems

Pub Date : 2011-05-01 DOI: 10.1145/1963559.1963562

W. de Bruijn, H. Bos, H. Bal

Streamline is a stream-based OS communication subsystem that spans from peripheral hardware to userspace processes. It improves the performance of I/O-bound applications (such as web servers and streaming media applications) by constructing tailor-made I/O paths through the operating system for each application at runtime. Path optimization removes unnecessary copying, context switching, and cache replacement, and integrates specialized hardware. Streamline automates optimization and presents users with only a clear, concise job control language based on Unix pipelines. For backward compatibility, Streamline also presents the well-known file, pipe, and socket abstractions. Observed throughput improvement over Linux 2.6.24 for networking applications is up to 30-fold, but two-fold is more typical.

Citations: 21

A Declarative Language Approach to Device Configuration

IF 1.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, THEORY & METHODS

ACM Transactions on Computer Systems

Pub Date : 2011-03-05 DOI: 10.1145/2110356.2110361

Adrian Schüpbach, Andrew Baumann, Timothy Roscoe, Simon Peter

C remains the language of choice for hardware programming (device drivers, bus configuration, etc.): it is fast, allows low-level access, and is trusted by OS developers. However, the algorithms required to configure and reconfigure hardware devices and interconnects are becoming more complex and diverse, with the added burden of legacy support, “quirks,” and hardware bugs to work around. Even programming PCI bridges in a modern PC is a surprisingly complex problem, and is getting worse as new functionality such as hotplug appears. Existing approaches use relatively simple algorithms, hard-coded in C and closely coupled with low-level register access code, generally leading to suboptimal configurations. We investigate the merits and drawbacks of a new approach: separating hardware configuration logic (algorithms to determine configuration parameter values) from mechanism (programming device registers). The latter we keep in C, and the former we encode in a declarative programming language with constraint-satisfaction extensions. As a test case, we have implemented full PCI configuration, resource allocation, and interrupt assignment in the Barrelfish research operating system, using a concise expression of efficient algorithms in constraint logic programming. We show that the approach is tractable, and can successfully configure a wide range of PCs with competitive runtime cost. Moreover, it requires about half the code of the C-based approach in Linux while offering considerably more functionality. Additionally it easily accommodates adaptations such as hotplug, fixed regions, and “quirks.”
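The split between configuration logic and mechanism described in this abstract can be illustrated with a small sketch. It is not the authors' implementation, which uses constraint logic programming inside Barrelfish. It is a plain Python backtracking search, and every name in it (`allocate_regions`, the device labels, the window values) is hypothetical. It assumes that each region's size is a power of two and that its base address must be a multiple of its size, as is the case for PCI base address registers.

```python
from typing import Dict, List, Optional, Tuple


def allocate_regions(
    sizes: Dict[str, int], window_base: int, window_size: int
) -> Optional[Dict[str, int]]:
    """Pick a base address for every region inside one address window.

    Constraints (hypothetical, for illustration only):
      * each base is a multiple of the region's size (natural alignment),
      * each region lies entirely inside the window,
      * no two regions overlap.
    Returns a name -> base mapping, or None if the constraints cannot be met.
    """
    # Place the largest regions first; they have the fewest legal positions.
    names = sorted(sizes, key=lambda name: -sizes[name])
    window_end = window_base + window_size
    placed: List[Tuple[int, int]] = []  # (base, end) of regions placed so far
    result: Dict[str, int] = {}

    def overlaps(base: int, end: int) -> bool:
        return any(base < e and b < end for b, e in placed)

    def solve(i: int) -> bool:
        if i == len(names):
            return True  # every region has a base address
        size = sizes[names[i]]
        # First naturally aligned address at or above the window base.
        base = -(-window_base // size) * size
        while base + size <= window_end:
            if not overlaps(base, base + size):
                placed.append((base, base + size))
                result[names[i]] = base
                if solve(i + 1):
                    return True
                # Undo this choice and try the next aligned address.
                placed.pop()
                del result[names[i]]
            base += size
        return False

    return dict(result) if solve(0) else None


if __name__ == "__main__":
    plan = allocate_regions(
        {"nic": 0x1000, "gpu": 0x4000, "usb": 0x1000},
        window_base=0x10000,
        window_size=0x8000,
    )
    print({name: hex(base) for name, base in plan.items()})
```

The function only computes values. Writing those values into device registers would be a separate step, which is the part the abstract says is kept in C.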

Citations: 27

Improving software diagnosability via log enhancement

IF 1.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, THEORY & METHODS

ACM Transactions on Computer Systems

Pub Date : 2011-03-05 DOI: 10.1145/1950365.1950369

Ding Yuan, Jing Zheng, Soyeon Park, Yuanyuan Zhou, S. Savage

Diagnosing software failures in the field is notoriously difficult, in part because of the fundamental complexity of troubleshooting any complex software system, and in part because of the paucity of information typically available in the production setting. Indeed, for reasons of both overhead and privacy, it is common that only the run-time log generated by a system (e.g., syslog) can be shared with the developers. Unfortunately, the ad hoc nature of such reports frequently makes them insufficient for detailed failure diagnosis. This paper seeks to improve this situation within the rubric of existing practice. We describe a tool, LogEnhancer, that automatically "enhances" existing logging code to aid in future post-failure debugging. We evaluate LogEnhancer on eight large, real-world applications and demonstrate that it can dramatically reduce the set of potential root failure causes that must be considered during diagnosis while imposing negligible overhead.
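As a hand-written illustration of what an enhanced log message might look like, the sketch below appends variable values to an existing message. It assumes that "enhancing" means recording extra program state alongside the original text, which the abstract does not spell out. The tool described above selects that state automatically, whereas this sketch leaves the choice to the caller. The helper name `enhance` and the example values are hypothetical.

```python
from typing import Any, Dict


def enhance(message: str, context: Dict[str, Any]) -> str:
    """Return the log message with name=value pairs appended.

    The variables to record are chosen by the caller here; an automated
    tool would choose them by analysing the surrounding code.
    """
    if not context:
        return message
    # Sort by name so the output is stable across runs.
    extras = ", ".join(
        f"{name}={value!r}" for name, value in sorted(context.items())
    )
    return f"{message} [{extras}]"


if __name__ == "__main__":
    # Original message: says that something failed, but not why.
    print(enhance("cannot open file", {}))
    # Enhanced message: the same text plus the state that led to the failure.
    print(enhance("cannot open file", {"path": "/etc/app.conf", "errno": 2}))
```

With the extra values in the log, a developer reading it can rule out causes that are inconsistent with the recorded state without needing access to the failing machine.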

Citations: 30