Vitaly Chipounov, Volodymyr Kuznetsov, George Candea
This article presents S2E, a platform for analyzing the properties and behavior of software systems, along with its use in developing tools for comprehensive performance profiling, reverse engineering of proprietary software, and automated testing of kernel-mode and user-mode binaries. Conceptually, S2E is an automated path explorer with modular path analyzers: the explorer uses a symbolic execution engine to drive the target system down all execution paths of interest, while analyzers measure and/or check properties of each such path. S2E users can either combine existing analyzers to build custom analysis tools, or they can directly use S2E’s APIs. S2E’s strength is the ability to scale to large systems, such as a full Windows stack, using two new ideas: selective symbolic execution, a way to automatically minimize the amount of code that has to be executed symbolically given a target analysis, and execution consistency models, a way to make principled performance/accuracy trade-offs during analysis. These techniques give S2E three key abilities: to simultaneously analyze entire families of execution paths instead of just one execution at a time; to perform the analyses in-vivo within a real software stack---user programs, libraries, kernel, drivers, etc.---instead of using abstract models of these layers; and to operate directly on binaries, thus being able to analyze even proprietary software.
{"title":"The S2E Platform: Design, Implementation, and Applications","authors":"Vitaly Chipounov, Volodymyr Kuznetsov, George Candea","doi":"10.1145/2110356.2110358","DOIUrl":"https://doi.org/10.1145/2110356.2110358","url":null,"abstract":"This article presents S2E, a platform for analyzing the properties and behavior of software systems, along with its use in developing tools for comprehensive performance profiling, reverse engineering of proprietary software, and automated testing of kernel-mode and user-mode binaries. Conceptually, S2E is an automated path explorer with modular path analyzers: the explorer uses a symbolic execution engine to drive the target system down all execution paths of interest, while analyzers measure and/or check properties of each such path. S2E users can either combine existing analyzers to build custom analysis tools, or they can directly use S2E’s APIs.\u0000 S2E’s strength is the ability to scale to large systems, such as a full Windows stack, using two new ideas: selective symbolic execution, a way to automatically minimize the amount of code that has to be executed symbolically given a target analysis, and execution consistency models, a way to make principled performance/accuracy trade-offs during analysis. These techniques give S2E three key abilities: to simultaneously analyze entire families of execution paths instead of just one execution at a time; to perform the analyses in-vivo within a real software stack---user programs, libraries, kernel, drivers, etc.---instead of using abstract models of these layers; and to operate directly on binaries, thus being able to analyze even proprietary software.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"208 1","pages":"2:1-2:49"},"PeriodicalIF":1.5,"publicationDate":"2012-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77745512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A critical part of developing a reliable software system is testing its recovery code. This code is traditionally difficult to test in the lab, and, in the field, it rarely gets to run; yet, when it does run, it must execute flawlessly in order to recover the system from failure. In this article, we present a library-level fault injection engine that enables the productive use of fault injection for software testing. We describe automated techniques for reliably identifying errors that applications may encounter when interacting with their environment, for automatically identifying high-value injection targets in program binaries, and for producing efficient injection test scenarios. We present a framework for writing precise triggers that inject desired faults, in the form of error return codes and corresponding side effects, at the boundary between applications and libraries. These techniques are embodied in LFI, a new fault injection engine we are distributing http://lfi.epfl.ch. This article includes a report of our initial experience using LFI. Most notably, LFI found 12 serious, previously unreported bugs in the MySQL database server, Git version control system, BIND name server, Pidgin IM client, and PBFT replication system with no developer assistance and no access to source code. LFI also increased recovery-code coverage from virtually zero up to 60% entirely automatically without requiring new tests or human involvement.
{"title":"Efficient Testing of Recovery Code Using Fault Injection","authors":"P. Marinescu, George Candea","doi":"10.1145/2063509.2063511","DOIUrl":"https://doi.org/10.1145/2063509.2063511","url":null,"abstract":"A critical part of developing a reliable software system is testing its recovery code. This code is traditionally difficult to test in the lab, and, in the field, it rarely gets to run; yet, when it does run, it must execute flawlessly in order to recover the system from failure. In this article, we present a library-level fault injection engine that enables the productive use of fault injection for software testing. We describe automated techniques for reliably identifying errors that applications may encounter when interacting with their environment, for automatically identifying high-value injection targets in program binaries, and for producing efficient injection test scenarios. We present a framework for writing precise triggers that inject desired faults, in the form of error return codes and corresponding side effects, at the boundary between applications and libraries. These techniques are embodied in LFI, a new fault injection engine we are distributing http://lfi.epfl.ch. This article includes a report of our initial experience using LFI. Most notably, LFI found 12 serious, previously unreported bugs in the MySQL database server, Git version control system, BIND name server, Pidgin IM client, and PBFT replication system with no developer assistance and no access to source code. LFI also increased recovery-code coverage from virtually zero up to 60% entirely automatically without requiring new tests or human involvement.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"8 1","pages":"11:1-11:38"},"PeriodicalIF":1.5,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88714805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Publish-subscribe (pub-sub) is an emerging paradigm for building a large number of distributed systems. A wide area pub-sub system is usually implemented on an overlay network infrastructure to enable information dissemination from publishers to subscribers. Using an open overlay network raises several security concerns such as: confidentiality and integrity, authentication, authorization and Denial-of-Service (DoS) attacks. In this article we present EventGuard, a framework for building secure wide-area pub-sub systems. The EventGuard architecture is comprised of three key components: (1) a suite of security guards that can be seamlessly plugged-into a content-based pub-sub system, (2) a scalable key management algorithm to enforce access control on subscribers, and (3) a resilient pub-sub network design that is capable of scalable routing, handling message dropping-based DoS attacks, and node failures. The design of EventGuard mechanisms aims at providing security guarantees while maintaining the system’s overall simplicity, scalability, and performance metrics. We describe an implementation of the EventGuard pub-sub system to show that EventGuard is easily stackable on any content-based pub-sub core. We present detailed experimental results that quantify the overhead of the EventGuard pub-sub system and demonstrate its resilience against various attacks.
{"title":"EventGuard: A System Architecture for Securing Publish-Subscribe Networks","authors":"M. Srivatsa, Ling Liu, A. Iyengar","doi":"10.1145/2063509.2063510","DOIUrl":"https://doi.org/10.1145/2063509.2063510","url":null,"abstract":"Publish-subscribe (pub-sub) is an emerging paradigm for building a large number of distributed systems. A wide area pub-sub system is usually implemented on an overlay network infrastructure to enable information dissemination from publishers to subscribers. Using an open overlay network raises several security concerns such as: confidentiality and integrity, authentication, authorization and Denial-of-Service (DoS) attacks. In this article we present EventGuard, a framework for building secure wide-area pub-sub systems. The EventGuard architecture is comprised of three key components: (1) a suite of security guards that can be seamlessly plugged-into a content-based pub-sub system, (2) a scalable key management algorithm to enforce access control on subscribers, and (3) a resilient pub-sub network design that is capable of scalable routing, handling message dropping-based DoS attacks, and node failures. The design of EventGuard mechanisms aims at providing security guarantees while maintaining the system’s overall simplicity, scalability, and performance metrics. We describe an implementation of the EventGuard pub-sub system to show that EventGuard is easily stackable on any content-based pub-sub core. We present detailed experimental results that quantify the overhead of the EventGuard pub-sub system and demonstrate its resilience against various attacks.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"72 1","pages":"10:1-10:40"},"PeriodicalIF":1.5,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80525352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Distributed mobile transactions utilize commit protocols to achieve atomicity and consistent decisions. This is challenging, as mobile environments are typically characterized by frequent perturbations such as network disconnections and node failures. On one hand environmental constraints on mobile participants and wireless links may increase the resource blocking time of fixed participants. On the other hand frequent node and link failures complicate the design of atomic commit protocols by increasing both the transaction abort rate and resource blocking time. Hence, the deployment of classical commit protocols (such as two-phase commit) does not reasonably extend to distributed infrastructure-based mobile environments driving the need for perturbation-resilient commit protocols. In this article, we comprehensively consider and classify the perturbations of the wireless infrastructure-based mobile environment according to their impact on the outcome of commit protocols and on the resource blocking times. For each identified perturbation class a commit solution is provided. Consolidating these subsolutions, we develop a family of fault-tolerant atomic commit protocols that are tunable to meet the desired perturbation needs and provide minimized resource blocking times and optimized transaction commit rates. The framework is also evaluated using simulations and an actual testbed deployment.
{"title":"On the design of perturbation-resilient atomic commit protocols for mobile transactions","authors":"Brahim Ayari, Abdelmajid Khelil, N. Suri","doi":"10.1145/2003690.2003691","DOIUrl":"https://doi.org/10.1145/2003690.2003691","url":null,"abstract":"Distributed mobile transactions utilize commit protocols to achieve atomicity and consistent decisions. This is challenging, as mobile environments are typically characterized by frequent perturbations such as network disconnections and node failures. On one hand environmental constraints on mobile participants and wireless links may increase the resource blocking time of fixed participants. On the other hand frequent node and link failures complicate the design of atomic commit protocols by increasing both the transaction abort rate and resource blocking time. Hence, the deployment of classical commit protocols (such as two-phase commit) does not reasonably extend to distributed infrastructure-based mobile environments driving the need for perturbation-resilient commit protocols.\u0000 In this article, we comprehensively consider and classify the perturbations of the wireless infrastructure-based mobile environment according to their impact on the outcome of commit protocols and on the resource blocking times. For each identified perturbation class a commit solution is provided. Consolidating these subsolutions, we develop a family of fault-tolerant atomic commit protocols that are tunable to meet the desired perturbation needs and provide minimized resource blocking times and optimized transaction commit rates. The framework is also evaluated using simulations and an actual testbed deployment.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"69 1","pages":"7:1-7:36"},"PeriodicalIF":1.5,"publicationDate":"2011-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89930976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
V. Reddi, Benjamin C. Lee, Trishul M. Chilimbi, Kushagra Vaid
As cloud and utility computing spreads, computer architects must ensure continued capability growth for the data centers that comprise the cloud. Given megawatt scale power budgets, increasing data center capability requires increasing computing hardware energy efficiency. To increase the data center's capability for work, the work done per Joule must increase. We pursue this efficiency even as the nature of data center applications evolves. Unlike traditional enterprise workloads, which are typically memory or I/O bound, big data computation and analytics exhibit greater compute intensity. This article examines the efficiency of mobile processors as a means for data center capability. In particular, we compare and contrast the performance and efficiency of the Microsoft Bing search engine executing on the mobile-class Atom processor and the server-class Xeon processor. Bing implements statistical machine learning to dynamically rank pages, producing sophisticated search results but also increasing computational intensity. While mobile processors are energy-efficient, they exact a price for that efficiency. The Atom is 5× more energy-efficient than the Xeon when comparing queries per Joule. However, search queries on Atom encounter higher latencies, different page results, and diminished robustness for complex queries. Despite these challenges, quality-of-service is maintained for most, common queries. Moreover, as different computational phases of the search engine encounter different bottlenecks, we describe implications for future architectural enhancements, application tuning, and system architectures. After optimizing the Atom server platform, a large share of power and cost go toward processor capability. With optimized Atoms, more servers can fit in a given data center power budget. For a data center with 15MW critical load, Atom-based servers increase capability by 3.2× for Bing.
{"title":"Mobile processors for energy-efficient web search","authors":"V. Reddi, Benjamin C. Lee, Trishul M. Chilimbi, Kushagra Vaid","doi":"10.1145/2003690.2003693","DOIUrl":"https://doi.org/10.1145/2003690.2003693","url":null,"abstract":"As cloud and utility computing spreads, computer architects must ensure continued capability growth for the data centers that comprise the cloud. Given megawatt scale power budgets, increasing data center capability requires increasing computing hardware energy efficiency. To increase the data center's capability for work, the work done per Joule must increase. We pursue this efficiency even as the nature of data center applications evolves. Unlike traditional enterprise workloads, which are typically memory or I/O bound, big data computation and analytics exhibit greater compute intensity. This article examines the efficiency of mobile processors as a means for data center capability. In particular, we compare and contrast the performance and efficiency of the Microsoft Bing search engine executing on the mobile-class Atom processor and the server-class Xeon processor. Bing implements statistical machine learning to dynamically rank pages, producing sophisticated search results but also increasing computational intensity. While mobile processors are energy-efficient, they exact a price for that efficiency. The Atom is 5× more energy-efficient than the Xeon when comparing queries per Joule. However, search queries on Atom encounter higher latencies, different page results, and diminished robustness for complex queries. Despite these challenges, quality-of-service is maintained for most, common queries. Moreover, as different computational phases of the search engine encounter different bottlenecks, we describe implications for future architectural enhancements, application tuning, and system architectures. After optimizing the Atom server platform, a large share of power and cost go toward processor capability. With optimized Atoms, more servers can fit in a given data center power budget. For a data center with 15MW critical load, Atom-based servers increase capability by 3.2× for Bing.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"8 1","pages":"9:1-9:39"},"PeriodicalIF":1.5,"publicationDate":"2011-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88213693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T. Kalibera, F. Pizlo, Antony Lloyd Hosking, J. Vitek
Managed languages such as Java and C# are increasingly being considered for hard real-time applications because of their productivity and software engineering advantages. Automatic memory management, or garbage collection, is a key enabler for robust, reusable libraries, yet remains a challenge for analysis and implementation of real-time execution environments. This article comprehensively compares leading approaches to hard real-time garbage collection. There are many design decisions involved in selecting a real-time garbage collection algorithm. For time-based garbage collectors on uniprocessors one must choose whether to use periodic, slack-based or hybrid scheduling. A significant impediment to valid experimental comparison of such choices is that commercial implementations use completely different proprietary infrastructures. We present Minuteman, a framework for experimenting with real-time collection algorithms in the context of a high-performance execution environment for real-time Java. We provide the first comparison of the approaches, both experimentally using realistic workloads, and analytically in terms of schedulability.
{"title":"Scheduling real-time garbage collection on uniprocessors","authors":"T. Kalibera, F. Pizlo, Antony Lloyd Hosking, J. Vitek","doi":"10.1145/2003690.2003692","DOIUrl":"https://doi.org/10.1145/2003690.2003692","url":null,"abstract":"Managed languages such as Java and C# are increasingly being considered for hard real-time applications because of their productivity and software engineering advantages. Automatic memory management, or garbage collection, is a key enabler for robust, reusable libraries, yet remains a challenge for analysis and implementation of real-time execution environments. This article comprehensively compares leading approaches to hard real-time garbage collection. There are many design decisions involved in selecting a real-time garbage collection algorithm. For time-based garbage collectors on uniprocessors one must choose whether to use periodic, slack-based or hybrid scheduling. A significant impediment to valid experimental comparison of such choices is that commercial implementations use completely different proprietary infrastructures. We present Minuteman, a framework for experimenting with real-time collection algorithms in the context of a high-performance execution environment for real-time Java. We provide the first comparison of the approaches, both experimentally using realistic workloads, and analytically in terms of schedulability.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"27 1","pages":"8:1-8:29"},"PeriodicalIF":1.5,"publicationDate":"2011-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73858077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multilevel caching, common in many storage configurations, introduces new challenges to traditional cache management: data must be kept in the appropriate cache and replication avoided across the various cache levels. Additional challenges are introduced when the lower levels of the hierarchy are shared by multiple clients. Sharing can have both positive and negative effects. While data fetched by one client can be used by another client without incurring additional delays, clients competing for cache buffers can evict each other’s blocks and interfere with exclusive caching schemes. We present a global noncentralized, dynamic and informed management policy for multiple levels of cache, accessed by multiple clients. Our algorithm, MC2, combines local, per client management with a global, system-wide scheme, to emphasize the positive effects of sharing and reduce the negative ones. Our local management scheme, Karma, uses readily available information about the client’s future access profile to save the most valuable blocks, and to choose the best replacement policy for them. The global scheme uses the same information to divide the shared cache space between clients, and to manage this space. Exclusive caching is maintained for nonshared data and is disabled when sharing is identified. Previous studies have partially addressed these challenges through minor changes to the storage interface. We show that all these challenges can in fact be addressed by combining minor interface changes with smart allocation and replacement policies. We show the superiority of our approach through comparison to existing solutions, including LRU, ARC, MultiQ, LRU-SP, and Demote, as well as a lower bound on optimal I/O response times. Our simulation results demonstrate better cache performance than all other solutions and up to 87% better performance than LRU on representative workloads.
{"title":"Management of Multilevel, Multiclient Cache Hierarchies with Application Hints","authors":"G. Yadgar, M. Factor, Kai Li, A. Schuster","doi":"10.1145/1963559.1963561","DOIUrl":"https://doi.org/10.1145/1963559.1963561","url":null,"abstract":"Multilevel caching, common in many storage configurations, introduces new challenges to traditional cache management: data must be kept in the appropriate cache and replication avoided across the various cache levels. Additional challenges are introduced when the lower levels of the hierarchy are shared by multiple clients. Sharing can have both positive and negative effects. While data fetched by one client can be used by another client without incurring additional delays, clients competing for cache buffers can evict each other’s blocks and interfere with exclusive caching schemes.\u0000 We present a global noncentralized, dynamic and informed management policy for multiple levels of cache, accessed by multiple clients. Our algorithm, MC2, combines local, per client management with a global, system-wide scheme, to emphasize the positive effects of sharing and reduce the negative ones. Our local management scheme, Karma, uses readily available information about the client’s future access profile to save the most valuable blocks, and to choose the best replacement policy for them. The global scheme uses the same information to divide the shared cache space between clients, and to manage this space. Exclusive caching is maintained for nonshared data and is disabled when sharing is identified.\u0000 Previous studies have partially addressed these challenges through minor changes to the storage interface. We show that all these challenges can in fact be addressed by combining minor interface changes with smart allocation and replacement policies. We show the superiority of our approach through comparison to existing solutions, including LRU, ARC, MultiQ, LRU-SP, and Demote, as well as a lower bound on optimal I/O response times. Our simulation results demonstrate better cache performance than all other solutions and up to 87% better performance than LRU on representative workloads.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"59 1","pages":"5:1-5:51"},"PeriodicalIF":1.5,"publicationDate":"2011-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87095494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Streamline is a stream-based OS communication subsystem that spans from peripheral hardware to userspace processes. It improves performance of I/O-bound applications (such as webservers and streaming media applications) by constructing tailor-made I/O paths through the operating system for each application at runtime. Path optimization removes unnecessary copying, context switching and cache replacement and integrates specialized hardware. Streamline automates optimization and only presents users a clear, concise job control language based on Unix pipelines. For backward compatibility Streamline also presents well known files, pipes and sockets abstractions. Observed throughput improvement over Linux 2.6.24 for networking applications is up to 30-fold, but two-fold is more typical.
{"title":"Application-Tailored I/O with Streamline","authors":"W. D. Bruijn, H. Bos, H. Bal","doi":"10.1145/1963559.1963562","DOIUrl":"https://doi.org/10.1145/1963559.1963562","url":null,"abstract":"Streamline is a stream-based OS communication subsystem that spans from peripheral hardware to userspace processes. It improves performance of I/O-bound applications (such as webservers and streaming media applications) by constructing tailor-made I/O paths through the operating system for each application at runtime. Path optimization removes unnecessary copying, context switching and cache replacement and integrates specialized hardware. Streamline automates optimization and only presents users a clear, concise job control language based on Unix pipelines. For backward compatibility Streamline also presents well known files, pipes and sockets abstractions. Observed throughput improvement over Linux 2.6.24 for networking applications is up to 30-fold, but two-fold is more typical.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"14 1","pages":"6:1-6:33"},"PeriodicalIF":1.5,"publicationDate":"2011-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75143505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adrian Schüpbach, Andrew Baumann, Timothy Roscoe, Simon Peter
C remains the language of choice for hardware programming (device drivers, bus configuration, etc.): it is fast, allows low-level access, and is trusted by OS developers. However, the algorithms required to configure and reconfigure hardware devices and interconnects are becoming more complex and diverse, with the added burden of legacy support, “quirks,” and hardware bugs to work around. Even programming PCI bridges in a modern PC is a surprisingly complex problem, and is getting worse as new functionality such as hotplug appears. Existing approaches use relatively simple algorithms, hard-coded in C and closely coupled with low-level register access code, generally leading to suboptimal configurations. We investigate the merits and drawbacks of a new approach: separating hardware configuration logic (algorithms to determine configuration parameter values) from mechanism (programming device registers). The latter we keep in C, and the former we encode in a declarative programming language with constraint-satisfaction extensions. As a test case, we have implemented full PCI configuration, resource allocation, and interrupt assignment in the Barrelfish research operating system, using a concise expression of efficient algorithms in constraint logic programming. We show that the approach is tractable, and can successfully configure a wide range of PCs with competitive runtime cost. Moreover, it requires about half the code of the C-based approach in Linux while offering considerably more functionality. Additionally it easily accommodates adaptations such as hotplug, fixed regions, and “quirks.”
{"title":"A Declarative Language Approach to Device Configuration","authors":"Adrian Schüpbach, Andrew Baumann, Timothy Roscoe, Simon Peter","doi":"10.1145/2110356.2110361","DOIUrl":"https://doi.org/10.1145/2110356.2110361","url":null,"abstract":"C remains the language of choice for hardware programming (device drivers, bus configuration, etc.): it is fast, allows low-level access, and is trusted by OS developers. However, the algorithms required to configure and reconfigure hardware devices and interconnects are becoming more complex and diverse, with the added burden of legacy support, “quirks,” and hardware bugs to work around. Even programming PCI bridges in a modern PC is a surprisingly complex problem, and is getting worse as new functionality such as hotplug appears. Existing approaches use relatively simple algorithms, hard-coded in C and closely coupled with low-level register access code, generally leading to suboptimal configurations.\u0000 We investigate the merits and drawbacks of a new approach: separating hardware configuration logic (algorithms to determine configuration parameter values) from mechanism (programming device registers). The latter we keep in C, and the former we encode in a declarative programming language with constraint-satisfaction extensions. As a test case, we have implemented full PCI configuration, resource allocation, and interrupt assignment in the Barrelfish research operating system, using a concise expression of efficient algorithms in constraint logic programming. We show that the approach is tractable, and can successfully configure a wide range of PCs with competitive runtime cost. Moreover, it requires about half the code of the C-based approach in Linux while offering considerably more functionality. Additionally it easily accommodates adaptations such as hotplug, fixed regions, and “quirks.”","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"3 1","pages":"5:1-5:35"},"PeriodicalIF":1.5,"publicationDate":"2011-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90289933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ding Yuan, Jing Zheng, Soyeon Park, Yuanyuan Zhou, S. Savage
Diagnosing software failures in the field is notoriously difficult, in part due to the fundamental complexity of trouble-shooting any complex software system, but further exacerbated by the paucity of information that is typically available in the production setting. Indeed, for reasons of both overhead and privacy, it is common that only the run-time log generated by a system (e.g., syslog) can be shared with the developers. Unfortunately, the ad-hoc nature of such reports are frequently insufficient for detailed failure diagnosis. This paper seeks to improve this situation within the rubric of existing practice. We describe a tool, LogEnhancer that automatically "enhances" existing logging code to aid in future post-failure debugging. We evaluate LogEnhancer on eight large, real-world applications and demonstrate that it can dramatically reduce the set of potential root failure causes that must be considered during diagnosis while imposing negligible overheads.
{"title":"Improving software diagnosability via log enhancement","authors":"Ding Yuan, Jing Zheng, Soyeon Park, Yuanyuan Zhou, S. Savage","doi":"10.1145/1950365.1950369","DOIUrl":"https://doi.org/10.1145/1950365.1950369","url":null,"abstract":"Diagnosing software failures in the field is notoriously difficult, in part due to the fundamental complexity of trouble-shooting any complex software system, but further exacerbated by the paucity of information that is typically available in the production setting. Indeed, for reasons of both overhead and privacy, it is common that only the run-time log generated by a system (e.g., syslog) can be shared with the developers. Unfortunately, the ad-hoc nature of such reports are frequently insufficient for detailed failure diagnosis. This paper seeks to improve this situation within the rubric of existing practice. We describe a tool, LogEnhancer that automatically \"enhances\" existing logging code to aid in future post-failure debugging. We evaluate LogEnhancer on eight large, real-world applications and demonstrate that it can dramatically reduce the set of potential root failure causes that must be considered during diagnosis while imposing negligible overheads.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"1 1","pages":"3-14"},"PeriodicalIF":1.5,"publicationDate":"2011-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1950365.1950369","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64122234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}