
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems: Latest Publications

WSMeter: A Performance Evaluation Methodology for Google's Production Warehouse-Scale Computers
Jaewon Lee, Changkyun Kim, Kun Lin, Liqun Cheng, R. Govindaraju, Jangwoo Kim
Evaluating the comprehensive performance of a warehouse-scale computer (WSC) has been a long-standing challenge. Traditional load-testing benchmarks become ineffective because they cannot accurately reproduce the behavior of thousands of distinct jobs co-located on a WSC. We therefore evaluate WSCs using actual job behaviors in live production environments. From our experience of developing multiple generations of WSCs, we identify two major challenges of this approach: 1) the lack of a holistic metric that incorporates thousands of jobs and summarizes the performance, and 2) the high costs and risks of conducting an evaluation in a live environment. To address these challenges, we propose WSMeter, a cost-effective methodology to accurately evaluate a WSC's performance using a live production environment. We first define a new metric which accurately represents a WSC's overall performance, taking a wide variety of unevenly distributed jobs into account. We then propose a model to statistically embrace the performance variance inherent in WSCs, to conduct an evaluation with minimal costs and risks. We present three real-world use cases to prove the effectiveness of WSMeter. In the first two cases, WSMeter accurately discerns 7% and 1% performance improvements from WSC upgrades using only 0.9% and 6.6% of the machines in the WSCs, respectively. We emphasize that naive statistical comparisons incur much higher evaluation costs (> 4 times) and sometimes even fail to distinguish subtle differences. The third case shows that a cloud customer hosting two services on our WSC quantifies the performance benefits of software optimization (+9.3%) with minimal overheads (2.3% of the service capacity).
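The abstract leaves the metric and the sampling model abstract; the sketch below illustrates one plausible reading, assuming a usage-weighted average of per-job performance and a normal-approximation sample-size estimate for how many machines to measure. The weighting scheme, per-job score, and helper names (`wsc_performance`, `machines_needed`) are illustrative assumptions, not WSMeter's actual definitions.

```python
import math
from statistics import stdev

def wsc_performance(job_perf, job_weight):
    """Usage-weighted average of per-job performance scores.
    job_perf:   {job: normalized performance vs. a baseline}
    job_weight: {job: fraction of WSC resources the job consumes}
    """
    total = sum(job_weight[j] for j in job_perf)
    return sum(job_perf[j] * job_weight[j] for j in job_perf) / total

def machines_needed(per_machine_scores, target_margin, z=1.96):
    """Rough normal-approximation estimate: how many machines must be measured
    so the mean score lands within target_margin at ~95% confidence."""
    s = stdev(per_machine_scores)          # needs at least two pilot samples
    return math.ceil((z * s / target_margin) ** 2)

# Example: two jobs, one consuming 90% of the WSC's resources, one 10%.
print(wsc_performance({"search": 1.07, "ads": 1.01}, {"search": 0.9, "ads": 0.1}))
```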
Citations: 17
Optimistic Hybrid Analysis: Accelerating Dynamic Analysis through Predicated Static Analysis
David Devecsery, Peter M. Chen, J. Flinn, S. Narayanasamy
Dynamic analysis tools, such as those that detect data-races, verify memory safety, and identify information flow, have become a vital part of testing and debugging complex software systems. While these tools are powerful, their slow speed often limits how effectively they can be deployed in practice. Hybrid analysis speeds up these tools by using static analysis to decrease the work performed during dynamic analysis. In this paper we argue that current hybrid analysis is needlessly hampered by an incorrect assumption that preserving the soundness of dynamic analysis requires an underlying sound static analysis. We observe that, even with unsound static analysis, it is possible to achieve sound dynamic analysis for the executions which fall within the set of states statically considered. This leads us to a new approach, called optimistic hybrid analysis. We first profile a small set of executions and generate a set of likely invariants that hold true during most, but not necessarily all, executions. Next, we apply a much more precise, but unsound, static analysis that assumes these invariants hold true. Finally, we run the resulting dynamic analysis speculatively while verifying whether the assumed invariants hold true during that particular execution; if not, the program is reexecuted with a traditional hybrid analysis. Optimistic hybrid analysis is as precise and sound as traditional dynamic analysis, but is typically much faster because (1) unsound static analysis can speed up dynamic analysis much more than sound static analysis can and (2) verifications rarely fail. We apply optimistic hybrid analysis to race detection and program slicing and achieve 1.8x over a state-of-the-art race detector (FastTrack) optimized with traditional hybrid analysis and 8.3x over a hybrid backward slicer (Giri).
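The speculate-verify-fallback structure described above can be summarized in a short sketch. The callables (`run_optimized`, `run_traditional`) and the shape of their results are placeholders assumed for illustration; the paper's actual analyses instrument real program executions.

```python
def optimistic_hybrid_analysis(run_optimized, run_traditional, program_input):
    """Speculate-verify-fallback skeleton. run_optimized executes the dynamic
    analysis optimized under profiled likely invariants and reports whether
    any assumed invariant was violated during this particular execution;
    run_traditional is the sound, conservatively optimized fallback.
    Both callables are placeholders assumed for illustration."""
    result, invariant_violated = run_optimized(program_input)
    if invariant_violated:
        # Rare case: an assumption broke, so the speculative results may be
        # unsound for this run. Re-execute with the traditional hybrid analysis.
        result = run_traditional(program_input)
    return result
```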
Citations: 26
Session details: Session 7A: Irregular Apps and Graphs
Martha A. Kim
{"title":"Session details: Session 7A: Irregular Apps and Graphs","authors":"Martha A. Kim","doi":"10.1145/3252964","DOIUrl":"https://doi.org/10.1145/3252964","url":null,"abstract":"","PeriodicalId":302876,"journal":{"name":"Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127273522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Unconventional Parallelization of Nondeterministic Applications
E. A. Deiana, Vincent St-Amour, P. Dinda, N. Hardavellas, Simone Campanoni
The demand for thread-level-parallelism (TLP) on commodity processors is endless as it is essential for gaining performance and saving energy. However, TLP in today's programs is limited by dependences that must be satisfied at run time. We have found that for nondeterministic programs, some of these actual dependences can be satisfied with alternative data that can be generated in parallel, thus boosting the program's TLP. Satisfying these dependences with alternative data nonetheless produces final outputs that match those of the original nondeterministic program. To demonstrate the practicality of our technique, we describe the design, implementation, and evaluation of our compilers, autotuner, profiler, and runtime, which are enabled by our proposed C++ programming language extensions. The resulting system boosts the performance of six well-known nondeterministic and multi-threaded benchmarks by 158.2% (geometric mean) on a 28-core Intel-based platform.
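The paper's mechanism is delivered through proposed C++ language extensions; the toy Python sketch below only illustrates the underlying idea of satisfying a dependence with alternative data: when the program is nondeterministic in which value it produces (here, any unique label is acceptable), each worker can generate values from its own range instead of serializing on a shared counter. The labeling scenario and worker partitioning are assumptions for illustration, not the paper's system or benchmarks.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import count

def label_sequential(items):
    # Every iteration depends on the shared counter, serializing the loop.
    ctr = count()
    return {item: next(ctr) for item in items}

def label_parallel(items, workers=4):
    # The program only requires labels to be unique, not to follow arrival
    # order, so each worker draws from a disjoint range ("alternative data")
    # and the cross-iteration dependence disappears.
    chunks = [items[i::workers] for i in range(workers)]

    def work(widx, chunk):
        base = widx * 10**9                  # disjoint label space per worker
        return {item: base + i for i, item in enumerate(chunk)}

    with ThreadPoolExecutor(workers) as pool:
        parts = pool.map(work, range(workers), chunks)
    merged = {}
    for part in parts:
        merged.update(part)
    return merged
```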
Citations: 12
SPECTR: Formal Supervisory Control and Coordination for Many-core Systems Resource Management
A. Rahmani, Bryan Donyanavard, T. Mück, Kasra Moazzemi, A. Jantsch, O. Mutlu, N. Dutt
Resource management strategies for many-core systems need to enable sharing of resources such as power, processing cores, and memory bandwidth while coordinating the priority and significance of system- and application-level objectives at runtime in a scalable and robust manner. State-of-the-art approaches use heuristics or machine learning for resource management, but unfortunately lack formalism in providing robustness against unexpected corner cases. While recent efforts deploy classical control-theoretic approaches with some guarantees and formalism, they lack scalability and autonomy to meet changing runtime goals. We present SPECTR, a new resource management approach for many-core systems that leverages formal supervisory control theory (SCT) to combine the strengths of classical control theory with state-of-the-art heuristic approaches to efficiently meet changing runtime goals. SPECTR is a scalable and robust control architecture and a systematic design flow for hierarchical control of many-core systems. SPECTR leverages SCT techniques such as gain scheduling to allow autonomy for individual controllers. It facilitates automatic synthesis of the high-level supervisory controller and its property verification. We implement SPECTR on an Exynos platform containing ARM's big.LITTLE-based heterogeneous multi-processor (HMP) and demonstrate that SPECTR's use of SCT is key to managing multiple interacting resources (e.g., chip power and processing cores) in the presence of competing objectives (e.g., satisfying QoS vs. power capping). The principles of SPECTR are easily applicable to any resource type and objective as long as the management problem can be modeled using dynamical systems theory (e.g., difference equations), discrete-event dynamic systems, or fuzzy dynamics.
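A minimal sketch of the supervisory-control idea, assuming a supervisor that gain-schedules a single PI controller between a QoS objective and a power-cap objective. The mode names, gains, setpoints, and class names are hypothetical placeholders; SPECTR's controllers, objectives, and synthesis flow are considerably richer.

```python
class PIController:
    def __init__(self, kp, ki):
        self.kp, self.ki, self.err_sum = kp, ki, 0.0

    def step(self, setpoint, measured):
        err = setpoint - measured
        self.err_sum += err
        return self.kp * err + self.ki * self.err_sum   # abstract actuation knob

class Supervisor:
    """Toy supervisory layer that gain-schedules the low-level controller
    according to the active objective. Modes, gains, and setpoints are
    hypothetical placeholders, not SPECTR's."""
    MODES = {
        "meet_qos":  {"kp": 0.8, "ki": 0.10, "setpoint": 30.0},  # e.g. target latency (ms)
        "power_cap": {"kp": 0.3, "ki": 0.05, "setpoint": 15.0},  # e.g. target power (W)
    }

    def __init__(self, mode="meet_qos"):
        self.set_mode(mode)

    def set_mode(self, mode):
        cfg = self.MODES[mode]
        self.controller = PIController(cfg["kp"], cfg["ki"])
        self.setpoint = cfg["setpoint"]

    def actuate(self, measured):
        # Returns e.g. a DVFS-level or core-allocation delta for the plant.
        return self.controller.step(self.setpoint, measured)
```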
Citations: 50
Session details: Session 8A: Security and Protection
J. Criswell
{"title":"Session details: Session 8A: Security and Protection","authors":"J. Criswell","doi":"10.1145/3252966","DOIUrl":"https://doi.org/10.1145/3252966","url":null,"abstract":"","PeriodicalId":302876,"journal":{"name":"Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130481395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Session details: Session 5A: Concurrency and Parallelism
H. Hoffmann
{"title":"Session details: Session 5A: Concurrency and Parallelism","authors":"H. Hoffmann","doi":"10.1145/3252960","DOIUrl":"https://doi.org/10.1145/3252960","url":null,"abstract":"","PeriodicalId":302876,"journal":{"name":"Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems","volume":"143 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130978816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Session details: Session 2A: GPUs 1
C. Rossbach
{"title":"Session details: Session 2A: GPUs 1","authors":"C. Rossbach","doi":"10.1145/3252954","DOIUrl":"https://doi.org/10.1145/3252954","url":null,"abstract":"","PeriodicalId":302876,"journal":{"name":"Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132197546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Reconfigurable Energy Storage Architecture for Energy-harvesting Devices
A. Colin, E. Ruppel, Brandon Lucia
Battery-free, energy-harvesting devices operate using energy collected exclusively from their environment. Energy-harvesting devices allow maintenance-free deployment in extreme environments, but require a power system to provide the right amount of energy when an application needs it. Existing systems must provision energy capacity statically based on an application's peak demand, which compromises efficiency and responsiveness when not at peak demand. This work presents Capybara: a co-designed hardware/software power system with dynamically reconfigurable energy storage capacity that meets varied application energy demand. The Capybara software interface allows programmers to specify the energy mode of an application task. Capybara's runtime system reconfigures Capybara's hardware energy capacity to match application demand. Capybara also allows a programmer to write reactive application tasks that pre-allocate a burst of energy that they can spend in response to an asynchronous (e.g., external) event. We instantiated Capybara's hardware design in two EH devices and implemented three reactive sensing applications using its software interface. Capybara improves event detection accuracy by 2x-4x over statically-provisioned energy capacity, maintains response latency within 1.5x of a continuously-powered baseline, and enables reactive applications that are intractable with existing power systems.
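A toy model of the reconfiguration idea, assuming a bank of fixed-size capacitor units that the runtime sizes to a task's declared energy demand, with an optional reserved burst for reactive tasks. The unit size, method names, and accounting below are illustrative assumptions, not Capybara's hardware design or software interface.

```python
class CapacitorBank:
    """Toy model of a reconfigurable bank of small energy-storage units. The
    runtime connects just enough units to cover a task's declared energy mode
    and can reserve extra units as a pre-allocated burst for reactive tasks."""
    UNIT_UJ = 50                               # energy per unit, microjoules (assumed)

    def __init__(self, n_units=16):
        self.n_units, self.active, self.reserved = n_units, 0, 0

    def configure_for(self, task_energy_uj, burst_energy_uj=0):
        need = -(-task_energy_uj // self.UNIT_UJ)        # ceiling division
        burst = -(-burst_energy_uj // self.UNIT_UJ)
        if need + burst > self.n_units:
            raise ValueError("declared demand exceeds bank capacity")
        self.active, self.reserved = need, burst

    def on_event(self):
        # A reactive task fires: make the pre-allocated burst usable at once.
        self.active += self.reserved
        self.reserved = 0

bank = CapacitorBank()
bank.configure_for(task_energy_uj=220, burst_energy_uj=100)   # 5 + 2 units
```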
Citations: 167
Potluck: Cross-Application Approximate Deduplication for Computation-Intensive Mobile Applications
Peizhen Guo, Wenjun Hu
Emerging mobile applications, such as cognitive assistance and augmented reality (AR) based gaming, are increasingly computation-intensive and latency-sensitive, while running on resource-constrained devices. The standard approaches to addressing these involve either offloading to a cloud(let) or local system optimizations to speed up the computation, often trading off computation quality for low latency. Instead, we observe that these applications often operate on similar input data from the camera feed and share common processing components, both within the same (type of) applications and across different ones. Therefore, deduplicating processing across applications could deliver the best of both worlds. In this paper, we present Potluck, to achieve approximate deduplication. At the core of the system is a cache service that stores and shares processing results between applications and a set of algorithms to process the input data to maximize deduplication opportunities. This is implemented as a background service on Android. Extensive evaluation shows that Potluck can reduce the processing latency for our AR and vision workloads by a factor of 2.5 to 10.
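A minimal sketch of the approximate-deduplication cache, assuming inputs are summarized as feature vectors and a Euclidean distance threshold decides whether a stored result can be reused. The feature extraction, distance metric, threshold policy, and class/function names are assumptions for illustration, not Potluck's actual algorithms.

```python
import math

class ApproxCache:
    """Approximate-deduplication cache keyed by an input feature vector; a
    lookup reuses a stored result when some previous input is close enough."""
    def __init__(self, threshold=0.1):
        self.entries = []                      # list of (features, result)
        self.threshold = threshold

    @staticmethod
    def _dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def lookup(self, features):
        best = min(self.entries, key=lambda e: self._dist(features, e[0]),
                   default=None)
        if best is not None and self._dist(features, best[0]) <= self.threshold:
            return best[1]                     # reuse a result across frames/apps
        return None

    def insert(self, features, result):
        self.entries.append((features, result))

def process(features, cache, expensive_fn):
    # Consult the shared cache before running the expensive vision pipeline.
    cached = cache.lookup(features)
    if cached is not None:
        return cached
    result = expensive_fn(features)
    cache.insert(features, result)
    return result
```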
Citations: 57