
2015 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS): Latest Papers

Analysis and optimization of soft error tolerance strategies for real-time systems
Bowen Zheng, Yue Gao, Qi Zhu, S. Gupta
The safety of real-time embedded systems relies on both functional and timing correctness. On the timing side, real-time constraints are set on task executions, and missing them may lead to system failure. On the functional side, soft errors have become a major concern. Various soft error tolerance strategies have been proposed for soft error detection and recovery; however, they may introduce significant computation overhead and cause timing violations. In this work, we address the two aspects in an integrated framework and propose a set of formulations to quantitatively model the impact of soft error detection and recovery mechanisms on real-time constraints. The formulations help designers analyze system feasibility under fault tolerance requirements and compare various architecture platforms. They may also help select the appropriate error tolerance mechanisms for software tasks, together with exploring task scheduling and allocation on representative single-core, multicore and distributed platforms, to maximize error coverage while meeting real-time constraints. Experiments on an industrial case study and synthetic examples demonstrate the effectiveness of our approach.
DOI: https://doi.org/10.5555/2830840.2830847 (published 2015-10-04)
Citations: 20
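The interaction between error-tolerance overhead and timing feasibility described in the abstract above can be illustrated with a small sketch. It is not the paper's formulation: it assumes independent, preemptive periodic tasks with implicit deadlines on a single core scheduled by EDF, detection modeled as a fixed per-run overhead, and recovery modeled as a budgeted number of full re-executions. The names `Task`, `protected_utilization` and `edf_feasible` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    wcet: float             # worst-case execution time without protection
    period: float           # period, also used as the implicit deadline
    detect_overhead: float  # assumed extra time for error detection per run
    reexecutions: int       # assumed number of recovery re-executions budgeted


def protected_utilization(tasks):
    """Utilization when every run pays detection and the full recovery budget."""
    return sum(
        (t.wcet + t.detect_overhead) * (1 + t.reexecutions) / t.period
        for t in tasks
    )


def edf_feasible(tasks):
    """Classic single-core EDF test for implicit deadlines: utilization <= 1."""
    return protected_utilization(tasks) <= 1.0


tasks = [Task(1.0, 10.0, 0.2, 1), Task(2.0, 20.0, 0.5, 0)]
print(protected_utilization(tasks), edf_feasible(tasks))
```

Budgeting the full recovery cost in every period is conservative; it shows how adding protection to more tasks raises utilization until the task set stops being schedulable.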
A parallelizable approach for mining likely invariants
Alessandro Danese, Luca Piccolboni, G. Pravadelli
A relevant aspect of design analysis and verification is monitoring how logic relations among different variables change at run time. Current static approaches suffer from scalability problems that prevent their adoption on large designs. In contrast, dynamic techniques scale better in terms of memory consumption. However, to achieve high accuracy, they must analyse a huge number of (long) execution traces, which results in time-consuming phases. In this paper, we present a new efficient approach to automatically infer logic relations among the variables of a design implementation. Both a sequential and a GPU-oriented parallel implementation are proposed to dynamically extract likely invariants from execution traces on different time windows. Execution traces composed of millions of simulation instants can be efficiently analysed.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331382 (published 2015-10-04)
Citations: 4
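As a rough illustration of mining likely invariants over time windows, the sketch below checks two relation templates (`==` and `<=`) for every ordered pair of variables and keeps those that hold at every instant of a window. The trace layout (a list of dictionaries), the template set, the non-overlapping windows and the function names are all assumptions made for this example; the paper's algorithm and its GPU implementation are not reproduced here.

```python
from itertools import permutations
import operator

# Assumed relation templates; a real miner would support many more.
TEMPLATES = {"==": operator.eq, "<=": operator.le}


def mine_window(window):
    """Return relations (a, op, b) that hold at every instant of the window."""
    variables = sorted(window[0])
    found = set()
    for a, b in permutations(variables, 2):
        for name, op in TEMPLATES.items():
            if all(op(instant[a], instant[b]) for instant in window):
                found.add((a, name, b))
    return found


def mine_trace(trace, window_size):
    """Mine each non-overlapping window independently."""
    return [
        mine_window(trace[i:i + window_size])
        for i in range(0, len(trace) - window_size + 1, window_size)
    ]


trace = [
    {"a": 1, "b": 1, "c": 3},
    {"a": 2, "b": 2, "c": 2},
    {"a": 0, "b": 1, "c": 5},
    {"a": 1, "b": 4, "c": 4},
]
result = mine_trace(trace, window_size=2)
print(result)
```

Because each window is mined without reference to the others, the per-window work could in principle be distributed across parallel workers.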
Transparent and portable agent based task migration for data-flow applications on multi-tiled architectures
Ashraf El Antably, O. Gruber, F. Rousseau, Nicolas Fournel
Fully distributed memory multi-processors (MPSoC) implemented in multi-tiled architectures are promising solutions to support modern sophisticated applications; however, the reliability of such systems is always an issue. As a result, system-level solutions such as task migration remain important. Transferring the execution of a task from one tile to another helps maintain acceptable reliability in such systems. A tile contains at least one processor and associated peripherals, with a communication device responsible for inter-tile communications. In this work, we propose a task migration technique that targets data-flow applications running on multi-tiled architectures. This technique uses a middleware layer that makes it transparent to application programmers and eases its portability across different multi-tiled architectures. It can be deployed on small operating systems that support neither an MMU nor dynamic loading of task code. We show that this technique is operational on a real x86-based hardware platform. Experimental results show low overhead in both memory and performance, without much variance.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331381 (published 2015-10-04)
Citations: 10
Lightweight virtual memory support for many-core accelerators in heterogeneous embedded SoCs
Pirmin Vogel, A. Marongiu, L. Benini
While high-end heterogeneous systems are increasingly supporting heterogeneous uniform memory access (hUMA) as envisioned by the Heterogeneous System Architecture (HSA) foundation, their low-power counterparts targeting the embedded domain still lack basic features like virtual memory support for accelerators. Instead of simply passing virtual address pointers, explicit data management involving copies is needed to share data between the host processor and accelerators, which hampers programmability and performance. In this work, we present a mixed hardware/software solution to enable lightweight virtual memory support for many-core accelerators in heterogeneous embedded systems-on-chip (SoCs). Based on an input/output translation lookaside buffer (IOTLB), efficiently managed by a kernel-level driver module running on the host, our solution features a considerably lower design complexity compared to conventional input/output memory management units. Using our evaluation platform based on the Xilinx Zynq-7000 SoC with a many-core accelerator implemented in the programmable logic, we demonstrate the effectiveness of our solution and the benefits of virtual memory support for embedded heterogeneous SoCs.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331367 (published 2015-10-04)
Citations: 14
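The idea of a translation lookaside buffer whose misses are served by host software can be modeled in a few lines. This is only a behavioral sketch under stated assumptions: a fully associative buffer, LRU replacement, 4 KiB pages, and a page table represented as a dictionary. None of these details come from the paper, and the class name `SoftwareManagedTLB` is hypothetical.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size


class SoftwareManagedTLB:
    def __init__(self, entries, page_table):
        self.entries = entries
        self.page_table = page_table  # virtual page number -> physical frame number
        self.tlb = OrderedDict()      # cached translations, least recently used first
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.tlb:
            self.tlb.move_to_end(vpn)         # hit: mark as most recently used
        else:
            self.misses += 1                  # miss: the "driver" refills the entry
            if len(self.tlb) >= self.entries:
                self.tlb.popitem(last=False)  # evict the least recently used entry
            self.tlb[vpn] = self.page_table[vpn]  # KeyError models an unmapped page
        return self.tlb[vpn] * PAGE_SIZE + offset


tlb = SoftwareManagedTLB(entries=2, page_table={0: 7, 1: 3, 2: 9})
print(tlb.translate(4100), tlb.misses)
```

With this model the accelerator side only ever issues virtual addresses; all knowledge of the page table stays in the refill path, which is where the host driver would sit.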
Big/little deep neural network for ultra low power inference
Eunhyeok Park, Dongyoung Kim, Soobeom Kim, Yong-Deok Kim, Gunhee Kim, Sungroh Yoon, S. Yoo
Deep neural networks (DNNs) have recently proved their effectiveness in complex data analyses such as object/speech recognition. As their applications are being expanded to mobile devices, their energy efficiency is becoming critical. In this paper, we propose a novel concept called big/LITTLE DNN (BL-DNN) which significantly reduces the energy consumption required for DNN execution at a negligible loss of inference accuracy. The BL-DNN consists of a little DNN (consuming low energy) and a full-fledged big DNN. In order to reduce energy consumption, the BL-DNN aims at avoiding the execution of the big DNN whenever possible. The key idea is to execute the little DNN first for inference (without big DNN execution) and simply use its result as the final inference result as long as the result is estimated to be accurate. On the other hand, if the result from the little DNN is not considered to be accurate, the big DNN is executed to give the final inference result. This approach reduces the total energy consumption by obtaining the inference result with only the little, energy-efficient DNN in most cases, while maintaining a similar level of inference accuracy by selectively executing the big DNN. We present design-time and runtime methods to control the execution of the big DNN under a trade-off between energy consumption and inference accuracy. Experiments with state-of-the-art DNNs for ImageNet and MNIST show that our proposed BL-DNN can offer up to 53.7% (ImageNet) and 94.1% (MNIST) reductions in energy consumption at a loss of 0.90% (ImageNet) and 0.12% (MNIST) in inference accuracy, respectively.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331375 (published 2015-10-04)
Citations: 108
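The little-first, big-on-demand control flow described in the abstract above can be sketched as a two-stage cascade. The abstract does not say how the little DNN's result is "estimated to be accurate", so this example assumes a simple rule: accept the little result when the gap between its top two class scores reaches a threshold. The function `cascade_predict` and the stub models are hypothetical stand-ins for real networks.

```python
def cascade_predict(x, little, big, threshold):
    """Run the little model first; fall back to the big model when unsure.

    `little` and `big` are callables returning a list of class scores.
    Confidence is the top-1 minus top-2 score gap (an assumed criterion).
    """
    scores = little(x)
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    margin = scores[ranked[0]] - scores[ranked[1]]
    if margin >= threshold:
        return ranked[0], "little"
    big_scores = big(x)
    return max(range(len(big_scores)), key=big_scores.__getitem__), "big"


# Stub models standing in for real networks.
def little_model(x):
    return [0.9, 0.05, 0.05] if x == "easy" else [0.4, 0.35, 0.25]


def big_model(x):
    return [0.1, 0.8, 0.1]


print(cascade_predict("easy", little_model, big_model, threshold=0.5))
print(cascade_predict("hard", little_model, big_model, threshold=0.5))
```

Raising the threshold sends more inputs to the big model, trading energy for accuracy; lowering it does the opposite.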
Power-awareness and smart-resource management in embedded computing systems
M. Santambrogio, J. Ayala, Simone Campanoni, Riccardo Cattaneo, Gianluca Durelli, M. Ferroni, A. A. Nacci, Josué Pagán, Marina Zapater, Mónica Vallejo
Resources such as quantities of transistors and memory, the level of integration and the speed of components have increased dramatically over the years. Even though the technologies have improved, we continue to apply outdated approaches to our use of these resources. Key computer science abstractions have not changed since the 1960s. Therefore, this is the time for a fresh approach to the way systems are designed and used.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331372 (published 2015-10-04)
Citations: 1
A tiny-capacitor-backed non-volatile buffer to reduce storage writes in smartphones
Mungyu Son, Junwhan Ahn, S. Yoo
Mobile storage writes are often dominated by writes to SQLite database files. Our characterization shows that they mostly consist of frequent overwrites with small new data (which we call small writes) and relatively infrequent writes with large data updates. In order to reduce writes to the Flash memory on smartphones, we propose exploiting these characteristics and present a low-cost nonvolatile write buffer for write coalescing. The key challenge is that the stringent resource constraints of mobile devices force the write buffer size to be minimized down to a single Flash page in order to reduce the overhead of the SRAM buffer on the controller chip and of a backing capacitor that maintains non-volatility of the buffer on power failure. As a solution to this problem, we propose three optimizations that make the best use of this small single-page nonvolatile write buffer. First, we propose managing only the difference between old and new data (i.e., differential logs) in the write buffer, based on the observation that small writes are frequent. Second, we develop a dynamic bypass scheme which judiciously bypasses overwrite-unfriendly pages from the write buffer. Third, we devise an incremental flush policy which controls the number of write buffer entries to be flushed according to the size of the newly written data. According to our experiments using four representative mobile applications on a real storage platform, OpenSSD, the proposed method gives average reductions of 69.5% and 64.5% in Flash memory writes in single- and multi-application runs, respectively. In addition, our scheme introduces a very small cost into existing systems, including 8-18.5KB SRAM on the controller chip and a tiny capacitor occupying only 1.7% of eMMC package volume.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331364 (published 2015-10-04)
Citations: 4
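The first optimization in the abstract above, keeping only differences between old and new data in the write buffer, can be illustrated with a toy model. It makes several simplifying assumptions that are not from the paper: one log entry per changed byte with no metadata cost, a full flush when the buffer overflows (rather than the incremental flush policy), no bypass scheme, and a 16-byte page. The class name `DiffWriteBuffer` is hypothetical.

```python
class DiffWriteBuffer:
    def __init__(self, capacity_bytes, page_size=16):
        self.capacity = capacity_bytes
        self.page_size = page_size
        self.flash = {}  # page number -> bytearray of persisted contents
        self.logs = {}   # (page, offset) -> pending byte value (the differential log)
        self.flash_page_writes = 0

    def _current(self, page, offset):
        """Latest value of a byte: pending log entry first, then flash."""
        if (page, offset) in self.logs:
            return self.logs[(page, offset)]
        return self.flash.get(page, bytearray(self.page_size))[offset]

    def write(self, page, offset, data):
        for i, byte in enumerate(data):
            if self._current(page, offset + i) != byte:  # log only real changes
                self.logs[(page, offset + i)] = byte     # overwrites coalesce here
        if len(self.logs) > self.capacity:
            self.flush()

    def flush(self):
        dirty_pages = {page for page, _ in self.logs}
        for (page, offset), byte in self.logs.items():
            self.flash.setdefault(page, bytearray(self.page_size))[offset] = byte
        self.flash_page_writes += len(dirty_pages)       # one program per dirty page
        self.logs.clear()


buf = DiffWriteBuffer(capacity_bytes=8)
buf.write(0, 0, b"ab")
buf.write(0, 0, b"ab")  # identical overwrite: nothing new is logged
buf.write(0, 1, b"c")   # small overwrite: replaces one pending byte
buf.flush()
print(buf.flash_page_writes, bytes(buf.flash[0][:2]))
```

In this run three host writes to the same page end up as a single flash page write, which is the coalescing effect the buffer is meant to provide.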
Run-DMC: Runtime dynamic heterogeneous multicore performance and power estimation for energy efficiency
T. Mück, S. Sarma, N. Dutt
In this paper we propose Run-DMC, an accurate runtime performance and power estimation scheme for dynamic workloads executing on heterogeneous multicore systems. In contrast to previous works, Run-DMC uses fine-grained per-thread metrics that model the Thread Load Contribution (TLC) induced by the native OS scheduling policy to accurately predict performance and power for any possible thread-to-core mapping. This allows the operating system to opportunistically exploit the heterogeneous multicore architecture by dynamically mapping workloads to the most appropriate core type. We have integrated our models into the Linux kernel running on top of a heterogeneous multicore system with 4 different core types. Our experimental results show that Run-DMC models yield up to 97% better energy efficiency when compared to vanilla Linux. When compared to the approach employed by state-of-the-art energy-aware schedulers, Run-DMC yields up to 44% better energy efficiency.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331380 (published 2015-10-04)
Citations: 25
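Once performance and power can be predicted for each thread on each core type, picking a mapping reduces to a search over candidate assignments. The sketch below assumes the predictions are already available as a table (the estimation models themselves are not reproduced), that each core hosts at most one thread, and that energy efficiency is total instructions per second divided by total watts. The function `best_mapping` and the sample numbers are hypothetical.

```python
from itertools import permutations


def best_mapping(predictions, cores):
    """Exhaustively pick the thread-to-core mapping with the best efficiency.

    predictions[thread][core_type] = (instructions_per_second, watts)
    cores is a list of core type names; each core hosts at most one thread.
    """
    threads = sorted(predictions)
    best, best_eff = None, float("-inf")
    for assignment in permutations(range(len(cores)), len(threads)):
        pairs = list(zip(threads, assignment))
        ips = sum(predictions[t][cores[c]][0] for t, c in pairs)
        watts = sum(predictions[t][cores[c]][1] for t, c in pairs)
        eff = ips / watts  # instructions per joule
        if eff > best_eff:
            best, best_eff = dict(pairs), eff
    return best, best_eff


cores = ["big", "little"]
predictions = {
    "t0": {"big": (4.0, 2.0), "little": (1.5, 0.5)},
    "t1": {"big": (2.0, 2.0), "little": (1.8, 0.5)},
}
mapping, efficiency = best_mapping(predictions, cores)
print(mapping, efficiency)
```

Exhaustive search is only practical for a handful of threads and cores; it is used here to keep the selection criterion explicit.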
dsReliM: Power-constrained reliability management in Dark-Silicon many-core chips under process variations
M. Salehi, M. Shafique, F. Kriebel, Semeen Rehman, Mohammad Khavari Tavana, A. Ejlali, J. Henkel
Due to the tight power envelope, it is envisaged that in future technology nodes not all cores in a many-core chip can be simultaneously powered on (at full performance level). The power-gated cores are referred to as Dark Silicon. At the same time, growing reliability issues due to process variations and soft errors challenge the cost-effective deployment of future technology nodes. This paper presents a reliability management system for Dark Silicon chips (dsReliM) that optimizes for reliability of on-chip systems while jointly accounting for soft errors, process variations and the thermal design power (TDP) constraint. Towards the TDP-constrained reliability optimization, dsReliM leverages multiple reliable application versions that can potentially execute on different cores with frequency variations and supporting different voltage-frequency levels, thus facilitating distinct power, reliability and performance tradeoffs at run time. Experiments show that our dsReliM system provides up to 20% reliability improvements under different TDP constraints when compared to a state-of-the-art technique. Also, compared to an ideal-case optimal solution, dsReliM deviates up to 2.5% in terms of reliability efficiency, but speeds up the reliability management decision time by a factor of up to 3100.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331370 (published 2015-10-04)
Citations: 20
Design methodologies for securing cyber-physical systems
M. A. Faruque, F. Regazzoni, M. Pajic
Cyber-Physical Systems (CPS) are in most cases safety- and mission-critical. Standard design techniques used for securing embedded systems are not suitable for CPS due to the restricted computation and communication budget available in the latter. In addition, the sensitivity of sensed data and the presence of actuation components further increase the security requirements of CPS. To address these issues, it is necessary to provide new design methods in which security is considered from the beginning of the whole design flow and addressed in a holistic way. In this paper, we focus on the design of secure CPS as part of the complete CPS design process, and provide insights into new requirements on platform-aware design of control components, design methodologies and architectures posed by CPS design. We start by discussing methods for the multi-disciplinary modeling, simulation, tools, and software synthesis challenges for CPS. We also present a framework for design of secure control systems for CPS, while taking into account properties of the underlying computation and communication platforms. Finally, we describe the security challenges in the computing hardware that is used in CPS.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331365 (published 2015-10-04)
Citations: 32
Journal
2015 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)