2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED): Latest Articles

Invited paper: Low power requirements and side-channel protection of encryption engines: Challenges and opportunities

2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009186

Monodeep Kar, Arvind Singh, S. Mathew, Anand Rajan, V. De, S. Mukhopadhyay

Power attacks are a critical challenge to the security of encryption engines. Countermeasures to side-channel attacks often come at a high power, area, or performance overhead. Therefore, the design of side-channel-secure encryption engines is a critical challenge for power- and resource-constrained platforms. This paper argues that although low-power requirements pose a critical challenge for side-channel security, circuit techniques traditionally developed for power management also present new opportunities for side-channel resistance. As a case study, we show the feasibility of using an integrated voltage regulator, normally used for efficient power management, to increase the side-channel resistance of AES engines.

Citations: 4

A low-power APUF-based environmental abnormality detection framework

2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009194

Hongxiang Gu, T. Xu, M. Potkonjak

Physical unclonable functions (PUFs) take advantage of the effect of process variation on hardware to obtain their unclonability. Traditional PUF design focuses only on the analog signals of circuits. An arbiter PUF, for example, generates responses by racing delay signals. Implementations of such PUFs usually incur large area and power consumption while providing very low throughput. To address this problem, we propose an energy-efficient PUF design that races analog signals and computes digital logic simultaneously. More importantly, the analog portion of the circuit (racing) shares a large amount of hardware resources with the digital portion of the circuit (computing), introducing only a small overhead in terms of area and power. Our test results on Spartan-6 field-programmable gate array (FPGA) platforms indicate that, by combining the two outputs, our design enables much larger PUF output throughput, better randomness, and lower power consumption compared to traditional PUFs.

Citations: 1
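
The abstract above mentions that an arbiter PUF generates responses by racing delay signals. As a rough illustration only, the sketch below uses the commonly cited additive delay model of a conventional arbiter PUF; it is not the authors' combined analog/digital design. The names `make_arbiter_puf` and `response`, and the Gaussian stand-in for process variation, are assumptions made for this example.

```python
import random

def make_arbiter_puf(n_stages, seed=0):
    """Create a hypothetical PUF instance: one pair of delay differences per stage.

    The Gaussian draw is an assumed stand-in for process variation.
    """
    rng = random.Random(seed)
    # (delay difference when the stage passes straight, when it crosses)
    return [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_stages)]

def response(puf, challenge):
    """Return the 1-bit response for a challenge (a list of 0/1 bits)."""
    delta = 0.0  # accumulated delay difference between the two racing signals
    for (d_straight, d_cross), bit in zip(puf, challenge):
        if bit == 0:
            delta = delta + d_straight
        else:
            # A crossed stage swaps the two paths, so the sign flips.
            delta = -delta + d_cross
    # The arbiter outputs 1 if one signal wins the race, 0 otherwise.
    return 1 if delta > 0 else 0
```

Each challenge yields a single response bit in this model, which is consistent with the low throughput the abstract attributes to traditional designs.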

Spin-torque sensors with differential signaling for fast and energy efficient global interconnects

2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009155

Z. Azim, K. Roy

We propose a hybrid global interconnect that combines Spin-Torque (ST) sensors with differential amplifiers to greatly reduce the overall power consumption while minimizing the delay along the line. ST-sensor based interconnects have recently been proposed that show significant energy efficiency compared to conventional full-swing CMOS interconnects. However, the latency of ST-sensor interconnects can be rather high due to inefficient signal regeneration along the line. As a solution, we propose the use of differential amplifiers as repeaters, along with an ST sensor as the receiver, to reduce the interconnect delay. Moreover, the introduction of differential signaling greatly increases the robustness of the design against noise and variations. Our simulation results indicate that for a 10 mm line in 45 nm CMOS technology, the energy consumption with the hybrid ST-sensor interconnect is ∼5× lower compared to a full-swing CMOS interconnect while operating at similar speed. Moreover, the energy consumption is ∼2× lower compared to a low-swing CMOS interconnect, in addition to a significant improvement in latency.

Citations: 0

Battery assignment and scheduling for drone delivery businesses

2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009165

Sangyoung Park, Licong Zhang, S. Chakraborty

Recent advances in battery and drone technologies have opened up possibilities for the commercial use of drones. Private companies are looking into the possibilities of using drones for commercial deliveries from legal, technical, and economic perspectives. Nevertheless, the battery management perspective of such businesses has not yet been thoroughly investigated. In this paper, we identify that battery management in such applications has a major impact on costs, and formulate an optimization problem to reduce the aging of batteries. We identify two sub-problems, battery assignment and battery scheduling, to derive a solution that minimizes the aging of the batteries. We show that the formulation enables leveraging the trade-off between the packet waiting time and the battery purchasing cost. The experimental results show that the proposed method reduces the electricity and battery purchasing cost by 25%, and the average packet waiting time by more than 50%.

Citations: 30

A simple yet efficient accuracy configurable adder design

2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009206

Wenbin Xu, S. Sapatnekar, Jiang Hu

Approximate computing is a promising approach for low-power IC design and has recently received considerable research attention. To accommodate dynamic levels of approximation, a few accuracy-configurable adder designs have been developed in the past. However, these designs tend to incur large area overheads as they rely on either redundant computing or complicated carry prediction. Some of these designs include error detection and correction circuitry, which further increases area. In this work, we investigate a simple accuracy-configurable adder design that contains no redundancy or error detection/correction circuitry and uses very simple carry prediction. Simulation results show that our design outperforms the most recent prior work in the accuracy-delay-power tradeoff while using 39% less area. Moreover, we propose a delay-adaptive self-configuration technique to further improve the accuracy-delay-power tradeoff.

Citations: 56
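
To make the idea of an accuracy-configurable adder with simple carry prediction concrete, here is a minimal behavioral sketch. It is a generic block-based carry-speculation scheme, not the design from the paper above, whose circuit details the abstract does not give. The function name `configurable_add`, the 4-bit block size, and the prediction rule are all assumptions chosen for illustration.

```python
def configurable_add(a, b, width=16, block=4, approximate=True):
    """Add two unsigned integers block by block.

    In accurate mode the true carry ripples between blocks.
    In approximate mode the carry into the next block is predicted from the
    current block's operands alone (assumed rule), which shortens the carry
    chain at the cost of occasional errors.
    """
    mask = (1 << block) - 1
    result = 0
    carry = 0
    for shift in range(0, width, block):
        x = (a >> shift) & mask
        y = (b >> shift) & mask
        s = x + y + carry
        result |= (s & mask) << shift
        if approximate:
            # Predicted carry: ignores the carry that entered this block.
            carry = (x + y) >> block
        else:
            # Exact carry out of this block.
            carry = s >> block
    return result & ((1 << width) - 1)
```

With these assumptions, `configurable_add(0x1234, 0x1111)` gives `0x2345` in both modes, while `configurable_add(0x00FF, 0x0001)` gives `0x0100` in accurate mode and `0x0000` in approximate mode, because the carry has to travel across two blocks.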

Bit-width reduction and customized register for low cost convolutional neural network accelerator

2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009164

Kyungrak Choi, Woong Choi, Kyungho Shin, Jongsun Park

This paper presents a low-area, energy-efficient hardware accelerator for deep convolutional neural networks (CNNs). Building on a multiply-accumulate (MAC) based architecture, three design techniques are proposed to reduce the hardware cost of the convolutional computations. First, to reduce the computational bit-width of convolutions, an adaptive bit-width reduction scheme is proposed based on a differential input method. The bit-width reduction approach can reduce the operation bit-width by 37% with almost negligible degradation in CNN accuracy. Second, it has been found that adopting a bi-directional filtering window in the CNN accelerator can considerably reduce the energy for data movement, with a much smaller number of memory accesses. To expedite the bi-directional filtering operations, we also propose a bidirectional first-input-first-output (bi-FIFO). Using an SRAM bit-cell style layout, the proposed bi-FIFO facilitates fast data re-distribution with area and energy efficiency. To verify the effectiveness of the proposed techniques, an AlexNet accelerator has been designed. The numerical results show that the proposed adaptive bit-width reduction scheme achieves area and energy savings of 25.9% and 47.3%, respectively. The bi-FIFO based accelerator also achieves a 33% improvement in processing time.

Citations: 5
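
The bit-width reduction in the abstract above is described as being based on a differential input method. The abstract does not spell out the scheme, so the following sketch only illustrates the general intuition behind differential (delta) encoding: neighboring inputs tend to be similar, so their differences need fewer bits than the raw values. The helper names `signed_bits`, `delta_encode`, and `delta_decode` and the sample values are hypothetical.

```python
def signed_bits(v):
    """Number of bits needed to store v in two's complement."""
    magnitude_bits = v.bit_length() if v >= 0 else (~v).bit_length()
    return magnitude_bits + 1  # one extra bit for the sign

def delta_encode(values):
    """Keep the first value and replace the rest by differences to the previous value."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out

# Assumed sample of neighboring input values (for illustration only).
pixels = [120, 122, 121, 125, 124]
deltas = delta_encode(pixels)
raw_width = max(signed_bits(v) for v in pixels)
delta_width = max(signed_bits(d) for d in deltas[1:])
```

For this assumed sample the raw values need 8 signed bits, while the differences after the first element fit in 4, and `delta_decode(deltas)` recovers the original list exactly.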

Low design overhead timing error correction scheme for elastic clock methodology

2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009203

Sungju Ryu, Jongeun Koo, Jae-Joon Kim

The elastic clock scheme is a robust design methodology that ensures timing closure under PVT variation using locally generated clocks and a handshaking protocol. However, timing errors can still occur due to delay mismatch between the data-path and the delay replica. In this paper, we propose a low design overhead timing error correction scheme tailored to the elastic clock scheme. In the proposed scheme, a timing error can be corrected within a cycle using clock stretching. The proposed scheme shows 40.3× and 4.6× reductions in timing margin, with 9.1% and 9.0% area overhead, over the synchronous baseline and the elastic clock design, respectively.

Citations: 2

Efficient query processing in crossbar memory

2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009204

M. Imani, Saransh Gupta, Atl Arredondo, T. Simunic

Today's computing systems use a huge amount of energy and time to process basic database queries. A large part of this is spent on data movement between the memory and the processing cores, owing to the limited cache capacity and memory bandwidth of traditional computers. In this paper, we propose a non-volatile memory-based query accelerator, called NVQuery, which performs several basic query functions in memory, including aggregation, prediction, bit-wise operations, and exact and nearest distance search queries. NVQuery is implemented on a content addressable memory (CAM) and exploits the analog characteristics of non-volatile memory to enable in-memory processing. To implement nearest distance search in memory, we introduce a novel bitline driving scheme that gives weights to the indices of the bits during the search operation. Our experimental evaluation shows that NVQuery can provide a 49.3× performance speedup and 32.9× energy savings compared to running the same query on a traditional processor. In addition, compared to state-of-the-art query accelerators, NVQuery can achieve a 26.2× energy-delay product improvement while providing similar accuracy.

Citations: 22
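
The abstract above says that nearest distance search gives weights to the indices of the bits during the search. As a software analogy only, the sketch below scores each stored row by a bit-weighted mismatch count. The weights of 2^i for bit index i are an assumption for this example (the abstract does not state the weights), and the names `weighted_mismatch` and `nearest` are hypothetical; the real design works on analog CAM match lines rather than in software.

```python
def weighted_mismatch(a, b, width=8):
    """Sum an assumed weight of 2**i for every bit position i where a and b differ."""
    diff = a ^ b
    return sum((1 << i) for i in range(width) if (diff >> i) & 1)

def nearest(query, rows, width=8):
    """Return the index of the stored row with the smallest weighted mismatch."""
    return min(range(len(rows)),
               key=lambda r: weighted_mismatch(query, rows[r], width))
```

Weighting matters because a plain mismatch count treats all bits equally. For the query `0b10000000`, the row `0b00000000` differs in one bit and the row `0b10000111` in three, yet the second is numerically much closer; with the assumed weights, `nearest` selects the second row.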

E-Spector: Online energy inspection for Android applications

2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009207

Chengke Wang, Yao Guo, Peng Shen, Xiangqun Chen

Energy consumption is one of the most important aspects of mobile apps. During energy testing, it is important for developers to understand not only the energy consumption rate of an app, but also why energy is consumed. However, existing energy testing tools are more concerned with the accuracy of energy estimation, and typically do not explain why and how exactly energy has been consumed. This paper presents E-Spector, an online energy inspection method for Android apps, which can not only visualize the energy consumption of an app instantly online, but can also tell what happened behind each energy hotspot on the energy curve. E-Spector relies on static analysis and app instrumentation to collect the activities from an app execution in real time. It then presents the activities on an instant energy curve, so that the user can easily tell what happened behind each energy spike. Experimental results show that the energy estimation error of E-Spector is less than 10% and its overhead on energy consumption is about 4%. We also present case studies to demonstrate the applicability and effectiveness of E-Spector in energy monitoring, analysis, and bug inspection.

Citations: 5

QuARK: Quality-configurable approximate STT-MRAM cache by fine-grained tuning of reliability-energy knobs

2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009198

Amir Mahdi Hosseini Monazzah, Majid Namaki-Shoushtari, S. Miremadi, A. Rahmani, N. Dutt

Emerging STT-MRAM memories are promising alternatives to SRAM memories, addressing SRAM's low density and high static power consumption, but they impose high energy consumption for reliable read/write operations. However, absolute data integrity is not required for many approximate computing applications, allowing energy savings with minimal quality loss. This paper proposes QuARK, a hardware/software approach for trading the reliability of STT-MRAM caches for energy savings in the on-chip memory hierarchy of multi- and many-core systems running approximate applications. In contrast to SRAM-based cache-way-level actuators, QuARK utilizes fine-grained cache-line-level actuation knobs with different levels of reliability for individual read and write accesses, which are unique to STT-MRAM and suitable for systems running multiple applications with mixed accuracy sensitivity, thus avoiding inter-application actuation interference. Our experimental results with a set of recognition, mining and synthesis (RMS) benchmarks demonstrate up to 40% energy savings over a fully protected STT-MRAM cache, with negligible loss in the quality of the generated outputs.

Citations: 19
