
2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED): Latest Papers

Bit-width reduction and customized register for low cost convolutional neural network accelerator
Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009164
Kyungrak Choi, Woong Choi, Kyungho Shin, Jongsun Park
This paper presents a low-area, energy-efficient hardware accelerator for deep convolutional neural networks (CNNs). Building on a multiply-accumulate (MAC) based architecture, three design techniques are proposed to reduce the hardware cost of the convolutional computations. First, to reduce the computational bit-width of convolutions, an adaptive bit-width reduction scheme based on a differential input method is proposed. This approach reduces the operation bit-width by 37% with almost negligible degradation in CNN accuracy. Second, it has been found that adopting a bi-directional filtering window in the CNN accelerator can considerably reduce the energy for data movement, with a much smaller number of memory accesses. To expedite the bi-directional filtering operations, we also propose a bidirectional first-input-first-output (bi-FIFO). Laid out in the manner of SRAM bit-cells, the proposed bi-FIFO facilitates fast data re-distribution with area and energy efficiency. To verify the effectiveness of the proposed techniques, an AlexNet accelerator has been designed. The numerical results show that the proposed adaptive bit-width reduction scheme achieves 25.9% area savings and 47.3% energy savings. The bi-FIFO based accelerator also achieves a 33% improvement in processing time.
Citations: 5
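The bit-width argument in the CNN accelerator abstract above can be illustrated with a small sketch. The abstract does not say how the differential inputs are formed, so this example assumes the simplest reading: differences between adjacent input values. The helper names and the sample row are hypothetical, and the code only compares the word widths needed for raw values and for their differences; it is not the authors' scheme.

```python
from typing import List


def signed_bits(value: int) -> int:
    """Minimum two's-complement width that can hold `value`."""
    if value >= 0:
        return value.bit_length() + 1
    return (-value - 1).bit_length() + 1


def required_width(values: List[int]) -> int:
    """Width needed to represent every element of `values`."""
    return max(signed_bits(v) for v in values)


def differences(values: List[int]) -> List[int]:
    """Adjacent differences (assumed form of the 'differential input')."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical row of neighbouring activations with strong locality.
row = [120, 122, 121, 125, 124]
raw_width = required_width(row)                 # 8 bits for the raw values
diff_width = required_width(differences(row))   # 4 bits for the differences
```

When neighbouring values are close, the differences fit in far fewer bits than the raw values, which is the kind of saving the abstract attributes to its differential input method.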
AXSERBUS: A quality-configurable approximate serial bus for energy-efficient sensing
Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009172
Younghyun Kim, Setareh Behroozi, V. Raghunathan, A. Raghunathan
Mobile, wearable, and implantable devices integrate an increasing number and variety of sensors such as microphones, image sensors, and accelerometers. These devices spend substantial amounts of time reading the sensors within them, thereby incurring significant energy dissipation over off-chip serial interconnects. This paper proposes AXSERBUS, a quality-configurable approximate serial bus that exploits the locality of sensory data and the error resiliency of sensing applications to reduce energy dissipation. AXSERBUS significantly reduces signal transitions by encoding the differentials of sensory data in one of three encoding modes, depending on the magnitude of the differential: very small differentials are zeroed out, incurring no energy dissipation; intermediate differentials are encoded using special low-transition-count patterns; and for large differentials, the absolute value (not the differential) of the data is transmitted. Compared to previous schemes, the proposed multi-level encoding results in more data being encoded as low-energy patterns. In addition, in the intermediate differential encoding mode, the differentials are encoded in an approximate manner, and the approximation bounds are proportional to the magnitude of the differentials. Since small differentials are more frequent than large differentials in sensory data, the proposed encoding scheme also minimizes quality degradation. We demonstrate that AXSERBUS achieves improved energy vs. quality tradeoffs compared to previous schemes. In the context of an optical character recognition (OCR) application, AXSERBUS achieves a 79.4% reduction in dynamic power dissipation while maintaining accuracy above 95%.
Citations: 18
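The three encoding modes described in the AXSERBUS abstract above can be sketched roughly as follows. The thresholds `SMALL_THRESHOLD` and `LARGE_THRESHOLD` are hypothetical, since the abstract gives no values. The sketch also sends intermediate differentials exactly, whereas the abstract says they are approximated and mapped to low-transition-count bit patterns; both of those steps are omitted here.

```python
from typing import List, Tuple

# Hypothetical thresholds; the abstract does not state the real ones.
SMALL_THRESHOLD = 2
LARGE_THRESHOLD = 32

Symbol = Tuple[str, int]


def classify(prev: int, sample: int) -> Symbol:
    """Choose one of the three modes from the size of the differential."""
    diff = sample - prev
    if abs(diff) <= SMALL_THRESHOLD:
        return ("zero", 0)              # differential zeroed out
    if abs(diff) <= LARGE_THRESHOLD:
        return ("differential", diff)   # sent exactly in this sketch
    return ("absolute", sample)         # large jump: send the value itself


def encode(samples: List[int]) -> List[Symbol]:
    """Encode a stream against the value the receiver currently holds."""
    encoded: List[Symbol] = []
    prev = 0
    for sample in samples:
        mode, payload = classify(prev, sample)
        encoded.append((mode, payload))
        if mode == "differential":
            prev += payload
        elif mode == "absolute":
            prev = payload
        # In "zero" mode the receiver keeps its previous value.
    return encoded


def decode(encoded: List[Symbol]) -> List[int]:
    """Rebuild the (approximate) sample stream on the receiver side."""
    decoded: List[int] = []
    prev = 0
    for mode, payload in encoded:
        if mode == "differential":
            prev += payload
        elif mode == "absolute":
            prev = payload
        decoded.append(prev)
    return decoded


samples = [100, 101, 110, 111, 200]
symbols = encode(samples)
restored = decode(symbols)
```

Because the encoder tracks the value held by the receiver rather than the previous true sample, the reconstruction error in this sketch never exceeds `SMALL_THRESHOLD`.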
Invited paper: Low power requirements and side-channel protection of encryption engines: Challenges and opportunities
Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009186
Monodeep Kar, Arvind Singh, S. Mathew, Anand Rajan, V. De, S. Mukhopadhyay
Power attacks are a critical challenge to the security of encryption engines. Countermeasures to side-channel attacks often come at high power, area, or performance overhead. Therefore, the design of side-channel secure encryption engines is a critical challenge for power- and resource-constrained platforms. This paper argues that although low-power requirements pose a critical challenge for side-channel security, circuit techniques traditionally developed for power management also present new opportunities for side-channel resistance. As a case study, we show the feasibility of using an integrated voltage regulator, normally used for efficient power management, to increase the side-channel resistance of AES engines.
Citations: 4
Spin-torque sensors with differential signaling for fast and energy efficient global interconnects
Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009155
Z. Azim, K. Roy
We propose a hybrid global interconnect that combines Spin-Torque (ST) sensors with differential amplifiers to greatly reduce the overall power consumption while minimizing the delay along the line. Recently proposed ST-sensor based interconnects show significant energy efficiency compared to conventional full-swing CMOS interconnects. However, the latency of ST-sensor interconnects can be rather high due to inefficient signal regeneration along the line. As a solution, we propose using differential amplifiers as repeaters, with an ST-sensor as the receiver, to reduce the interconnect delay. Moreover, the introduction of differential signaling greatly increases the robustness of the design against noise and variations. Our simulation results indicate that for a 10 mm line in 45 nm CMOS technology, the energy consumption of the hybrid ST-sensor interconnect is ∼5× lower than that of a full-swing CMOS interconnect operating at similar speed. Moreover, the energy consumption is ∼2× lower than that of a low-swing CMOS interconnect, in addition to a significant improvement in latency.
Citations: 0
A simple yet efficient accuracy configurable adder design
Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009206
Wenbin Xu, S. Sapatnekar, Jiang Hu
Approximate computing is a promising approach for low power IC design and has recently received considerable research attention. To accommodate dynamic levels of approximation, a few accuracy configurable adder designs have been developed in the past. However, these designs tend to incur large area overheads because they rely on either redundant computing or complicated carry prediction. Some of these designs include error detection and correction circuitry, which further increases area. In this work, we investigate a simple accuracy configurable adder design that contains no redundancy or error detection/correction circuitry and uses very simple carry prediction. Simulation results show that our design dominates the latest previous work on the accuracy-delay-power tradeoff while using 39% less area. Moreover, we propose a delay-adaptive self-configuration technique to further improve the accuracy-delay-power tradeoff.
Citations: 56
Battery assignment and scheduling for drone delivery businesses
Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009165
Sangyoung Park, Licong Zhang, S. Chakraborty
Recent advances in battery and drone technologies have opened up possibilities for the commercial use of drones. Private companies are looking into using drones for commercial deliveries from the legal, technical, and economic perspectives. Nevertheless, the battery management perspective of such businesses has not yet been thoroughly investigated. In this paper, we identify that battery management in such applications has a major impact on costs, and we formulate an optimization problem to reduce the aging of batteries. We identify two sub-problems, battery assignment and battery scheduling, and derive a solution that minimizes the aging of the batteries. We show that the formulation makes it possible to exploit the trade-off between the packet waiting time and the battery purchasing cost. The experimental results show that the proposed method reduces the electricity and battery purchasing cost by 25%, and the average packet waiting time by more than 50%.
Citations: 30
Low design overhead timing error correction scheme for elastic clock methodology
Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009203
Sungju Ryu, Jongeun Koo, Jae-Joon Kim
The elastic clock scheme is a robust design methodology that ensures timing closure under PVT variation using locally generated clocks and a handshaking protocol. However, timing errors can still occur due to delay mismatch between the data-path and the delay replica. In this paper, we propose a low design overhead timing error correction scheme tailored to the elastic clock. In the proposed scheme, a timing error can be corrected within a cycle using clock stretching. The proposed scheme shows 40.3× and 4.6× reductions in timing margin, with 9.1% and 9.0% area overhead, over the synchronous baseline and the elastic clock design, respectively.
Citations: 2
E-Spector: Online energy inspection for Android applications
Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009207
Chengke Wang, Yao Guo, Peng Shen, Xiangqun Chen
Energy consumption is one of the most important aspects of mobile apps. During energy testing, it is important for developers to understand not only the energy consumption rate of an app, but also why energy is consumed. However, existing energy testing tools focus on the accuracy of energy estimation and typically do not explain why and how exactly the energy has been consumed. This paper presents E-Spector, an online energy inspection method for Android apps, which can not only visualize the energy consumption of an app instantly online, but can also tell what happened behind each energy hotspot on the energy curve. E-Spector relies on static analysis and app instrumentation to collect the activities from an app execution in real time. It then presents the activities on an instant energy curve, so that the user can easily tell what happened behind each energy spike. Experimental results show that the energy estimation error of E-Spector is less than 10% and its overhead on energy consumption is about 4%. We also present case studies that demonstrate the applicability and effectiveness of E-Spector in energy monitoring, analysis, and bug inspection.
Citations: 5
Efficient query processing in crossbar memory
Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009204
M. Imani, Saransh Gupta, Atl Arredondo, T. Simunic
Today's computing systems use a huge amount of energy and time to process basic database queries. A large part of it is spent on data movement between the memory and the processing cores, owing to the limited cache capacity and memory bandwidth of traditional computers. In this paper, we propose a non-volatile memory-based query accelerator, called NVQuery, which performs several basic query functions in memory, including aggregation, prediction, and bit-wise operations, as well as exact and nearest distance search queries. NVQuery is implemented on a content addressable memory (CAM) and exploits the analog characteristics of non-volatile memory to enable in-memory processing. To implement nearest distance search in memory, we introduce a novel bitline driving scheme that gives weights to the indices of the bits during the search operation. Our experimental evaluation shows that NVQuery can provide a 49.3× performance speedup and 32.9× energy savings compared to running the same query on a traditional processor. In addition, compared to state-of-the-art query accelerators, NVQuery can achieve a 26.2× energy-delay product improvement while providing similar accuracy.
Citations: 22
QuARK: Quality-configurable approximate STT-MRAM cache by fine-grained tuning of reliability-energy knobs
Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009198
Amir Mahdi Hosseini Monazzah, Majid Namaki-Shoushtari, S. Miremadi, A. Rahmani, N. Dutt
Emerging STT-MRAM is a promising alternative to SRAM that addresses SRAM's low density and high static power consumption, but it requires high energy for reliable read/write operations. Many approximate computing applications, however, do not require absolute data integrity, which allows energy savings with minimal quality loss. This paper proposes QuARK, a hardware/software approach that trades the reliability of STT-MRAM caches for energy savings in the on-chip memory hierarchy of multi- and many-core systems running approximate applications. In contrast to SRAM-based cache-way-level actuators, QuARK uses fine-grained cache-line-level actuation knobs that offer different levels of reliability for individual read and write accesses. These knobs are unique to STT-MRAM and suit systems running multiple applications with mixed accuracy sensitivity, thus avoiding inter-application actuation interference. Experimental results on a set of recognition, mining and synthesis (RMS) benchmarks demonstrate up to 40% energy savings over a fully protected STT-MRAM cache, with negligible loss in the quality of the generated outputs.
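The cache-line-level knob idea in the abstract above can be pictured with a toy model. This is a minimal sketch under stated assumptions, not the QuARK implementation: the level names, the relative energy values, and the `approximable` flag are all hypothetical and chosen only for illustration; none of them come from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """Hypothetical reliability levels; the relative energy values are made up."""
    FULL = 1.0      # fully protected access (baseline)
    RELAXED = 0.7   # illustrative: less energy, some error risk
    MINIMAL = 0.5   # illustrative: least energy, highest error risk


@dataclass
class CacheLine:
    tag: int
    approximable: bool  # hypothetical flag: software marks error-tolerant data


def pick_level(line: CacheLine, tolerant_level: Level) -> Level:
    """Choose the knob per line: precise data always gets FULL protection."""
    return tolerant_level if line.approximable else Level.FULL


def relative_energy(lines: list[CacheLine], tolerant_level: Level) -> float:
    """Access energy relative to protecting every line fully (one access per line)."""
    spent = sum(pick_level(line, tolerant_level).value for line in lines)
    return spent / len(lines)
```

Because the level is chosen per line rather than per cache way, a precise line and an error-tolerant line can sit side by side without one application's setting affecting the other's data, which is the interference the abstract says the fine-grained knobs avoid.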
Citations: 19