
ACM Great Lakes Symposium on VLSI: Latest Papers

A new methodology for reduced cost of resilience
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591600
A. Kahng, Seokhyeong Kang, Jiajia Li
Resilient design techniques are used to (i) ensure correct operation under dynamic variations and (ii) improve design performance (e.g., through timing speculation). However, existing resilient design techniques incur significant overheads (e.g., 17% and 15% energy penalties due to throughput degradation and additional circuits). For instance, resilient designs require additional circuits to detect and correct timing errors. Further, when an error occurs, the additional cycles needed to restore a previous correct state degrade throughput, which diminishes the performance benefit of using resilient designs. In this work, we propose a methodology for resilient design implementation that minimizes the costs of resilience in terms of power, area, and throughput degradation. Our methodology uses two levers: selective-endpoint optimization (i.e., sensitivity-based margin insertion) and clock skew optimization. We integrate the two techniques in an iterative optimization flow that accounts for toggle rate information and the tradeoff between the cost of resilience and the margin on combinational paths. Our proposed flow achieves energy reductions of up to 19% and 21% compared to a conventional design (with only margin used to attain robustness) and a brute-force implementation, respectively. These benefits increase in the context of an adaptive voltage scaling strategy.
Citations: 8
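The throughput penalty mentioned in the resilience abstract above can be illustrated with simple arithmetic. The Python sketch below is not taken from the paper. It assumes a hypothetical model in which each cycle has an independent error probability and every error costs a fixed number of recovery cycles. The names `error_rate`, `recovery_cycles`, and `power_overhead` are illustrative.

```python
def relative_throughput(error_rate: float, recovery_cycles: int) -> float:
    """Throughput relative to an error-free design.

    Assumed model (not from the paper): each cycle has probability
    `error_rate` of a timing error, and each error costs
    `recovery_cycles` extra cycles to restore a correct state.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be in [0, 1]")
    if recovery_cycles < 0:
        raise ValueError("recovery_cycles must be non-negative")
    average_cycles_per_operation = 1.0 + error_rate * recovery_cycles
    return 1.0 / average_cycles_per_operation


def relative_energy(error_rate: float, recovery_cycles: int,
                    power_overhead: float) -> float:
    """Energy per operation relative to a design without resilience circuits.

    Assumes energy = power x time at a fixed voltage and frequency, with
    `power_overhead` as the fractional power cost of the detection and
    correction circuits (a hypothetical parameter).
    """
    time_factor = 1.0 / relative_throughput(error_rate, recovery_cycles)
    return (1.0 + power_overhead) * time_factor
```

Under this model, lowering the error rate (for example by adding margin on selected endpoints) trades recovery cycles against the cost of that margin, which is the kind of tradeoff the abstract describes.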
A feasibility study on robust programmable delay element design based on neuron-MOS mechanism
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591591
Renyuan Zhang, M. Kaneko
This work investigates the feasibility of programmable delay element (PDE) design based on the Neuron-MOS mechanism. By applying capacitor coupling, the charging/discharging current of a clock buffer can be digitally programmed to generate various switching delays without static power consumption. No additional transistor is introduced into the charging/discharging path, which reduces the performance fluctuation caused by process variations in MOS transistors. Circuit simulation results show that the delay change of the proposed PDE is less than one third of that of conventional PDE circuits. To reduce temperature sensitivity, a second Neuron-MOS-based PDE circuit is also suggested, which employs a temperature-insensitive reference current generator. This type of PDE circuit achieves a delay change within 0.1% when the temperature fluctuates from 25 to 75 degrees. In general, both types of suggested PDE circuits achieve better or fair performance in terms of robustness, power consumption, and delay range.
Citations: 2
Create, then innovate
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2597171
Gene A. Frantz
Innovation seems to be a measure of success for most technologists. We are proud of our innovations, which have significantly contributed to society. But we seem not to speak much about creativity and how it relates to innovation. Is creativity part of the innovation process? Or is the innovation process the result of creativity? This talk will suggest an interesting set of definitions that put these two concepts into perspective. Examples will be shown that support these proposed definitions.
Citations: 0
An area efficient low power high speed S-Box implementation using power-gated PLA
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591575
Ho Joon Lee, Yong-Bin Kim
The Advanced Encryption Standard (AES) is one of the most common symmetric encryption algorithms. The hardware complexity of AES is dominated by the AES substitution box (S-Box), which is considered one of the most complicated and costly parts of the system because it is the only non-linear structure. This paper presents a low-power design of the Rijndael S-Box for the SubBytes transformation, using power-gating and PLA design techniques to reduce area and leakage power during standby mode. The proposed design is implemented in a 110nm standard CMOS process with a 1.2V power supply. The proposed design reduces the total leakage power and the total transistor count to 10% and 50% of the conventional design, respectively, while improving speed by a factor of ten.
Citations: 0
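For readers unfamiliar with the S-Box discussed in the abstract above, the following Python sketch computes the standard Rijndael S-Box table from its mathematical definition: a multiplicative inverse in GF(2^8) followed by an affine transformation. It only illustrates the 8-bit to 8-bit non-linear mapping that any hardware implementation, PLA-based or otherwise, has to realize. It does not model the paper's circuit, and the helper names are illustrative.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B  # reduce by the AES polynomial
        b >>= 1
    return result


def gf_inverse(a: int) -> int:
    """Multiplicative inverse in GF(2^8); 0 maps to 0 by AES convention."""
    if a == 0:
        return 0
    # The multiplicative group has order 255, so a^254 is the inverse of a.
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x: int, shift: int) -> int:
    """Rotate an 8-bit value left by `shift` bits."""
    return ((x << shift) | (x >> (8 - shift))) & 0xFF


def sbox_entry(byte: int) -> int:
    """S-Box output for one input byte: field inverse, then affine map."""
    inv = gf_inverse(byte)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


# The full 256-entry table; a table-based design stores or decodes exactly this.
SBOX = [sbox_entry(b) for b in range(256)]
```

The resulting table is a permutation of the 256 byte values, so the mapping is invertible, which decryption relies on.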
A qualitative simulation approach for verifying PLL locking property
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591593
Ibtissem Seghaier, Henda Aridhi, M. Zaki, S. Tahar
Simulation cannot give full coverage of Phase Locked Loop (PLL) behavior in the presence of process variation, jitter, and varying initial conditions. Qualitative simulation is an attractive method that computes behavior envelopes for dynamical systems over continuous ranges of their parameters. Therefore, this method can be employed to verify the PLL locking property, given a model that encompasses the PLL's imperfections. Extended Systems of Recurrence Equations (ESREs) offer a unified modeling language for analog and digital PLL components. In this paper, an ESRE model is created for both PLLs and their imperfections. Then, a modified qualitative simulation algorithm is used to guarantee that the PLL locking time is sound for every possible initial condition and parameter value. We used our approach to analyze a Charge Pump PLL for a $0.18\,\mu\mathrm{m}$ fabrication process in the presence of jitter and initial-condition uncertainties. The obtained results show an improvement in simulation coverage by computing the minimum locking time and predicting a non-locking case that statistical simulation techniques fail to detect.
Citations: 8
TSV power supply array electromigration lifetime analysis in 3D ICs
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591567
Qiaosha Zou, Zhang Tao, Cong Xu, Yuan Xie
Electromigration (EM) can cause severe reliability issues in contemporary integrated circuits. For the emerging three-dimensional integrated circuits (3D ICs), the introduction of through-silicon vias (TSVs) as the vertical signal carrier complicates the electromigration analysis. In particular, an accurate EM analysis of TSV arrays used in the power supply network is critical, since the large current going through those TSVs can accelerate their degradation. In this work, we propose a novel EM analysis framework that focuses on TSV arrays in the power supply network under uneven current distribution. The impacts of various design factors on the EM lifetime are discussed in detail. Our results reveal that the predicted TSV array lifetime is largely biased without proper current distribution analysis, resulting in unexpected early failure.
Citations: 0
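The effect of uneven current distribution on the predicted lifetime, as described in the TSV abstract above, can be illustrated with Black's equation, a widely used empirical electromigration model. The abstract does not state which lifetime model the authors use, so the Python sketch below is an assumption made for illustration only. The parameter values (prefactor, current exponent, activation energy) and the per-TSV currents are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mttf(current_density: float, temperature_k: float,
               prefactor: float = 1.0, exponent: float = 2.0,
               activation_energy_ev: float = 0.9) -> float:
    """Mean time to failure from Black's equation (arbitrary time units).

    MTTF = A * J^(-n) * exp(Ea / (k * T)). The default A, n, and Ea are
    placeholder values, not figures from the paper.
    """
    return (prefactor * current_density ** (-exponent)
            * math.exp(activation_energy_ev / (BOLTZMANN_EV_PER_K * temperature_k)))


def first_failure_lifetime(tsv_currents, tsv_area: float,
                           temperature_k: float) -> float:
    """Lifetime of the most stressed TSV in an array.

    Simplifying assumption: the array is considered failed when its
    first TSV fails, and currents are not redistributed afterwards.
    """
    return min(black_mttf(current / tsv_area, temperature_k)
               for current in tsv_currents)


# Same total current (4.0) through four TSVs, spread evenly or unevenly.
uniform_currents = [1.0, 1.0, 1.0, 1.0]
uneven_currents = [1.6, 1.0, 0.8, 0.6]

lifetime_ratio = (first_failure_lifetime(uneven_currents, 1.0, 350.0)
                  / first_failure_lifetime(uniform_currents, 1.0, 350.0))
```

With a current exponent of 2, the most loaded TSV carries 1.6 times the average current and the first-failure lifetime drops to about 39% of the uniform-current estimate, which shows how an analysis that assumes even distribution can overestimate lifetime.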
A TSV-cross-link-based approach to 3D-clock network synthesis for improved robustness
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591584
Rickard Ewetz, A. Udupa, G. Subbarayan, Cheng-Kok Koh
To obtain high yield for 3D ICs, random open defects, process variations, and thermally induced stress are key issues that must be addressed when synthesizing 3D clock networks. Current research on 3D clock synthesis often focuses on the construction and optimization of a 3D clock tree topology. Moreover, extra circuitry has been proposed to enable pre-bond testing and substitution of through-silicon vias (TSVs) with random open defects. However, tree structures inherently have limited robustness to variations and may suffer failures arising from defects and/or process variations. To counter such problems, we propose to use TSVs to add redundancy in a 3D clock network. The proposed 3D network would have a complete 2D clock network on each die, facilitating pre-bond testing. Also, cross links would be inserted within each die using wires and across dies using TSVs to improve timing robustness within each die and across dies, respectively. Moreover, clock buffers are placed outside of zones with high TSV-induced stress that could influence carrier mobility. Experimental results show that the proposed 3D clock networks have no failures due to random open defects and, on average, have 53% lower skew than 3D tree structures.
Citations: 2
A multi-stage leakage aware resource management technique for reconfigurable architectures
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591526
Pham Nam Khanh, Ashutosh Kumar Singh, Akash Kumar
Shrinking transistor sizes have enabled the integration of more and more logic elements into FPGA chips, leading to higher computing power. However, this also raises serious concerns about leakage power dissipation in FPGA devices. One of the major causes of leakage power dissipation in FPGAs is the use of prefetching to minimize the reconfiguration overhead (delay) in Partially Reconfigurable (PR) FPGAs. This technique creates delays between the reconfiguration and execution parts of a task, which may lead to up to 44% of FPGA leakage power, since the SRAM cells containing reconfiguration information cannot be powered down. In this work, a resource management approach containing scheduling, placement, and post-placement stages is proposed to address this issue. In the scheduling stage, a leakage-aware cost function is derived to cope with the leakage power. The placement stage uses a cost function that allows designers to choose a trade-off between performance and leakage savings. The post-placement stage employs a heuristic approach and shows further improvements. Experiments show that our approach can achieve large leakage savings for both synthetic and real-life applications with an acceptable deadline extension. Furthermore, different variants of the proposed approach can reduce leakage power by 40-65% compared to a performance-driven approach and by 15-43% compared to state-of-the-art approaches.
Citations: 3
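The leakage mechanism described in the abstract above, where prefetched configurations sit idle between reconfiguration and execution, can be sketched as a simple cost calculation. The paper's actual cost functions are not given in the abstract, so the Python example below is a hypothetical illustration. The `ScheduledTask` fields, the column-based area unit, and the `leakage_per_column` parameter are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledTask:
    name: str
    reconfig_end: float  # time at which reconfiguration finishes
    exec_start: float    # time at which execution begins
    columns: int         # reconfigurable columns occupied (hypothetical unit)


def idle_leakage_cost(tasks, leakage_per_column: float = 1.0) -> float:
    """Leakage energy spent while configured regions wait to execute.

    Assumed model: a configured region leaks in proportion to its size
    for the whole gap between the end of reconfiguration and the start
    of execution, because its SRAM cells cannot be powered down.
    """
    cost = 0.0
    for task in tasks:
        gap = task.exec_start - task.reconfig_end
        if gap < 0:
            raise ValueError(f"{task.name}: execution starts before reconfiguration ends")
        cost += gap * task.columns * leakage_per_column
    return cost


# Aggressive prefetching leaves both tasks configured but idle for 4 time units.
prefetched = [
    ScheduledTask("t1", reconfig_end=2.0, exec_start=6.0, columns=3),
    ScheduledTask("t2", reconfig_end=5.0, exec_start=9.0, columns=2),
]
# Reconfiguring just before execution removes the idle gap.
just_in_time = [
    ScheduledTask("t1", reconfig_end=6.0, exec_start=6.0, columns=3),
    ScheduledTask("t2", reconfig_end=9.0, exec_start=9.0, columns=2),
]
```

A scheduler could add a term like this to its objective so that it weighs the delay hidden by prefetching against the leakage accumulated while a configuration waits.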
A task-oriented vision system
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591602
Yang Xiao, K. Irick, J. Sampson, N. Vijaykrishnan, Chuanjun Zhang
Recently, biologically inspired vision systems have been the focus of intense research effort to emulate the high energy efficiency, performance, and robustness of mammalian vision systems. However, previous vision accelerators have focused only on speeding up computationally intense portions of the system, without exploiting effects seen in the human brain that demonstrate the influence of the task on the vision mechanism. In this paper, we propose a task-oriented two-level vision system composed of Saliency and SURF. To the best of our knowledge, our design is the first embedded system that utilizes task influence in the computation of visual attention and recognition. We show that the new system can achieve up to a 12.75% accuracy improvement while saving 25% of the computation work.
Citations: 0
Transient analysis of gate inside junctionless transistor (GI-JLT)
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591557
Pankaj Kumar, P. Kondekar, Sangeeta Singh
In this letter, the transient performance of an n-type Gate Inside JunctionLess Transistor (GI-JLT) is evaluated. 3-D Bohm Quantum Potential (BQP) transport device simulation is used to evaluate its delay and power dissipation performance. GI-JLT shows better device performance characteristics than GAA-JLT for low-power and high-frequency applications because of its stronger gate electrostatic control over device operation.
Citations: 1