
ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design: Latest Papers

A New Mismatch-Dependent Low Power Technique with Shadow Match-Line Voltage-Detecting Scheme for CAMs
Jianwei Zhang, Y. Ye, Bin-Da Liu
A new mismatch-dependent low-power technique is presented for content-addressable memories (CAMs). With a novel shadow match-line voltage-detecting scheme, the word circuits quickly self-disable their charging paths on a mismatch. Since the majority of CAM words mismatch on any given search, power is reduced significantly while search speed remains high. Simulation results show that the proposed 256-word × 144-bit ternary CAM, in a 0.13-μm 1.2-V CMOS process, achieves 0.51 fJ/bit/search for the word circuit with a search time under 900 ps. This corresponds to a 77% energy-delay-product (EDP) reduction compared to the speed-optimized current-saving scheme.
DOI: https://doi.org/10.1145/1165573.1165605 (published 2006-10-04)
Citations: 8
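The figures quoted in the CAM abstract above can be combined into a rough energy-delay product. The following is a minimal arithmetic sketch, not the authors' evaluation method. It assumes the 900 ps upper bound stands in for the delay, that the per-bit energy covers the word circuit only, and that every bit of every word is searched once. All names are illustrative.

```python
# Rough EDP arithmetic from the abstract's figures (assumptions noted inline).

ENERGY_PER_BIT_FJ = 0.51   # fJ/bit/search, word circuit only (from the abstract)
SEARCH_TIME_PS = 900.0     # abstract says "less than 900 ps"; upper bound assumed
WORDS = 256
BITS_PER_WORD = 144
EDP_REDUCTION = 0.77       # reported reduction vs. the current-saving scheme


def energy_delay_product(energy_fj: float, delay_ps: float) -> float:
    """EDP is simply energy multiplied by delay (here in fJ*ps)."""
    return energy_fj * delay_ps


def implied_baseline_edp(edp: float, reduction: float) -> float:
    """Baseline EDP implied by a fractional reduction: edp = baseline * (1 - reduction)."""
    return edp / (1.0 - reduction)


# Per-bit EDP for one search.
per_bit_edp = energy_delay_product(ENERGY_PER_BIT_FJ, SEARCH_TIME_PS)

# Word-circuit energy for one search across the whole array (assumed: all bits active).
array_energy_fj = ENERGY_PER_BIT_FJ * WORDS * BITS_PER_WORD

# What the reference scheme's per-bit EDP would have to be for a 77% reduction.
baseline_edp = implied_baseline_edp(per_bit_edp, EDP_REDUCTION)

if __name__ == "__main__":
    print(f"per-bit EDP: {per_bit_edp:.1f} fJ*ps")
    print(f"array energy per search: {array_energy_fj / 1000:.2f} pJ")
    print(f"implied baseline per-bit EDP: {baseline_edp:.1f} fJ*ps")
```

Under these assumptions the per-bit EDP is about 459 fJ·ps and the array's word circuits draw roughly 18.8 pJ per search. Because the real search time is below 900 ps, the actual EDP would be somewhat lower.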
Stall Cycle Redistribution in a Transparent Fetch Pipeline
Eric L. Hill, Mikko H. Lipasti
Power and power density are now primary design constraints for modern high-performance microprocessors. Up to 70% of the dynamic power consumed can be attributed to the clocking system. As a consequence, clock gating has emerged as both a necessary and an efficient method to significantly reduce dynamic power. Transparent pipelining, a recently proposed fine-grain clock gating technique, has the potential to reduce clock power well beyond conventional pipestage-level clock gating. Previous studies of transparent pipelining have focused on the circuit and implementation-related issues of this approach, while neglecting the broader microarchitectural implications. This paper aims to quantify the microarchitectural opportunities afforded by the use of transparent pipelining in a processor's fetch pipeline. We develop a technique, based on stall cycle redistribution, designed to improve the performance of transparent pipelining on fetch and other high-utilization pipelines. We show that stall cycle redistribution can dramatically reduce the clocking overhead of an aggressively pipelined cell-like microprocessor.
DOI: https://doi.org/10.1145/1165573.1165583 (published 2006-10-04)
Citations: 3
Robust Level Converter Design for Sub-threshold Logic
I. Chang, Jae-Joon Kim, K. Roy
The large supply voltage difference between sub-threshold core logic and I/O makes it extremely challenging to convert signals from the core circuit to the I/O circuit. In this paper, we propose two novel circuits, a clock synchronizer and a reduced-swing inverter, for designing dynamic and static level converters for sub-threshold logic. Circuit simulations show that our level converters work at frequencies above 500 kHz between 20 °C and 40 °C with a supply voltage of 0.25 V.
DOI: https://doi.org/10.1145/1165573.1165579 (published 2006-10-04)
Citations: 31
A Low Power Viterbi Decoder Implementation using Scarce State Transition and Path Pruning Scheme for High Throughput Wireless Applications
Jie Jin, C. Tsui
This paper presents a low-power Viterbi decoder design based on scarce state transition (SST). We propose an approach that seamlessly integrates path pruning techniques with SST decoding to reduce the average add-compare-select (ACS) computation. The scheme has very low overhead and is practical to implement. We also propose an uneven-partitioned memory architecture for the survivor memory unit to reduce memory access power during the trace-back operation. The proposed decoder is implemented in the SMIC 0.18-μm CMOS process. Simulation results show that a significant reduction in power consumption can be achieved for high-throughput wireless systems such as MB-OFDM ultra-wide-band applications.
DOI: https://doi.org/10.1145/1165573.1165673 (published 2006-10-04)
Citations: 17
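For readers unfamiliar with the add-compare-select (ACS) operation named in the Viterbi abstract above, here is a minimal sketch of a textbook ACS step for a single trellis state. It is a generic illustration, not the paper's SST or path-pruning scheme, which aims to reduce how often this step has to run. The function name and argument layout are assumptions made for the example.

```python
from typing import Sequence, Tuple


def acs(
    path_metrics: Sequence[int],
    predecessors: Sequence[int],
    branch_metrics: Sequence[int],
) -> Tuple[int, int]:
    """One add-compare-select step for a single trellis state.

    path_metrics   -- accumulated metric of every state at the previous stage
    predecessors   -- indices of the states that can transition into this state
    branch_metrics -- cost of each of those transitions (same order)

    Returns (new_path_metric, surviving_predecessor), minimizing the metric.
    Ties are resolved in favor of the lower predecessor index.
    """
    if len(predecessors) != len(branch_metrics):
        raise ValueError("one branch metric is required per predecessor")
    # Add: extend each incoming path by its branch metric.
    candidates = [
        (path_metrics[p] + bm, p) for p, bm in zip(predecessors, branch_metrics)
    ]
    # Compare and select: keep the cheapest extended path as the survivor.
    return min(candidates)


if __name__ == "__main__":
    previous = [3, 5, 2, 7]
    # Paths arrive from state 0 (branch cost 2) and state 2 (branch cost 4).
    print(acs(previous, [0, 2], [2, 4]))
```

The surviving predecessor index is what a survivor memory unit would record for later trace-back.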
Low Power Light-weight Embedded Systems
M. Sarrafzadeh, F. Dabiri, R. Jafari, T. Massey, A. Nahapetian
Light-weight embedded systems are gaining popularity because recent advances in fabrication have produced more powerful tiny processors with greater communication capabilities, which in turn pose various scientific challenges for researchers. Perhaps the most significant challenges are energy consumption and reliability, mainly because batteries are small. In this tutorial, we give a brief description of low-power, light-weight embedded systems, review several previously conducted power profiling studies, and present several research challenges that require low power consumption in embedded systems. For each challenge, we highlight how low-power designs may enhance the overall performance of the system. Finally, we present several techniques that minimize power consumption in such systems.
DOI: https://doi.org/10.1145/1165573.1165623 (published 2006-10-04)
Citations: 20
Process Variation Aware Cache Leakage Management
Ke Meng, R. Joseph
In a few technology generations, limitations of fabrication processes have made accurate design-time power estimates a daunting challenge. Static leakage current, which comprises a significant fraction of total power because of large on-chip caches, depends exponentially on widely varying physical parameters such as gate length, gate oxide thickness, and dopant ion concentration. In large structures like on-chip caches, this may mean that one portion of a cache consumes an order of magnitude more static power than equivalently sized regions. Under these conditions, egalitarian management of physical resources is clearly untenable. In this paper, we analyze the effects of within-die and die-to-die leakage variation for on-chip caches. We then propose way prioritization, a manufacturing-variation-aware scheme that minimizes cache leakage energy. Our results show that significant average power reductions are possible without undue hardware complexity or performance compromise.
DOI: https://doi.org/10.1145/1165573.1165636 (published 2006-10-04)
Citations: 76
Power Reduction in an H.264 Encoder Through Algorithmic and Logic Transformations
M. Koziri, G. Stamoulis, I. Katsavounidis
The H.264 video coding standard can achieve considerably higher coding efficiency than previous video coding standards. The keys to this high coding efficiency are the two prediction modes (intra and inter) provided by H.264. Unfortunately, these result in considerably higher encoder complexity, which adversely affects speed and power, both of which matter for the mobile multimedia applications targeted by the standard. It is therefore important to design architectures that minimize the speed and power overhead of the prediction modes. In this paper we present a new algorithm, and the logic transformations that enable it, that can replace the standard sum of absolute differences (SAD) approach in the two main prediction modes and provide a power-efficient hardware implementation without perceivable degradation in coding efficiency or video quality.
DOI: https://doi.org/10.1145/1165573.1165598 (published 2006-10-04)
Citations: 6
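The H.264 abstract above refers to the standard sum of absolute differences (SAD) cost that the proposed algorithm is meant to replace. As background, here is a minimal sketch of that baseline metric and of using it to pick the best-matching candidate block. It does not reproduce the paper's replacement algorithm, which the abstract does not describe. The helper names `sad` and `best_candidate` are illustrative.

```python
from typing import Sequence

Block = Sequence[Sequence[int]]  # rows of pixel values


def sad(current: Block, reference: Block) -> int:
    """Sum of absolute differences between two equally sized pixel blocks."""
    if len(current) != len(reference) or any(
        len(row_c) != len(row_r) for row_c, row_r in zip(current, reference)
    ):
        raise ValueError("blocks must have identical dimensions")
    return sum(
        abs(c - r)
        for row_c, row_r in zip(current, reference)
        for c, r in zip(row_c, row_r)
    )


def best_candidate(current: Block, candidates: Sequence[Block]) -> int:
    """Index of the candidate block with the lowest SAD (first one on ties)."""
    costs = [sad(current, candidate) for candidate in candidates]
    return costs.index(min(costs))


if __name__ == "__main__":
    cur = [[10, 12], [8, 9]]
    ref = [[11, 10], [8, 13]]
    print(sad(cur, ref))                    # |10-11| + |12-10| + |8-8| + |9-13|
    print(best_candidate(cur, [ref, cur]))  # the identical block has SAD 0
```

An encoder evaluates a cost like this for many candidate blocks per macroblock, which is why the abstract treats it as a target for power reduction.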
Selective Writeback: Exploiting Transient Values for Energy-Efficiency and Performance
D. Balkan, J. Sharkey, D. Ponomarev, K. Ghose
Today's superscalar microprocessors use large, heavily ported physical register files (RFs) to increase instruction throughput. The high complexity and power dissipation of such RFs stem mainly from the need to keep every result for a large number of cycles after it is generated. We observed that a significant fraction (about 45%) of the result values are delivered to their consumers via the bypass network (consumed "on-the-fly") and are never read out from the destination registers. In this paper, we first formulate conditions for identifying such transient values and describe their microarchitectural implementation; we then propose a technique to avoid writing such transient values back into the RF. With 64-entry integer and floating-point register files, our technique achieves an 11% performance improvement and a 29% reduction in RF energy consumption compared to the baseline machine with the same number of registers. Furthermore, for the same performance target, the selective writeback scheme results in a 38% reduction in RF energy consumption compared to the baseline machine.
DOI: https://doi.org/10.1145/1165573.1165584 (published 2006-10-04)
Citations: 5
A Pulsed Low-Voltage Swing Latch for Reduced Power Dissipation in High-Frequency Microprocessors
P. Lu, N. Cao, L. Sigal, P. Woltgens, R. Robertazzi, D. Heidel
We previously reported (Pong-Fei Lu et al., 2004) a low-swing latch (LSL) with a superior performance-power tradeoff compared to the conventional pass-gate master-slave latch. In this paper, hardware results are presented for the proposed LSL with pulsed clock waveforms. The motivation is to combine low-voltage swing with pulsed signals to further reduce overall system power in high-frequency microprocessors. We have designed a 65-bit accumulator loop experiment to mimic a microprocessor pipeline stage. The local clock buffer design features a mode switch to toggle between two-phase (c1/c2) master-slave clocking and one-phase pulsed (c2 only) clocking. Our data show that a 15-25% system power saving can be achieved in pulsed mode compared to non-pulsed mode. The power contribution of individual components is also presented.
DOI: https://doi.org/10.1145/1165573.1165593 (published 2006-10-04)
Citations: 1
Energy/Power Breakdown of Pipelined Nanometer Caches (90nm/65nm/45nm/32nm)
Samuel Rodríguez, B. Jacob
As transistors continue to scale down into the nanometer regime, device leakage currents are becoming the dominant cause of power dissipation in nanometer caches, making it essential to model these leakage effects properly. Moreover, typical microprocessor caches are pipelined to keep up with the speed of the processor, and the effects of pipelining overhead need to be properly accounted for. In this paper, we present a detailed study of pipelined nanometer caches, with energy/power dissipation breakdowns showing where and how power is dissipated within a nanometer cache. We explore a three-dimensional pipelined cache design space that includes cache size (16 kB to 512 kB), cache associativity (direct-mapped to 16-way) and process technology (90 nm, 65 nm, 45 nm and 32 nm). Among our findings, we show that cache bitline leakage is increasingly becoming the dominant cause of power dissipation in nanometer technology nodes. We show that subthreshold leakage is the main cause of static power dissipation, and that gate leakage is, surprisingly, not a significant contributor to total cache power, even for 32 nm caches. We also show that accounting for cache pipelining overhead is necessary, as power dissipated by the pipeline elements is a significant part of cache power.
DOI: https://doi.org/10.1145/1165573.1165581 (published 2006-10-04)
Citations: 71