ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design: Latest Papers

A Novel Power Optimization Technique for Ultra-Low Power RFICs
A. Shameli, P. Heydari
This paper presents a novel power optimization technique for ultra-low power (ULP) RFICs. A new figure of merit, the gm·fT-to-current ratio (gm·fT/ID), is defined for a MOS transistor; it accounts for both the unity-gain frequency and the current consumption. It is demonstrated both analytically and experimentally that gm·fT/ID reaches its maximum value in the moderate inversion region. Using the proposed method, a power-optimized common-gate low-noise amplifier (LNA) with an active load, operating at 950 MHz, has been designed and fabricated in a 0.18 µm CMOS process. Measurement results show a noise figure (NF) of 4.9 dB and a small-signal gain of 15.6 dB with a record-breaking power dissipation of only 100 µW.
DOI: https://doi.org/10.1145/1165573.1165639 (published 2006-10-04)
Citations: 48
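One way to read the figure of merit in the abstract above is to expand it with the common first-order approximation for the unity-gain frequency. This is a sketch under stated assumptions, not the paper's own derivation: the approximation for f_T and the symbol C_gg (total gate capacitance) are introduced here and do not appear in the abstract.

```latex
% Assumption (not from the abstract): f_T \approx g_m / (2\pi C_{gg}),
% where C_{gg} is the total gate capacitance.
\frac{g_m f_T}{I_D}
  \;\approx\; \frac{g_m}{I_D}\cdot\frac{g_m}{2\pi C_{gg}}
  \;=\; \left(\frac{g_m}{I_D}\right)^{2}\frac{I_D}{2\pi C_{gg}}
```

In this form the figure of merit is the product of the transconductance efficiency g_m/I_D, which is largest in weak inversion, and a speed term that grows with bias current. A maximum between the two extremes is consistent with the moderate-inversion optimum the abstract reports.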
Energy-Efficient Dynamic Instruction Scheduling Logic through Instruction Grouping
Hiroshi Sasaki, Masaaki Kondo, Hiroshi Nakamura
Dynamic instruction scheduling logic is quite complex and dissipates significant energy in microprocessors that support superscalar and out-of-order execution. We propose a novel microarchitectural technique to reduce the complexity and energy consumption of the dynamic instruction scheduling logic. The proposed method groups several instructions into a single issue unit, which reduces the required number of ports and the size of the structures for dispatch, wakeup, select, and issue. This paper describes the microarchitectural mechanisms and presents evaluation results for energy savings and performance. The results show that the proposed technique can greatly reduce energy with almost no performance degradation compared to conventional dynamic instruction scheduling logic.
DOI: https://doi.org/10.1145/1165573.1165585 (published 2006-10-04)
Citations: 21
Power Reduction of Multiple Disks Using Dynamic Cache Resizing and Speed Control
Le Cai, Yung-Hsiang Lu
This paper presents an energy-conservation method for multiple disks and their cache memory. Our method periodically resizes the cache memory and controls the disks' rotation speeds under performance constraints. The cache memory stores data from the disks for reuse. Enlarging the cache memory reduces disk accesses and disk utilization. This allows the disks to reduce their speeds and conserve energy, because a disk's power consumption is quadratic in its speed. However, the cache memory itself consumes power to retain data. Shrinking the cache memory saves memory power but increases disk accesses and degrades performance. Choosing proper cache sizes and rotation speeds can reduce the energy consumption of both memory and disks while keeping performance satisfactory. We model cache resizing and speed setting as an optimization problem with minimizing power consumption as the objective and limits on disk utilization as the constraints. We compare our method with methods that resize the cache based on request rates. The simulation results show that our method achieves better energy savings while limiting disk access latency.
DOI: https://doi.org/10.1145/1165573.1165617 (published 2006-10-04)
Citations: 19
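The optimization described in the abstract above can be illustrated with a toy brute-force search. This is a minimal sketch, not the authors' formulation: the miss-ratio curve, the service-time model, and every constant are hypothetical values chosen for illustration. Only the quadratic disk power model and the utilization constraint come from the abstract.

```python
from itertools import product

# All constants are made-up illustration values, not data from the paper.
MEM_POWER_PER_MB = 0.005   # W per MB of powered cache memory (assumed)
DISK_POWER_COEFF = 1.0e-7  # W per RPM^2, quadratic disk power model (assumed scale)
REQUEST_RATE = 600.0       # requests per second reaching the cache (assumed)
MAX_UTILIZATION = 0.7      # per-disk utilization limit, the performance constraint (assumed)


def miss_ratio(cache_mb):
    """Hypothetical miss-ratio curve: a larger cache misses less often."""
    return 1.0 / (1.0 + cache_mb / 64.0)


def service_time(rpm):
    """Hypothetical service time in seconds: half a rotation plus a fixed 2 ms."""
    return 0.5 * 60.0 / rpm + 0.002


def utilization(cache_mb, rpm, n_disks):
    """Fraction of time each disk is busy, with misses spread evenly over the disks."""
    return REQUEST_RATE * miss_ratio(cache_mb) * service_time(rpm) / n_disks


def total_power(cache_mb, rpm, n_disks):
    """Memory power grows linearly with cache size, disk power quadratically with speed."""
    return MEM_POWER_PER_MB * cache_mb + n_disks * DISK_POWER_COEFF * rpm ** 2


def best_setting(cache_sizes, speeds, n_disks=4):
    """Return (power, cache_mb, rpm) with minimum power among feasible settings, or None."""
    feasible = [
        (total_power(c, s, n_disks), c, s)
        for c, s in product(cache_sizes, speeds)
        if utilization(c, s, n_disks) <= MAX_UTILIZATION
    ]
    return min(feasible) if feasible else None
```

Calling `best_setting([64, 128, 256, 512], [3000, 6000, 9000, 12000])` searches sixteen settings. With these toy numbers the smallest cache violates the utilization limit at the lowest speed, so the search trades a somewhat larger cache for slower disks, which is the trade-off the abstract describes.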
Register File Caching for Energy Efficiency
Hui Zeng, K. Ghose
With the use of faster clocks and larger instruction windows in high-end superscalar processors, the physical register files (RFs) can no longer be accessed in a single cycle. To combat the consequent performance penalty, the RFs employ multiple levels of bypassing. Register file caching, which caches a small subset of the registers in a faster, smaller structure called the register file cache (RFC), has also been proposed as a remedy for this problem. We introduce a relatively simple RFC design that partitions the RFC into two separate components: a FIFO queue for holding register values that are used over a short duration following their writeback, and a small set-associative cache holding values that are likely to be used over a longer duration. Results written to the RFC are easily classified into these categories, and the classification bit is also used to predict the nature of the result for the next execution of the same instruction. We show that significant energy savings in accessing register operands, about 38% on average, occur when a 28-entry RFC is used together with a 96-entry RF with no additional bypassing, compared with a base-case design that has 128 registers with a 2-cycle access time and one additional level of bypassing. The performance drop relative to the base case is negligible (0.3%).
DOI: https://doi.org/10.1145/1165573.1165633 (published 2006-10-04)
Citations: 20
Variability-Aware Device Optimization under ION and Leakage Current Constraints
J. Jaffari, M. Anis
This paper presents a novel device optimization methodology that is constrained by the total leakage and the ON current of the device. The technique locates a maximum-yield rectangular cube in a three-dimensional feasible space spanned by the oxide thickness, halo peak doping, and halo characteristic length parameters. The center of this cube is taken as the maximum-yield design point, the one with the highest immunity against variations. Monte Carlo simulations show that the optimized bulk-MOS device for a 45 nm gate length satisfies the ON current and leakage constraints under a variability of up to 30% in the three parameters.
DOI: https://doi.org/10.1145/1165573.1165601 (published 2006-10-04)
Citations: 8
A Low Power Viterbi Decoder Implementation using Scarce State Transition and Path Pruning Scheme for High Throughput Wireless Applications
Jie Jin, C. Tsui
This paper presents a low-power Viterbi decoder design based on scarce state transition (SST). We propose an approach that seamlessly integrates path pruning techniques with SST decoding to reduce the average add-compare-select (ACS) computation. The scheme has very low overhead and is practical to implement. We also propose an unevenly partitioned memory architecture for the survivor memory unit to reduce the memory access power during the trace-back operation. The proposed decoder is implemented in the SMIC 0.18 µm CMOS process. Simulation results show that a significant reduction in power consumption can be achieved for high-throughput wireless systems such as MB-OFDM ultra-wideband applications.
DOI: https://doi.org/10.1145/1165573.1165673 (published 2006-10-04)
Citations: 17
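For readers unfamiliar with the add-compare-select (ACS) and trace-back steps named in the abstract above, the following is a minimal sketch of a plain hard-decision Viterbi decoder. It does not implement SST or path pruning, which are the paper's contributions. The rate-1/2, constraint-length-3 code with generators (7, 5) octal is an assumed example, and all function names are hypothetical.

```python
G = (0b111, 0b101)        # generator polynomials (7, 5) octal; assumed example code
K = 3                     # constraint length
N_STATES = 1 << (K - 1)   # 4 trellis states


def parity(x):
    return bin(x).count("1") & 1


def branch(state, bit):
    """Return (next_state, (out0, out1)) for input `bit` leaving `state`."""
    reg = (bit << (K - 1)) | state          # newest bit sits in the MSB
    out = tuple(parity(reg & g) for g in G)
    return reg >> 1, out


def encode(bits):
    """Convolutionally encode a bit list, starting from state 0."""
    state, out = 0, []
    for b in bits:
        state, sym = branch(state, b)
        out.extend(sym)
    return out


def viterbi_decode(received):
    """Hard-decision Viterbi decoding of a bit list produced by `encode`."""
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)  # encoder is known to start in state 0
    history = []                            # survivor memory: one row per trellis step
    for i in range(0, len(received), 2):
        r = received[i:i + 2]
        new_metrics = [inf] * N_STATES
        survivors = [None] * N_STATES
        for s in range(N_STATES):
            if metrics[s] == inf:
                continue
            for b in (0, 1):
                nxt, out = branch(s, b)
                # Add: path metric plus Hamming branch metric.
                cand = metrics[s] + (out[0] != r[0]) + (out[1] != r[1])
                # Compare-select: keep only the better path into `nxt`.
                if cand < new_metrics[nxt]:
                    new_metrics[nxt] = cand
                    survivors[nxt] = (s, b)
        metrics = new_metrics
        history.append(survivors)
    # Trace back from the best final state through the survivor memory.
    state = min(range(N_STATES), key=lambda s: metrics[s])
    decoded = []
    for survivors in reversed(history):
        state, bit = survivors[state]
        decoded.append(bit)
    return decoded[::-1]
```

Every trellis step here runs the ACS loop for all reachable states and stores a full survivor row. Those are the two costs the abstract targets: fewer ACS computations on average, and cheaper survivor-memory accesses during trace-back.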
Temperature-Aware Floorplanning of Microarchitecture Blocks with IPC-Power Dependence Modeling and Transient Analysis
Vidyasagar Nookala, D. Lilja, S. Sapatnekar
Operating temperatures have become an important concern in high-performance microprocessors. Floorplanning, or block-level placement, offers excellent potential for thermal optimization through better heat spreading between the blocks, but these optimizations can also affect the throughput of a microarchitecture, measured in instructions per cycle (IPC). In nanometer technologies, global buses can have multicycle delays that depend on the positions of the blocks, so a floorplanner must be microarchitecturally aware to ensure that thermal and IPC considerations are appropriately balanced. This paper proposes a methodology for thermally aware microarchitecture floorplanning. The approach models the interactions between the IPC and the temperature distribution and incorporates both factors in the floorplanning cost function. It uses transient modeling, optimizes both the peak and the average temperatures, and employs a design-of-experiments (DOE) based strategy that effectively captures the exponentially large search space with a small number of cycle-accurate simulations. A comparison with a technique based on previous work indicates that the proposed approach yields good reductions in both the average and the peak temperatures for a range of SPEC benchmarks.
DOI: https://doi.org/10.1145/1165573.1165644 (published 2006-10-04)
Citations: 30
Stall Cycle Redistribution in a Transparent Fetch Pipeline
Eric L. Hill, Mikko H. Lipasti
Power and power density are now primary design constraints for modern high-performance microprocessors. Up to 70% of the dynamic power consumed can be attributed to the clocking system. As a consequence, clock gating has emerged as both a necessary and an efficient method to significantly reduce dynamic power. Transparent pipelining, a recently proposed fine-grain clock gating technique, has the potential to reduce clock power significantly beyond conventional pipestage-level clock gating. Previous studies of transparent pipelining have focused on the circuit and implementation-related issues of this approach while neglecting the broader microarchitectural implications. This paper aims to quantify the microarchitectural opportunities afforded by the use of transparent pipelining in a processor's fetch pipeline. We develop a technique, based on stall cycle redistribution, designed to improve the performance of transparent pipelining on fetch and other high-utilization pipelines. We show that stall cycle redistribution can dramatically reduce the clocking overhead of an aggressively pipelined cell-like microprocessor.
DOI: https://doi.org/10.1145/1165573.1165583 (published 2006-10-04)
Citations: 3
An Energy-Efficient Virtual Memory System with Flash Memory as the Secondary Storage
Hung-Wei Tseng, Han-Lin Li, Chia-Lin Yang
For decades, the traditional virtual memory system has been designed on the assumption that a magnetic disk is the secondary storage. Recently, flash memory has become a popular storage alternative for many portable devices, owing to continuing improvements in its capacity and reliability and to its much lower power consumption than mechanical hard drives. NAND flash memory is organized in blocks, and each block contains a set of pages. The characteristics of flash memory are quite different from those of a magnetic disk. Therefore, in this paper, we revisit virtual memory system design considering the limitations imposed by flash memory. In particular, we study the effects of the subpaging technique and of storage cache management. In the traditional virtual memory system, a full page is written back to the secondary storage on a page fault. We found that this can result in unnecessary writes, thereby wasting energy. The subpaging technique, which partitions a page into subunits so that only dirty subpages are written to flash memory, improves energy efficiency. For storage cache management, unlike traditional disk cache management, care must be taken to guarantee that the flash pages of a main memory page are replaced from the cache in sequence. Experimental results show that the combined subpaging and caching techniques reduce energy by 35.6% on average.
DOI: https://doi.org/10.1145/1165573.1165675 (published 2006-10-04)
Citations: 21
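The subpaging idea in the abstract above, writing back only the dirty subunits of a page, can be sketched with per-subpage dirty bits. This is an illustrative sketch rather than the authors' implementation: the 4096-byte page, the 512-byte subpage, and the `Page` class are assumptions introduced here.

```python
PAGE_SIZE = 4096       # bytes per main-memory page (assumed)
SUBPAGE_SIZE = 512     # bytes per subpage (assumed)
SUBPAGES_PER_PAGE = PAGE_SIZE // SUBPAGE_SIZE


class Page:
    """Hypothetical page descriptor that tracks one dirty bit per subpage."""

    def __init__(self):
        self.dirty = [False] * SUBPAGES_PER_PAGE

    def write(self, offset, length):
        """Mark every subpage touched by a write of `length` bytes at `offset`."""
        first = offset // SUBPAGE_SIZE
        last = (offset + length - 1) // SUBPAGE_SIZE
        for i in range(first, last + 1):
            self.dirty[i] = True

    def bytes_to_flush(self, subpaging):
        """Bytes written to flash when this page is evicted."""
        if not any(self.dirty):
            return 0                      # clean page: nothing to write back
        if not subpaging:
            return PAGE_SIZE              # conventional scheme: whole page
        return sum(self.dirty) * SUBPAGE_SIZE
```

A 600-byte write at offset 100 touches two subpages, so with these assumed sizes the eviction writes 1024 bytes with subpaging and 4096 bytes without it. Fewer bytes written to flash is the source of the energy saving that the abstract attributes to subpaging.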
Robust Level Converter Design for Sub-threshold Logic
I. Chang, Jae-Joon Kim, K. Roy
The large supply voltage difference between sub-threshold core logic and I/O makes it extremely challenging to convert signals from the core circuit to the I/O circuit. In this paper, we propose two novel circuits, a clock synchronizer and a reduced-swing inverter, for designing dynamic and static level converters for sub-threshold logic. Circuit simulations show that our level converters work at frequencies above 500 kHz between 20 °C and 40 °C with a supply voltage of 0.25 V.
DOI: https://doi.org/10.1145/1165573.1165579 (published 2006-10-04)
Citations: 31