Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03: Latest Papers

ILP-based optimization of sequential circuits for low power
Feng Gao, J. Hayes
The power consumption of a sequential circuit can be reduced by decomposing it into subcircuits which can be turned off when inactive. Power can also be reduced by careful state encoding. Modeling a given circuit as a finite-state machine, we formulate its decomposition into submachines as an integer linear programming (ILP) problem, and automatically generate the ILP model with power minimization as the objective. A simple, but powerful state encoding method is used for the submachines to further reduce power consumption. We present experimental results which show that circuits designed by our approach consume 30% to 90% less power than conventional circuits.
DOI: https://doi.org/10.1145/871506.871542 (published 2003-08-25)
Citations: 23
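As a rough illustration of the decomposition idea in the Gao and Hayes abstract above, the sketch below assigns each state of a small FSM to one of two submachines using 0/1 variables, the same kind of decision variable an ILP formulation would use. It is not the authors' model. The cost function (probability mass of transitions that cross between submachines, used here as a stand-in for the cost of waking the other submachine), the balance constraint, the example transition probabilities, and the use of exhaustive search in place of an ILP solver are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical 4-state FSM: TRANSITIONS[(src, dst)] is the assumed steady-state
# probability of taking that transition (illustrative numbers, not from the paper).
TRANSITIONS = {
    (0, 1): 0.30, (1, 0): 0.30,
    (2, 3): 0.15, (3, 2): 0.15,
    (1, 2): 0.05, (3, 0): 0.05,
}
NUM_STATES = 4


def crossing_cost(assign):
    """Probability mass of transitions whose endpoints lie in different submachines."""
    return sum(p for (src, dst), p in TRANSITIONS.items() if assign[src] != assign[dst])


def best_partition(num_states=NUM_STATES, max_size=2):
    """Enumerate every 0/1 assignment and keep the cheapest balanced one."""
    best = None
    for assign in product((0, 1), repeat=num_states):
        ones = sum(assign)
        # Assumed balance constraint: no submachine holds more than max_size states.
        if ones > max_size or num_states - ones > max_size:
            continue
        cost = crossing_cost(assign)
        if best is None or cost < best[0]:
            best = (cost, assign)
    return best


cost, assign = best_partition()
print(assign, round(cost, 2))  # states 0,1 in one submachine, states 2,3 in the other
```

Exhaustive search only works for toy sizes; it is used here so the example needs no solver library. A real formulation would hand the same binary variables and a linearized objective to an ILP solver.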
A critical analysis of application-adaptive multiple clock processors
Emil Talpes, Diana Marculescu
Enabled by the continuous advancement in fabrication technology, present-day synchronous microprocessors include more than 100 million transistors and have clock speeds well in excess of the 1 GHz mark. Distributing a low-skew clock signal in this frequency range to all areas of a large chip is a task of growing complexity. As a solution to this problem, designers have recently suggested the use of frequency islands that are locally clocked and communicate externally using mixed-timing communication schemes. Such a design style fits nicely with the recently proposed concept of voltage islands, which, in addition, can potentially enable fine-grain dynamic power management. This paper proposes a design exploration framework for application-adaptive multiple clock processors which provides the means for analyzing and identifying the right inter-domain communication scheme and the proper granularity for the choice of voltage/frequency. In addition, the proposed design exploration framework allows for comparative analysis of newly proposed or existing application-driven dynamic power management strategies. Such a design exploration framework and accompanying results can help designers and computer architects in choosing the right design strategy for achieving better power-performance trade-offs in multiple clock high-end processors.
DOI: https://doi.org/10.1145/871506.871576 (published 2003-08-25)
Citations: 15
Reducing power density through activity migration
Seongmoo Heo, K. Barr, K. Asanović
Power dissipation is unevenly distributed in modern microprocessors, leading to localized hot spots with significantly greater die temperature than surrounding cooler regions. Excessive junction temperature reduces reliability and can lead to catastrophic failure. We examine the use of activity migration, which reduces peak junction temperature by moving computation between multiple replicated units. Using a thermal model that includes the temperature dependence of leakage power, we show that sustainable power dissipation can be increased by nearly a factor of two for a given junction temperature limit. Alternatively, peak die temperature can be reduced by 12.4 °C at the same clock frequency. The model predicts that migration intervals of around 20-200 µs are required to achieve the maximum sustainable power increase. We evaluate several different forms of replication and migration policy control.
DOI: https://doi.org/10.1145/871506.871561 (published 2003-08-25)
Citations: 341
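The effect described in the activity migration abstract above can be sketched with a much simpler model than the one the authors use. The code below is a first-order RC thermal model of a single unit, integrated with forward Euler, and compares the peak temperature when the unit runs continuously against the case where computation ping-pongs between two identical units. The thermal resistance, time constant, power, ambient temperature, and migration interval are hypothetical values, and the temperature dependence of leakage that the paper models is deliberately left out.

```python
def peak_temperature(power_w, r_th, tau_s, t_amb, interval_s, migrate,
                     total_s=0.05, dt=1e-6):
    """Peak temperature of unit A in a first-order RC thermal model.

    With migrate=True the computation alternates between two identical units
    every interval_s seconds, so unit A is powered only half of the time.
    """
    temp = t_amb
    peak = t_amb
    for i in range(int(total_s / dt)):
        t = i * dt
        active = True
        if migrate:
            # Unit A is active during even-numbered migration intervals.
            active = int(t / interval_s) % 2 == 0
        p = power_w if active else 0.0
        # Forward-Euler step of dT/dt = (t_amb + p * r_th - T) / tau
        temp += dt * (t_amb + p * r_th - temp) / tau_s
        peak = max(peak, temp)
    return peak


# Hypothetical parameters: 10 W unit, 5 K/W, 1 ms time constant, 45 C ambient.
PARAMS = dict(power_w=10.0, r_th=5.0, tau_s=1e-3, t_amb=45.0, interval_s=100e-6)

fixed = peak_temperature(migrate=False, **PARAMS)
migrated = peak_temperature(migrate=True, **PARAMS)
print(f"peak without migration: {fixed:.1f} C, with migration: {migrated:.1f} C")
```

In this toy model the benefit comes purely from halving each unit's duty cycle while keeping the migration interval short relative to the thermal time constant; it says nothing about the migration overheads or policies the paper evaluates.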
A 225 MHz resonant clocked ASIC chip
C. Ziesler, Joohee Kim, V. Sathe, M. Papaefthymiou
We have recently designed, fabricated, and successfully tested an experimental chip that validates a novel method for reducing clock dissipation through energy recovery. Our approach includes a single-phase sinusoidal clock signal, an L-C resonant sinusoidal clock generator, and an energy recovering flip-flop. Our chip comprises a dual-mode ASIC with two independent clock systems, one conventional and one energy recovering, and was fabricated in a 0.25 µm bulk CMOS process. The ASIC computes a pipelined discrete wavelet transform with self-test and contains over 3500 gates. We have verified correct functionality and obtained power measurements in both modes of operation for frequencies up to 225 MHz. In the energy recovering mode, our power measurements account for all of the dissipation factors, including the operation of the integrated resonant clock generator, and show a net energy savings over the conventional mode of operation. For example, at 115 MHz, measured dissipation is between 60% and 75% of the conventional mode, depending on primary input activity. To our knowledge, this is the first ever published account of a direct experimentally-measured comparison between a complete energy recovering ASIC chip and its conventional implementation correctly operating in silicon at frequencies exceeding 100 MHz.
DOI: https://doi.org/10.1145/871506.871523 (published 2003-08-25)
Citations: 23
Energy-aware architectures for a Real-Valued FFT implementation
Alice Wang, A. Chandrakasan
Energy-aware design is highly desirable for systems that encounter a wide diversity of operating scenarios. This is in contrast to traditional low power design for the worst case scenario, which may not be globally energy efficient. Energy-aware design focuses on enabling architectures which scale down energy as quality requirements are relaxed. A new energy-scalable system design methodology is proposed for a Real-Valued FFT processor which supports variable bit precision (8 and 16-bit precision) and variable FFT length (128-512 point). Two energy-aware architectures, the Ensemble of Point Solutions method and the Reuse of Point Solutions method, are described and evaluated. Simulated and measured results show a 66% energy savings for 8-bit datapath and 52% savings for 128-point FFT length over a non-scalable approach.
DOI: https://doi.org/10.1109/LPE.2003.1231919 (published 2003-08-25)
Citations: 65
Low-voltage low-power fast-settling CMOS operational transconductance amplifiers for switched-capacitor applications
M. Yavari, O. Shoaei
This paper presents a new fully differential operational transconductance amplifier (OTA) for low-voltage and fast-settling switched-capacitor circuits in digital CMOS technology. The proposed two-stage OTA is a hybrid class A/AB that combines a folded cascode as the first stage with active current mirrors as the second stage. It employs a hybrid cascode compensation scheme, merged Ahuja and improved Ahuja style compensations, for fast settling.
DOI: https://doi.org/10.1109/LPE.2003.1231910 (published 2003-08-25)
Citations: 51
Energy efficient D-TLB and data cache using semantic-aware multilateral partitioning
H. Lee, C. Ballapuram
The memory subsystem, including address translations and cache accesses, consumes a major portion of the overall energy on a processor. In this paper, we address the memory energy issues by using a streamlined architectural partitioning technique that effectively reduces energy consumption in the memory subsystem without compromising performance. It is achieved by decoupling the d-TLB lookups and the data cache accesses, based on the semantic regions defined by programming languages and software convention, into discrete reference substreams - stack, global static, and heap. Their unique access behaviors and locality characteristics are analyzed and exploited for power reduction. Our results show that an average of 35% energy can be reduced in the d-TLB and the data cache. Furthermore, an average of 46% energy can be saved by selectively multi-porting the semantic-aware d-TLBs and data caches against their monolithic counterparts.
DOI: https://doi.org/10.1145/871506.871583 (published 2003-08-25)
Citations: 32
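The decoupling step in the Lee and Ballapuram abstract above, splitting data references into stack, global static, and heap substreams, can be sketched as a simple address classifier. The region boundaries below are hypothetical constants chosen for a 32-bit address space; the abstract does not say how the hardware identifies a reference's region, so classifying by address range is an assumption made only to show the routing idea.

```python
# Hypothetical region boundaries for a 32-bit address space (illustrative only).
STACK_BASE = 0x7FFF0000  # addresses at or above this are treated as stack
HEAP_BASE = 0x10000000   # addresses in [HEAP_BASE, STACK_BASE) are treated as heap
# Everything below HEAP_BASE is treated as global static data.


def classify(addr):
    """Return the semantic region ('stack', 'heap', or 'global') for an address."""
    if addr >= STACK_BASE:
        return "stack"
    if addr >= HEAP_BASE:
        return "heap"
    return "global"


def split_stream(addresses):
    """Route a data reference stream into one substream per semantic region."""
    substreams = {"stack": [], "global": [], "heap": []}
    for addr in addresses:
        substreams[classify(addr)].append(addr)
    return substreams


trace = [0x7FFFFF10, 0x00402000, 0x10008000, 0x7FFFFF18, 0x00402004]
for region, refs in split_stream(trace).items():
    print(region, [hex(a) for a in refs])
```

Each substream could then be served by its own small d-TLB and cache partition, which is where the energy saving in the abstract would come from; that sizing is outside this sketch.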
Power efficient comparators for long arguments in superscalar processors
D. Ponomarev, Gürhan Küçük, O. Ergin, K. Ghose
Traditional pulldown comparators that are used to implement associative addressing logic in superscalar microprocessors dissipate energy on a mismatch in any bit position in the comparands. As mismatches occur much more frequently than matches in many situations, such circuits are extremely energy-inefficient. In recognition of this inefficiency, a series of dissipate-on-match comparator designs have been proposed to address the power considerations. These designs, however, are limited to at most 8 bit long arguments. In this paper, we examine the designs of energy-efficient comparators capable of comparing arguments as long as 32 bits in size. Such long comparands are routinely used in the load-store queues, caches, BTBs and TLBs. We use the actual layout data and the realistic bit patterns of the comparands (obtained from the simulated execution of SPEC 2000 benchmarks) to show the energy impact from the use of the new comparators. In general, a non-trivial combination of traditional and dissipate-on-match 8 bit comparator blocks represents the most energy-efficient and fastest solution. As an example of this general approach, we show how fast and energy-efficient comparators can be designed for comparing addresses within the load-store queue of a superscalar processor.
DOI: https://doi.org/10.1145/871506.871601 (published 2003-08-25)
Citations: 8
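The asymmetry that motivates the comparator abstract above can be shown by counting, for one associative lookup, how many comparators dissipate under each style. This is a behavioral count only, assuming one dissipation event per comparator per lookup: a traditional pulldown comparator is counted when any bit mismatches, and a dissipate-on-match comparator is counted when all bits match. The queue contents are made-up 32-bit addresses, and the function says nothing about the energy per event or about the hybrid 8-bit block designs the paper proposes.

```python
def dissipation_events(key, entries):
    """Count comparators that dissipate when `key` is compared against every entry."""
    mismatches = sum(1 for entry in entries if entry != key)
    matches = len(entries) - mismatches
    return {"traditional": mismatches, "dissipate_on_match": matches}


# Hypothetical 8-entry load-store queue holding 32-bit addresses.
queue = [0x10000040, 0x10000080, 0x7FFFFF10, 0x100000C0,
         0x20000000, 0x10000044, 0x10000048, 0x7FFFFF18]

events = dissipation_events(0x10000080, queue)
print(events)  # {'traditional': 7, 'dissipate_on_match': 1}
```

With one matching entry out of eight, seven traditional comparators discharge versus a single dissipate-on-match comparator, which is the imbalance the abstract refers to when it says mismatches are far more frequent than matches.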
Optimal body bias selection for leakage improvement and process compensation over different technology generations
C. Neau, K. Roy
We present techniques to determine the optimal body bias (forward or reverse) to minimize leakage current and compensate process variations in scaled CMOS technologies. A circuit trades off sub-threshold leakage with band-to-band tunneling leakage at the source/drain junctions to determine the optimal substrate bias for different technology generations and under process variations. Using optimal body bias results in 43% and 42% savings in leakage for predictive 70 nm and 50 nm NMOS devices, respectively. This technique also reduces the effects of die-to-die and intra-die process variations in transistor length and supply voltage by 43% and 60%, respectively, in 50 nm NMOS devices, resulting in improved yield.
DOI: https://doi.org/10.1109/LPE.2003.1231846 (published 2003-08-25)
Citations: 115
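The trade-off in the Neau and Roy abstract above, where sub-threshold leakage falls and band-to-band tunneling leakage rises as reverse body bias increases, can be sketched with a toy model. Both leakage terms are modeled here as simple exponentials in the reverse bias magnitude, with hypothetical constants in arbitrary units; these functional forms and numbers are assumptions for illustration, not the device models used in the paper, and forward bias is not covered.

```python
import math

# Hypothetical fitting constants (arbitrary units, not taken from the paper).
I_SUB0 = 100.0   # sub-threshold leakage at zero body bias
K_SUB = 6.0      # 1/V, decay rate of sub-threshold leakage with reverse bias
I_BTBT0 = 1.0    # band-to-band tunneling leakage at zero body bias
K_BTBT = 4.0     # 1/V, growth rate of tunneling leakage with reverse bias


def total_leakage(v_rev):
    """Total leakage for a reverse body bias magnitude v_rev (volts)."""
    return I_SUB0 * math.exp(-K_SUB * v_rev) + I_BTBT0 * math.exp(K_BTBT * v_rev)


def optimal_reverse_bias(v_max=1.0, step=0.001):
    """Grid search for the reverse bias in [0, v_max] that minimizes total leakage."""
    candidates = [i * step for i in range(int(round(v_max / step)) + 1)]
    return min(candidates, key=total_leakage)


v_opt = optimal_reverse_bias()
# For this two-exponential model the minimum also has a closed form,
# obtained by setting the derivative of total_leakage to zero.
v_closed_form = math.log(I_SUB0 * K_SUB / (I_BTBT0 * K_BTBT)) / (K_SUB + K_BTBT)
print(f"grid optimum: {v_opt:.3f} V, closed form: {v_closed_form:.3f} V")
```

The point of the sketch is only that the sum of a falling and a rising term has an interior minimum, so more reverse bias is not always better.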
Understanding and minimizing ground bounce during mode transition of power gating structures
Suhwan Kim, S. Kosonocky, D. Knebel
We introduce and analyze the ground bounce due to power mode transition in power gating structures. To reduce the ground bounce, we propose novel power gating structures in which sleep transistors are turned on in a non-uniform stepwise manner. Our power gating structures reduce the magnitude of peak current and voltage glitches in the power distribution network as well as the minimum time required to stabilize power and ground. Experimental simulation results with PowerSpice fixtured in a package model demonstrate the effectiveness of the proposed power gate switching noise reduction techniques.
DOI: https://doi.org/10.1145/871506.871515 (published 2003-08-25)
Citations: 201