Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03: latest papers

Uncertainty-based scheduling: energy-efficient ordering for tasks with variable execution time [processor scheduling]
F. Gruian, K. Kuchcinski
Energy consumption reduction is today an important design issue for all kinds of digital systems. Offering both flexibility and efficient energy management, variable speed processor architectures are preferred for low energy consumption even in hard real-time systems. For this type of system, the main approach consists in trading speed for lower energy while meeting all deadlines. For tasks with varying execution time, speed scheduling is most efficient if performed at run-time. This paper presents a new ordering technique for such tasks, that reduces the energy consumption resulting from the run-time speed scheduling. Without affecting the real-time behavior, our uncertainty-based scheduling (UBS) is a low complexity but energy-efficient method that can be applied on top of already existent real-time scheduling techniques, such as EDF. These claims are backed up by extensive simulation results accompanied by measurements on a platform based on an Intel i80200 XScale processor.
DOI: https://doi.org/10.1109/LPE.2003.1231953 | Published: 2003-09-23
Citations: 8
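The abstract above argues that task order matters when speed is chosen at run time. The toy model below is a minimal sketch of that intuition and is not the UBS algorithm from the paper. It assumes energy per unit of work grows with the square of the normalized speed, and it uses a simple greedy policy (stretch the remaining worst-case work over the remaining time). The function name, task tuples, and numbers are hypothetical.

```python
from typing import List, Tuple

# (name, worst-case time at full speed, actual time at full speed); illustrative values only
Task = Tuple[str, float, float]


def run_with_slack_reclaiming(tasks: List[Task], deadline: float) -> Tuple[float, float]:
    """Run tasks in the given order before a common deadline.

    Assumed model (not taken from the paper): before each task starts, the
    speed is set so that all remaining worst-case work would finish exactly
    at the deadline; energy for a task is actual_work * speed**2.
    Returns (total energy, finish time).
    """
    now = 0.0
    energy = 0.0
    remaining_wc = sum(wc for _, wc, _ in tasks)
    for _, wc, actual in tasks:
        speed = min(1.0, remaining_wc / (deadline - now))
        now += actual / speed          # the task usually finishes early
        energy += actual * speed ** 2  # quadratic energy-per-work assumption
        remaining_wc -= wc             # worst-case budget of this task is released
    return energy, now


uncertain = ("A", 4.0, 1.0)  # large gap between worst case and actual time
certain = ("B", 4.0, 4.0)    # always takes its worst case

e_first, t_first = run_with_slack_reclaiming([uncertain, certain], deadline=10.0)
e_last, t_last = run_with_slack_reclaiming([certain, uncertain], deadline=10.0)
print(f"uncertain task first: energy={e_first:.3f}, finish={t_first:.2f}")
print(f"uncertain task last:  energy={e_last:.3f}, finish={t_last:.2f}")
```

In this toy setting, running the task with the more uncertain execution time first exposes its slack early, so the second task can run more slowly. Both orders meet the deadline.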
Level conversion for dual-supply systems [low power logic IC design]
F. Ishihara, F. Sheikh, B. Nikolić
Dual-supply voltage design using a clustered voltage scaling (CVS) scheme is an effective approach to reduce chip power. The optimal CVS design relies on a level converter (LC) implemented in a flip-flop to minimize energy, delay, and area penalties due to level conversion. Novel flip-flops presented in this paper incorporate a half-latch LC and a precharged LC. These flip-flops are optimized in the energy-delay design space to achieve over 30% reduction of energy-delay product and about 10% savings of total power in a CVS design as compared to the conventional flip-flop. These benefits are accompanied by 24% robustness improvement and 18% layout area reduction.
DOI: https://doi.org/10.1109/LPE.2003.1231854 | Published: 2003-09-23
Citations: 1
Branch prediction on demand: an energy-efficient solution [microprocessor architecture]
D. Chaver, L. Piñuel, M. Prieto, F. Tirado, M. Huang
High-end processors typically incorporate complex branch predictors consisting of many large structures that together consume a notable fraction of total chip power (more than 10% in some cases). Depending on the applications, some of these resources may remain underused for long periods of time. We propose a methodology to reduce the energy consumption of the branch predictor by characterizing prediction demand using profiling and dynamically adjusting predictor resources accordingly. Specifically, we disable components of the hybrid direction predictor and resize the branch target buffer. Detailed simulations show that this approach reduces the energy consumption in the branch predictor by an average of 72% and up to 89% with virtually no impact on prediction accuracy and performance.
DOI: https://doi.org/10.1109/LPE.2003.1231933 | Published: 2003-09-23
Citations: 14
Voltage scheduling under unpredictabilities: a risk management paradigm [logic design]
A. Davoodi, Ankur Srivastava
This paper addresses the problem of voltage scheduling in unpredictable situations. The voltage scheduling problem assigns voltages to operations such that the power is minimized under a clock cycle constraint. In the presence of unpredictabilities, meeting the clock constraint cannot be guaranteed. This paper proposes a novel risk management based technique to solve this problem. The risk management paradigm assigns a quantified value to the amount of risk the designer is willing to take on the clock cycle constraint. The algorithm then assigns voltages in order to meet the expected value of clock cycle constraint while keeping the maximum delay within the specified 'risk' and minimizing the power.
DOI: https://doi.org/10.1145/1059876.1059884 | Published: 2003-09-23
Citations: 7
A selective filter-bank TLB system [embedded processor MMU for low power]
Jung-Hoon Lee, G. Park, Sung-Bae Park, Shin-Dug Kim
We present a selective filter-bank translation lookaside buffer (TLB) system with low power consumption for embedded processors. The proposed TLB is constructed as multiple banks with a small two-bank buffer, called a filter-bank buffer, located above its associated bank. Either a filter-bank buffer or a main bank TLB can be selectively accessed, based on two bits in the filter-bank buffer. Energy savings are achieved by reducing the number of entries accessed at a time, by using filtering and the bank mechanism. The overhead of the proposed TLB turns out to be negligible compared with other hierarchical structures. Simulation results show that the energy × delay product can be reduced by about 88% compared with a fully associative TLB, 75% with respect to a filter-TLB, and 51% relative to a banked-filter TLB.
DOI: https://doi.org/10.1109/LPE.2003.1231885 | Published: 2003-09-23
Citations: 2
A semi-custom voltage-island technique and its application to high-speed serial links [CMOS active power reduction]
J. Carballo, J. Burns, Seung-Moon Yoo, I. Vo, V. R. Norman
Supply-voltage reduction is a known technique for reducing CMOS active power. We propose a semi-custom voltage-island approach based on internal regulation and selective custom design. This approach enables transparent embedding, since no additional external power supply is needed. We apply the approach to high-speed serial links, and we show that high performance is retained through targeted application of custom circuit and logic design. A chip is presented that evaluates the presented approach on a 3000 gate 3.2 Gbps multi-protocol serial-link receiver logic core. When reducing the supply from 1.2 V to 0.95 V, the chip demonstrates power savings of over 25%.
DOI: https://doi.org/10.1109/LPE.2003.1231836 | Published: 2003-09-23
Citations: 2
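As a rough sanity check on the voltage-island abstract above, the standard first-order model for CMOS switching power (proportional to V² · f) gives an upper estimate of what a 1.2 V to 0.95 V supply reduction can save at unchanged frequency. This is a back-of-the-envelope sketch under that textbook assumption, not a calculation from the paper. The function name is hypothetical.

```python
def dynamic_power_ratio(v_new: float, v_old: float, f_ratio: float = 1.0) -> float:
    """Ratio of switching power after/before, assuming P is proportional to V^2 * f."""
    return (v_new / v_old) ** 2 * f_ratio


ratio = dynamic_power_ratio(0.95, 1.2)
print(f"switching power ratio: {ratio:.3f}")
print(f"ideal switching-power saving: {1 - ratio:.1%}")
```

The ideal figure (about 37%) is consistent with the abstract's "over 25%" measured saving. The abstract does not explain the gap; regulator overhead and circuitry outside the island are plausible contributors, but that is an assumption here.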
Dynamic voltage scaling algorithm for fixed-priority real-time systems using work-demand analysis
Woonseok Kim, Jihong Kim, S. Min
Dynamic Voltage Scaling (DVS), which adjusts the clock speed and supply voltage dynamically, is an effective technique in reducing the energy consumption of embedded real-time systems. Unlike dynamic-priority real-time scheduling for which highly effective DVS algorithms are available, existing fixed-priority DVS algorithms are less effective in energy efficiency because they are based on inefficient slack estimation methods. This paper describes an efficient on-line slack estimation heuristic for the rate-monotonic (RM) scheduling. The proposed heuristic estimates the slack times using the short term work-demand analysis. The DVS algorithm based on the proposed heuristic is also presented. Experimental results show that the proposed DVS algorithm reduces the energy consumption by 25% to 42% over the existing rate-monotonic DVS algorithms.
DOI: https://doi.org/10.1145/871506.871605 | Published: 2003-08-25
Citations: 61
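For context on the rate-monotonic DVS abstract above, the sketch below shows a simple static baseline rather than the paper's on-line work-demand heuristic: it picks one constant speed from the classical Liu and Layland utilization bound. Slowing every task by a factor s raises utilization to U / s, so choosing s = U / bound keeps the task set within the sufficient RM schedulability bound. The function names and the task set are hypothetical.

```python
from typing import List, Tuple


def liu_layland_bound(n: int) -> float:
    """Sufficient utilization bound for n periodic tasks under rate-monotonic scheduling."""
    return n * (2 ** (1 / n) - 1)


def static_rm_speed(tasks: List[Tuple[float, float]]) -> float:
    """Lowest constant normalized speed that still passes the Liu-Layland test.

    tasks: (worst-case execution time at full speed, period) pairs.
    This is a conservative static baseline; on-line slack estimation, as in
    the abstract, aims to go below it at run time.
    """
    utilization = sum(c / t for c, t in tasks)
    speed = utilization / liu_layland_bound(len(tasks))
    if speed > 1.0:
        # The test is only sufficient, so this does not prove the set is unschedulable.
        raise ValueError("task set does not pass the utilization bound at full speed")
    return speed


s = static_rm_speed([(1.0, 4.0), (1.0, 5.0), (2.0, 10.0)])
print(f"static speed: {s:.4f}, relative energy per cycle (s^2 model): {s * s:.3f}")
```

A static speed like this ignores the slack that appears when tasks finish before their worst case, which is the slack a run-time estimator can reclaim.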
Design methodology for fine-grained leakage control in MTCMOS
B. Calhoun, Frank Honoré, A. Chandrakasan
Multi-threshold CMOS is a popular technique for reducing standby leakage power with low delay overhead. MTCMOS designs typically use large sleep devices to reduce standby leakage at the block level. We provide a formal examination of sneak leakage paths and a design methodology that enables gate-level insertion of sleep devices for sequential and combinational circuits. A fabricated 0.13 µm, dual-V_T test chip employs this methodology to implement a low-power FPGA core with gate-level sleep FETs and over 8× measured standby current reduction. The methodology allows local sleep regions that reduce leakage in active configurable logic blocks (CLBs) by up to 2.2× (measured) for some CLB configurations.
DOI: https://doi.org/10.1145/871506.871535 | Published: 2003-08-25
Citations: 112
Energy optimization techniques in cluster interconnects
Eun Jung Kim, K. H. Yum, G. Link, N. Vijaykrishnan, M. Kandemir, M. J. Irwin, Mazin S. Yousif, C. Das
Designing energy-efficient clusters has recently become an important concern to make these systems economically attractive for many applications. Since the links and switch buffers consume the major portion of the power budget of the cluster, the focus of this paper is to optimize the energy consumption in these two components. To minimize power in the links, we propose a novel dynamic link shutdown (DLS) technique. The DLS technique makes use of an appropriate adaptive routing algorithm to shutdown the links intelligently. We also present an optimized buffer design for reducing leakage energy. Our analysis on different networks using a complete system simulator reveals that the proposed DLS technique can provide optimized performance-energy behavior (up to 40% energy savings with less than 5% performance degradation in the best case) for the cluster interconnects.
DOI: https://doi.org/10.1145/871506.871620 | Published: 2003-08-25
Citations: 99
A mixed-clock issue queue design for globally asynchronous, locally synchronous processor cores
V. Rapaka, Diana Marculescu
Ever-shrinking device sizes and innovative micro-architectural and circuit design techniques have made it possible to have multi-million transistor systems running at multi-Gigahertz speeds. However, such a tremendous computational capability comes at a high price in terms of power consumption and design effort in distributing a global clock signal across the chip. One of the most promising strategies that addresses these issues is the globally asynchronous, locally synchronous (GALS) design style where multiple domains are governed by different, locally generated clocks. Due to its inherent complexity, a possible driver application for such a design style is the case of superscalar, out-of-order processors. While micro-architectural evaluations for GALS microprocessors have been made available recently, no concrete implementations have been analyzed in a detailed way. In this paper we propose a mixed-clock issue queue design for high-end, out-of-order superscalar processors, able to sustain different clock rates and speeds for the incoming and outgoing traffic. We compare and contrast our implementation with existing synchronous versions of issue queues used stand-alone or in conjunction with mixed-clock FIFOs for inter-domain synchronization.
DOI: https://doi.org/10.1145/871506.871600 | Published: 2003-08-25
Citations: 5