
Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03: Latest Articles

A 0.75-mW analog processor IC for wireless biosignal monitor
Chih-Jen Yen, Mely Chen Chi, Danny Wen-Yaw Chung, Shing-Hao Lee
This work presents a single-channel analog processor IC for a wireless biosignal monitor. The chip occupies a small die area of 0.52 mm² and consumes only 0.75 mW at a 5-V supply voltage. Wired and wireless systems built from the designed processor chip and commercial discrete ICs have been validated in this study. Experimental results indicate that the integrated single-chip processor system can amplify, filter, transmit, and receive a simulated ECG signal. Compared to the wired prototype system, wireless transmission is better suited to long-distance, long-term measuring, recording, and monitoring of the biosignal.
DOI: https://doi.org/10.1145/871506.871616 (published 2003-08-25)
Citations: 6
Estimating influence of data layout optimizations on SDRAM energy consumption
Hyun Suk Kim, N. Vijaykrishnan, M. Kandemir, E. Brockmeyer, F. Catthoor, M. J. Irwin
An important problem in extracting maximum benefits from an SDRAM-based architecture is to exploit data locality at the page granularity. Frequent switches between data pages can increase memory latency and have an impact on energy consumption. In this paper, we propose a mathematical formulation, using Presburger arithmetic and Ehrhart polynomials, to estimate the number of page breaks statically (i.e., at compile time). The results obtained using video codes indicate that the proposed framework can estimate the number of page breaks with good accuracy.
DOI: https://doi.org/10.1145/871506.871520 (published 2003-08-25)
Citations: 20
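To make the notion of a page break in the abstract above concrete, the following minimal Python sketch counts page switches by brute-force enumeration for two traversal orders of a row-major 2-D array. It is not the paper's Presburger/Ehrhart formulation, which derives such counts symbolically at compile time; the array dimensions, element size, page size, and function names are all illustrative assumptions.

```python
def count_page_breaks(addresses, page_size):
    """Count how often two consecutive accesses fall in different pages."""
    breaks = 0
    prev_page = None
    for addr in addresses:
        page = addr // page_size
        if prev_page is not None and page != prev_page:
            breaks += 1
        prev_page = page
    return breaks


def row_major_addresses(rows, cols, elem_size, by_rows=True):
    """Yield byte addresses of a row-major array, walked by rows or by columns."""
    if by_rows:
        for i in range(rows):
            for j in range(cols):
                yield (i * cols + j) * elem_size
    else:
        for j in range(cols):
            for i in range(rows):
                yield (i * cols + j) * elem_size


# Hypothetical sizes: 64x64 array of 4-byte elements, 1 KiB pages.
ROWS, COLS, ELEM, PAGE = 64, 64, 4, 1024
row_breaks = count_page_breaks(row_major_addresses(ROWS, COLS, ELEM, True), PAGE)
col_breaks = count_page_breaks(row_major_addresses(ROWS, COLS, ELEM, False), PAGE)
print(row_breaks, col_breaks)
```

With these assumed sizes the column-wise walk switches pages far more often than the row-wise walk, which is the kind of layout effect a static estimator is meant to expose without running the loop nest.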
Reverse-order source/drain formation with double offset spacer (RODOS) for CMOS low-power, high-speed and low-noise amplifiers
W. Choi, J. Lee, Byung-Gook Park
RODOS (Reverse-Order source/drain formation with Double Offset Spacer) was proposed for low-power, high-speed and low-noise amplifiers. Relying on simulation data, we confirmed the high feasibility of the RODOS process. It showed improved linearity (V_IP3). Additionally, by optimizing process parameters, we achieved small gate delay (CV/I) and low static and dynamic power consumption. The process satisfied most of the requirements of LOP and LSTP in ITRS 2002. Finally, we found that devices with the RODOS structure can be a promising alternative for implementing low-power, high-speed and low-noise amplifiers for radio on a chip.
DOI: https://doi.org/10.1109/LPE.2003.1231860 (published 2003-08-25)
Citations: 0
Energy characterization of a tiled architecture processor with on-chip networks
J. Kim, M. Taylor, Jason E. Miller, D. Wentzlaff
Tiled architectures provide a paradigm for designers to turn silicon resources into processors with burgeoning quantities of programmable functional units and memories. The architecture has a dual responsibility: first, it must expose these resources in a way that is programmable; second, it needs to manage the power associated with such resources. We present the power management facilities of the 16-tile Raw microprocessor. This design selectively turns on and off 48 SRAM macros, 96 functional unit clusters, 32 fetch units, and over 250 unique processor pipeline stages, all according to the needs of the computation and environment at hand.
DOI: https://doi.org/10.1145/871506.871610 (published 2003-08-25)
Citations: 143
A low-power VLSI architecture for turbo decoding
Seok-Jun Lee, Naresh R Shanbhag, A. Singer
This paper presents a low-power architecture for turbo decoding of parallel concatenated convolutional codes. The proposed architecture is derived via the concept of block-interleaved computation followed by folding, retiming and voltage scaling. Block-interleaved computation can be applied to any data processing unit that operates on data blocks and satisfies the following three properties: 1) computations between blocks are independent; 2) a block can be segmented into computationally independent sub-blocks; and 3) computation within a sub-block is recursive. The application of block-interleaved computation, folding and retiming reduces the critical path delay in the add-compare-select (ACS) kernel of MAP decoders by 50%-84% with an area overhead of 14%-70%. Subsequent application of voltage scaling results in up to 65% savings in power for a block-interleaving depth of 6. Experimental results obtained by transistor-level timing and power analysis tools demonstrate power savings of 20%-44% for a block-interleaving depth of 2 in a 0.25 µm CMOS process.
DOI: https://doi.org/10.1145/871506.871599 (published 2003-08-25)
Citations: 17
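The add-compare-select (ACS) kernel mentioned in the abstract above can be sketched in a few lines of Python. This is a generic max-log-style update written for illustration, not the paper's hardware kernel: the two-state trellis, the integer metrics, and the names `acs_step` and `run_recursion` are assumptions. The point is that each step needs the result of the previous one, which is the recursion that block interleaving, folding and retiming are said to restructure.

```python
def acs_step(prev_metrics, branch_metrics, predecessors):
    """One ACS update over all trellis states.

    predecessors[s] lists the states that have a transition into state s;
    branch_metrics[(p, s)] is the metric of the transition p -> s.
    """
    new_metrics = []
    for state, preds in enumerate(predecessors):
        # add: extend every incoming path by its branch metric
        candidates = [prev_metrics[p] + branch_metrics[(p, state)] for p in preds]
        # compare and select: keep the best incoming path
        new_metrics.append(max(candidates))
    return new_metrics


def run_recursion(initial_metrics, branch_metrics_per_step, predecessors):
    """Apply ACS steps in sequence; step k cannot start before step k-1 ends."""
    metrics = list(initial_metrics)
    for branch_metrics in branch_metrics_per_step:
        metrics = acs_step(metrics, branch_metrics, predecessors)
    return metrics


# Hypothetical fully connected two-state trellis with made-up metrics.
PREDECESSORS = [[0, 1], [0, 1]]
STEP = {(0, 0): 1, (1, 0): 3, (0, 1): -1, (1, 1): 2}
print(acs_step([0, -1], STEP, PREDECESSORS))               # [2, 1]
print(run_recursion([0, -1], [STEP, STEP], PREDECESSORS))  # [4, 3]
```

Because independent blocks share no state, several such recursions can be interleaved on one pipelined ACS unit, which is the property the abstract's three conditions describe.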
Routine based OS-aware microprocessor resource adaptation for run-time operating system power saving
Tao Li, L. John
The increasingly constrained power budget of today's microprocessors has resulted in a situation where power savings of all components in a system have to be taken into consideration. The operating system (OS) is a major power consumer in the execution of many modern applications. This paper advocates a routine based OS-aware microprocessor resource adaptation mechanism targeting run-time OS power savings. Simulation results show that, compared with existing sampling-based adaptation schemes, this novel methodology yields a more attractive power and performance trade-off for OS execution. To our knowledge, this paper is the first to address the power saving issue of the OS itself, an increasingly important area that has been largely overlooked in previous studies.
DOI: https://doi.org/10.1145/871506.871565 (published 2003-08-25)
Citations: 9
An ASIC design methodology with predictably low leakage, using leakage-immune standard cells
N. Jayakumar, S. Khatri
In this paper we introduce a low-leakage, standard-cell-based ASIC design methodology built on modified standard cells. These cells are designed to consume extremely low and predictable leakage currents in standby mode. For each cell in a standard cell library, we design two low-leakage variants of the cell. If the inputs of a cell during the standby mode of operation are such that the output has a high value, we minimize the leakage in the pull-down network, and vice versa. During technology mapping of a circuit, we determine the particular variant to utilize in each instance, so as to minimize leakage of the final mapped design. We have designed and laid out our modified standard cells, and have performed experiments to compare placed-and-routed area, leakage and delays of our method against MTCMOS and a straightforward ASIC flow. Each design style we compare utilizes the same base standard cell library. Our results show that designs obtained using our methodology have better speed and area characteristics than designs implemented in MTCMOS. The exact leakage current obtained for MTCMOS is highly unpredictable, while our method exhibits leakage currents which are precisely estimable. The leakage current for HL designs can be dramatically lower than the worst-case leakage of MTCMOS based designs, and two orders of magnitude lower than that of traditional standard cells. Also, a design implemented in MTCMOS would require the use of separate power and ground supplies for latches and combinational logic, while our methodology does away with such a requirement.
DOI: https://doi.org/10.1145/871506.871539 (published 2003-08-25)
Citations: 9
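The variant selection rule stated in the abstract above (output high in standby: suppress pull-down leakage, and vice versa) can be illustrated with a small Python sketch. It only evaluates a gate-level netlist under a given standby input vector and labels each instance; the netlist format, the two-gate library, and the variant names `low_leak_pulldown` and `low_leak_pullup` are hypothetical, and the paper's actual technology-mapping step is not reproduced here.

```python
# Hypothetical two-input gate library used only for this illustration.
GATES = {
    "NAND": lambda a, b: not (a and b),
    "NOR": lambda a, b: not (a or b),
}


def choose_variants(netlist, standby_inputs):
    """Pick a low-leakage variant per gate from its standby output value.

    netlist: list of (output_net, gate_type, (in_a, in_b)) in topological order.
    standby_inputs: dict mapping primary input names to their standby values.
    """
    values = dict(standby_inputs)
    variants = {}
    for out, gate, (a, b) in netlist:
        values[out] = GATES[gate](values[a], values[b])
        if values[out]:
            # Output is high: the pull-down network is off and is the leakage path.
            variants[out] = "low_leak_pulldown"
        else:
            # Output is low: the pull-up network is off and is the leakage path.
            variants[out] = "low_leak_pullup"
    return variants


netlist = [
    ("n1", "NAND", ("a", "b")),
    ("n2", "NOR", ("n1", "c")),
]
chosen = choose_variants(netlist, {"a": True, "b": True, "c": False})
print(chosen)  # {'n1': 'low_leak_pullup', 'n2': 'low_leak_pulldown'}
```

Since the choice depends only on the standby state of each net, it can be made once per instance, which matches the abstract's claim that the resulting leakage is predictable.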
Strained-Si devices and circuits for low-power applications
Keunwoo Kim, R. Joshi, C. Chuang
Static and dynamic power for strained-Si devices are analyzed and compared with conventional bulk-Si technology. Optimum device design points are suggested by controlling physical and structural device parameters. Strained-Si CMOS circuits are studied, showing substantially reduced power consumption due to the unique advantageous features of strained-Si devices. The trade-off between power and performance in strained-Si devices and circuits is discussed. Further, the analysis and low-power design points are extended to strained-Si on SOI substrate (SSOI) CMOS technology.
DOI: https://doi.org/10.1109/LPE.2003.1231858 (published 2003-08-25)
Citations: 7
Exploiting program hotspots and code sequentiality for instruction cache leakage management
Jie S. Hu, A. Nadgir, N. Vijaykrishnan, M. J. Irwin, M. Kandemir
Leakage energy optimization for caches has been the target of much recent effort. In this work, we focus on instruction caches and tailor two techniques that exploit the two major factors that shape the instruction access behavior, namely, hotspot execution and sequentiality. First, we adopt a hotspot detection mechanism by profiling the branch behavior at runtime and utilize this to implement a HotSpot based Leakage Management (HSLM) mechanism. Second, we exploit code sequentiality in implementing a Just-In-Time Activation (JITA) that transitions cache lines to active mode just before they are accessed. We utilize the recently proposed drowsy cache that dynamically scales voltages for leakage reduction and implement various schemes that use different combinations of HSLM and JITA. Our experimental evaluation using the SPEC2000 benchmark suite shows that instruction cache leakage energy consumption can be reduced by 63%, 49% and 29% on average, as compared to an unoptimized cache, a recently proposed hardware optimized cache, and a cache optimized using the compiler, respectively. Further, we observe that these energy savings can be obtained without a significant impact on performance.
DOI: https://doi.org/10.1145/871506.871606 (published 2003-08-25)
Citations: 42
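The idea behind Just-In-Time Activation in the abstract above, waking a cache line just before a sequential fetch reaches it, can be sketched with a toy Python model. The policy below is an assumption made for illustration: only the line being accessed stays awake, plus its sequential successor when pre-activation is enabled. The function name, the trace, and the cache size are hypothetical, and the model does not capture the paper's HSLM mechanism or any voltage or energy figures.

```python
def count_wakeup_stalls(line_trace, num_lines, preactivate=True):
    """Count accesses that find their cache line still in drowsy mode.

    line_trace: sequence of cache line indices touched by instruction fetch.
    preactivate: if True, also wake the next sequential line on every access.
    """
    active = set()
    stalls = 0
    for line in line_trace:
        if line not in active:
            stalls += 1  # the line must be woken on demand
        # Toy policy: every line except the current one returns to drowsy mode.
        active = {line}
        if preactivate:
            active.add((line + 1) % num_lines)
    return stalls


# Hypothetical fetch trace: a sequential run, a taken branch, another run.
trace = [0, 1, 2, 3, 7, 8, 9]
print(count_wakeup_stalls(trace, 16, preactivate=True))   # 2
print(count_wakeup_stalls(trace, 16, preactivate=False))  # 7
```

In this model only the first access and the branch target pay a wake-up penalty when pre-activation is on, which is why sequential code is a good fit for the technique.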
Reducing energy and delay using efficient victim caches
G. Memik, Glenn D. Reinman, W. Mangione-Smith
In this paper, we investigate methods for improving the hit rates in the first level of the memory hierarchy. In particular, we propose victim cache structures to reduce the number of accesses to more power-consuming structures such as level 2 caches. We compare the proposed victim cache techniques to increasing the associativity or the size of the level 1 data cache and show that the enhanced victim cache techniques yield better energy-delay and energy-delay-area products. We also propose techniques that predict the hit/miss behavior of the victim cache accesses and bypass the victim cache when a miss can be determined quickly. We report simulation results obtained from SimpleScalar/ARM modeling a representative Network Processor architecture. The simulations show that the victim cache is able to reduce the energy consumption by as much as 17.6% (8.6% on average) while reducing the execution time by as much as 8.4% (3.7% on average) for a set of representative applications.
DOI: https://doi.org/10.1109/LPE.2003.1231873 (published 2003-08-25)
Citations: 24
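As a rough illustration of why a victim cache cuts accesses to the level 2 cache, as the abstract above describes, the Python sketch below models a direct-mapped level 1 cache backed by a small fully associative victim buffer. It is a textbook-style victim cache with FIFO replacement, not the enhanced structures or the hit/miss predictor proposed in the paper; the function name, cache sizes, and access trace are assumptions.

```python
from collections import OrderedDict


def count_l2_accesses(block_trace, num_sets, victim_entries):
    """Count block fetches that miss both the L1 cache and the victim cache."""
    l1 = [None] * num_sets   # direct-mapped: one block per set
    victim = OrderedDict()   # fully associative, oldest entry first
    l2_accesses = 0
    for block in block_trace:
        idx = block % num_sets
        if l1[idx] == block:
            continue         # L1 hit
        evicted = l1[idx]
        if block in victim:
            del victim[block]  # victim hit: move the block back into L1
        else:
            l2_accesses += 1   # missed in both structures: go to L2
        l1[idx] = block
        if evicted is not None and victim_entries > 0:
            victim[evicted] = True           # keep the displaced block nearby
            if len(victim) > victim_entries:
                victim.popitem(last=False)   # FIFO: drop the oldest entry
    return l2_accesses


# Hypothetical trace: blocks 0 and 4 map to the same set of a 4-set cache.
conflict_trace = [0, 4, 0, 4, 0, 4]
print(count_l2_accesses(conflict_trace, num_sets=4, victim_entries=0))  # 6
print(count_l2_accesses(conflict_trace, num_sets=4, victim_entries=1))  # 2
```

A single victim entry is enough to absorb this ping-pong conflict pattern, which is the kind of saving that larger or more associative level 1 caches would otherwise have to buy with extra area.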