Chih-Jen Yen, Mely Chen Chi, Danny Wen-Yaw Chung, Shing-Hao Lee
This work presents a single-channel analog processor IC for a wireless biosignal monitor. The chip occupies a small die area of 0.52 mm² and consumes only 0.75 mW from a 5-V supply. Wired and wireless systems built from the designed processor chip and commercial discrete ICs have been validated in this study. Experimental results indicate that the integrated single-chip processor system can amplify, filter, transmit, and receive a simulated ECG signal. Compared to the wired prototype system, wireless transmission is better suited to long-distance, long-term measurement, recording, and monitoring of the biosignal.
{"title":"A 0.75-mW analog processor IC for wireless biosignal monitor","authors":"Chih-Jen Yen, Mely Chen Chi, Danny Wen-Yaw Chung, Shing-Hao Lee","doi":"10.1145/871506.871616","DOIUrl":"https://doi.org/10.1145/871506.871616","url":null,"abstract":"This work presents a single-channel analog processor IC for the wireless biosignal monitor. This chip occupies a small die area of 0.52 mm/sup 2/ and has a low power consumption of 0.75 mW at a 5-V supply voltage. The wired and wireless systems constructed by using the designed processor chip and commercial discrete lCs have been validated in this study. Experimental results indicate that the integrated single-chip processor system can amplify, filter, transmit, and receive the simulated ECG signal. Compared to the wired prototype system, wireless transmission provides better long-distance, long-term measuring, recording, and monitoring the biosignal.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115829927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hyun Suk Kim, N. Vijaykrishnan, M. Kandemir, E. Brockmeyer, F. Catthoor, M. J. Irwin
An important problem in extracting maximum benefit from an SDRAM-based architecture is exploiting data locality at the page granularity. Frequent switches between data pages can increase memory latency and have an impact on energy consumption. In this paper, we propose a mathematical formulation, based on Presburger arithmetic and Ehrhart polynomials, to estimate the number of page breaks statically (i.e., at compile time). The results obtained using video codes indicate that the proposed framework can estimate the number of page breaks with good accuracy.
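To make the counted quantity concrete, the sketch below (not the authors' polyhedral framework; the single-bank open-page policy and the 2 KB page size are assumptions) counts page breaks dynamically from an address trace — the same quantity the paper estimates at compile time — and contrasts a row-major and a column-major sweep of the same array.

```python
PAGE_SIZE = 2048  # bytes per SDRAM row (page); illustrative value, not from the paper

def count_page_breaks(addresses, page_size=PAGE_SIZE):
    """Count row activations caused by leaving the currently open page."""
    breaks = 0
    open_page = None
    for addr in addresses:
        page = addr // page_size
        if page != open_page:      # access falls outside the open row
            breaks += 1            # precharge + activate: extra latency and energy
            open_page = page
    return breaks

# Row-major vs. column-major sweep of the same 64 x 512 array of 4-byte elements.
ROWS, COLS, ELEM = 64, 512, 4
row_major = [(r * COLS + c) * ELEM for r in range(ROWS) for c in range(COLS)]
col_major = [(r * COLS + c) * ELEM for c in range(COLS) for r in range(ROWS)]
print(count_page_breaks(row_major), count_page_breaks(col_major))   # 64 vs. 32768
```

The contrast between the two traversal orders is exactly the kind of data-layout effect the static estimator is meant to expose.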
{"title":"Estimating influence of data layout optimizations on SDRAM energy consumption","authors":"Hyun Suk Kim, N. Vijaykrishnan, M. Kandemir, E. Brockmeyer, F. Catthoor, M. J. Irwin","doi":"10.1145/871506.871520","DOIUrl":"https://doi.org/10.1145/871506.871520","url":null,"abstract":"An important problem in extracting maximum benefits from an SDRAM-based architecture is to exploit data locality at the page granularity. Frequent switches between data pages can increase memory latency and have an impact on energy consumption. In this paper, we propose a mathematical formulation, using Presburger arithmetic and Ehrhart polynomials to-estimate the number of page breaks statically (i.e., at compile time). The results obtained using video codes indicate that the proposed framework can estimate the number of page breaks with good accuracy.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"53 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121011681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
W. Choi, J. Lee, Byung-Gook Park
RODOS (reverse-order source/drain formation with double offset spacer) is proposed for low-power, high-speed, and low-noise amplifiers. Based on simulation data, we confirmed the feasibility of the RODOS process, which shows improved linearity (V_IP3). Additionally, by optimizing process parameters, we achieved small gate delay (CV/I) and low static/dynamic power consumption. The process satisfies most of the LOP and LSTP requirements in the ITRS 2002. Finally, we find that devices with the RODOS structure are a promising alternative for implementing low-power, high-speed, and low-noise amplifiers for radio on a chip.
{"title":"Reverse-order source/drain formation with double offset spacer (RODOS) for CMOS low-power, high-speed and low-noise amplifiers","authors":"W. Choi, J. Lee, Byung-Gook Park","doi":"10.1109/LPE.2003.1231860","DOIUrl":"https://doi.org/10.1109/LPE.2003.1231860","url":null,"abstract":"RODOS (Reverse-Order source/drain formation with Double Offset Spacer) was proposed for low-power, high-speed and low-noise amplifiers. Relying on simulation data, we confirmed the high feasibility of the RODOS process. It showed improved performance in linearity (V/sub IP3/). Additionally, by optimizing process parameters, we achieved small gate delay (CV/I) and low static/dynamic power consumption. The process satisfied most of the requirements of LOP and LSTP in ITRS 2002. Finally, we found that devices with the RODOS structure can be a promising alternative to implement low-power, high-speed and low-noise amplifiers for radio on a chip.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127577270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tiled architectures provide a paradigm for designers to turn silicon resources into processors with burgeoning quantities of programmable functional units and memories. The architecture has a dual responsibility: first, it must expose these resources in a programmable way; second, it must manage the power associated with them. We present the power management facilities of the 16-tile Raw microprocessor. This design selectively turns on and off 48 SRAM macros, 96 functional-unit clusters, 32 fetch units, and over 250 unique processor pipeline stages, all according to the needs of the computation and environment at hand.
{"title":"Energy characterization of a tiled architecture processor with on-chip networks","authors":"J. Kim, M. Taylor, Jason E. Miller, D. Wentzlaff","doi":"10.1145/871506.871610","DOIUrl":"https://doi.org/10.1145/871506.871610","url":null,"abstract":"Tiled architectures provide a paradigm for designers to turn silicon resources into processors with burgeoning quantities of programmable functional units and memories. The architecture has a dual responsibility: first, it must expose these resources in a way that is programmable. Second, it needs to manage the power associated with such resources. We present the power management facilities of the 16-tile Raw microprocessor. This design selectively turns on and off 48 SRAM macros, 96 functional unit clusters, 32 fetch units,and over 250 unique processor pipeline, stages, all according to the needs of the computation and environment at hand.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124922728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Presented in this paper is a low-power architecture for turbo decoding of parallel concatenated convolutional codes. The proposed architecture is derived via the concept of block-interleaved computation followed by folding, retiming, and voltage scaling. Block-interleaved computation can be applied to any data processing unit that operates on data blocks and satisfies the following three properties: 1) computations between blocks are independent; 2) a block can be segmented into computationally independent sub-blocks; and 3) computation within a sub-block is recursive. The application of block-interleaved computation, folding, and retiming reduces the critical path delay in the add-compare-select (ACS) kernel of MAP decoders by 50%-84% with an area overhead of 14%-70%. Subsequent application of voltage scaling results in up to 65% savings in power for a block-interleaving depth of 6. Experimental results obtained by transistor-level timing and power analysis tools demonstrate power savings of 20%-44% for a block-interleaving depth of 2 in a 0.25-µm CMOS process.
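The toy sketch below (a software analogy, not the authors' VLSI architecture; the metric update is a stand-in for the real ACS recursion) illustrates block-interleaved computation: two independent sub-block recursions are processed in an interleaved order, which is what gives the hardware the slack to fold, retime, and then voltage-scale the ACS kernel.

```python
def recursive_metric(block, state=0.0):
    """Toy stand-in for one sub-block's forward recursion (ACS-like update)."""
    out = []
    for x in block:
        state = max(state + x, x)   # add, compare, select
        out.append(state)
    return out

def block_interleaved(block_a, block_b):
    """Process two independent sub-blocks in interleaved order.

    In hardware, consecutive clock cycles belong to different sub-blocks,
    so each recursion only has to close its loop every other cycle -- the
    slack that folding and retiming convert into a shorter critical path.
    """
    state_a = state_b = 0.0
    out_a, out_b = [], []
    for xa, xb in zip(block_a, block_b):
        state_a = max(state_a + xa, xa); out_a.append(state_a)   # "even" cycle
        state_b = max(state_b + xb, xb); out_b.append(state_b)   # "odd" cycle
    return out_a, out_b

a, b = [0.1, -0.4, 0.7], [0.3, 0.2, -0.5]
assert block_interleaved(a, b) == (recursive_metric(a), recursive_metric(b))
```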
{"title":"A low-power VLSI architecture for turbo decoding","authors":"Seok-Jun Lee, Naresh R Shanbhag, A. Singer","doi":"10.1145/871506.871599","DOIUrl":"https://doi.org/10.1145/871506.871599","url":null,"abstract":"Presented in this paper is a low-power architecture for turbo decoding of parallel concatenated convolutional codes. The proposed architecture is derived via the concept of block-interleaved computation followed by folding, retiming and voltage scaling. Block-interleaved computation can be applied to any data processing unit that operates on data blocks and satisfies the following three properties: 1) computation between blocks are independent; 2) a block can be segmented into computationally independent sub-blocks; and 3) computation within a sub-block is recursive. The application of block-interleaved computation, folding and retiming reduces the critical path delay in the add-compare-select (ACS) kernel of MAP decoders by 50%-84% with an area overhead of 14%-70%. Subsequent application of voltage scaling results in up to 65% savings in power for a block-interleaving depth of 6. Experimental results obtained by transistor-level timing and power analysis tools demonstrate power savings of 20%-44% for a block-interleaving depth of 2 in a 0.25 /spl mu/m CMOS process.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"261 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131749197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The increasingly constrained power budget of today's microprocessors has resulted in a situation where the power savings of all components in a system have to be taken into consideration. The operating system (OS) is a major power consumer during the execution of many modern applications. This paper advocates a routine-based, OS-aware microprocessor resource adaptation mechanism targeting run-time OS power savings. Simulation results show that, compared with existing sampling-based adaptation schemes, this methodology yields a more attractive power/performance trade-off for OS execution. To our knowledge, this paper is the first to address the power savings of the OS itself, an increasingly important area that has been largely overlooked in previous studies.
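As a rough illustration of the routine-based idea (the routine names and resource knobs below are hypothetical, not taken from the paper), resources can be reconfigured immediately on entry to a known OS routine rather than waiting for a sampling interval to observe the change in behavior.

```python
# Per-routine resource profile: (active issue width, enabled L1 ways).
# Both the routines and the values are illustrative assumptions.
ROUTINE_PROFILES = {
    "page_fault_handler":   (2, 2),
    "network_rx_interrupt": (2, 4),
    "scheduler":            (1, 2),
}
DEFAULT_PROFILE = (4, 8)   # full-width configuration for user code

def configure_resources(issue_width, l1_ways):
    # Stand-in for writing the adaptation control registers.
    print(f"issue width = {issue_width}, L1 ways enabled = {l1_ways}")

def on_os_routine_entry(routine):
    configure_resources(*ROUTINE_PROFILES.get(routine, DEFAULT_PROFILE))

def on_os_routine_exit():
    configure_resources(*DEFAULT_PROFILE)

on_os_routine_entry("scheduler")   # OS routine runs in a narrow, low-power configuration
on_os_routine_exit()               # user code resumes at full width
```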
{"title":"Routine based OS-aware microprocessor resource adaptation for run-time operating system power saving","authors":"Tao Li, L. John","doi":"10.1145/871506.871565","DOIUrl":"https://doi.org/10.1145/871506.871565","url":null,"abstract":"The increasingly constrained power budget of today's microprocessor has resulted in a situation where power savings of all components in a system have to be taken into consideration. The operating system (OS) is a major power consumer in many modem applications execution. This paper advocates a routine based OS-aware microprocessor resource adaptation mechanism targeting run-time OS power savings. Simulation results show that compared with the existing sampling-based adaptation schemes, this novel methodology yields more attractive power and performance trade-off on the OS execution. To our knowledge, this paper is the first to address the power saving issue of the OS itself, an increasingly important area that has been largely overlooked in the previous studies.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130441533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper we introduce a low-leakage, standard-cell-based ASIC design methodology built on modified standard cells. These cells are designed to consume extremely low and predictable leakage currents in standby mode. For each cell in a standard cell library, we design two low-leakage variants of the cell. If the inputs of a cell during the standby mode of operation are such that the output has a high value, we minimize the leakage in the pull-down network, and vice versa. While technology mapping a circuit, we determine the particular variant to utilize in each instance so as to minimize the leakage of the final mapped design. We have designed and laid out our modified standard cells, and have performed experiments comparing the placed-and-routed area, leakage, and delay of our method against MTCMOS and a straightforward ASIC flow. Each design style we compare utilizes the same base standard cell library. Our results show that designs obtained using our methodology have better speed and area characteristics than designs implemented in MTCMOS. The exact leakage current of MTCMOS is highly unpredictable, whereas our method exhibits leakage currents that can be estimated precisely. The leakage current of HL designs can be dramatically lower than the worst-case leakage of MTCMOS-based designs, and up to two orders of magnitude lower than that of traditional standard cells. Also, a design implemented in MTCMOS requires separate power and ground supplies for latches and combinational logic, whereas our methodology does away with such a requirement.
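A minimal sketch of the variant-selection step during technology mapping is given below (the cell and variant names are illustrative, not the authors' library or tool flow): the known standby input vector determines the cell's output value, which in turn selects the variant whose non-conducting network carries the leakage-minimized stack.

```python
def choose_variant(cell_name, cell_function, standby_inputs):
    """Pick the low-leakage variant for one mapped cell instance."""
    if cell_function(*standby_inputs):
        # Output is high in standby, so the pull-down network is off:
        # use the variant with the leakage-minimized pull-down stack.
        return f"{cell_name}_LOWLEAK_PD"
    # Output is low in standby, so the pull-up network is off:
    # use the variant with the leakage-minimized pull-up stack.
    return f"{cell_name}_LOWLEAK_PU"

# Example: a 2-input NAND whose inputs are both driven high in standby mode.
nand2 = lambda a, b: not (a and b)
print(choose_variant("NAND2", nand2, (True, True)))   # -> NAND2_LOWLEAK_PU
```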
{"title":"An ASIC design methodology with predictably low leakage, using leakage-immune standard cells","authors":"N. Jayakumar, S. Khatri","doi":"10.1145/871506.871539","DOIUrl":"https://doi.org/10.1145/871506.871539","url":null,"abstract":"In this paper we introduce a low-leakage standard cell based ASIC design methodology which is based on the use of modified standard cells. These cells are designed to consume extremely low and predictable leakage currents in standby mode. For each cell in a standard cell library, we design two low-leakage variants of the cell. If the inputs of a cell during the standby mode of operation are such that the output has a high value, we minimize the leakage in the pull-down network, and vice versa. While technology mapping a circuit, we determine the particular variant to utilize in each instance, so as to minimize leakage of the final mapped design. We have designed and laid out our modified standard cells, and have performed experiments to compare placed-and-routed area, leakage and delays of our method against MTCMOS and a straightforward ASIC flow. Each design style we compare utilizes the same base standard cell library. Our results show that designs obtained using our methodology have better speed and area characteristics than designs implemented in MTCMOS. The exact leakage current obtained for MTCMOS is highly unpredictable, while our method exhibits leakage currents which are precisely estimable. The leakage current for HL designs can be dramatically lower than the worst-case leakage of MTCMOS based designs, and two orders of magnitude compared to traditional standard cells. Also, a design implemented in MTCMOS would require the use of separate power and ground supplies for latches and combinational logic, while our methodology does away with such a requirement.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130473669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Keunwoo Kim, R. Joshi, C. Chuang
Static and dynamic power for strained-Si devices are analyzed and compared with conventional bulk-Si technology. Optimum device design points are suggested by controlling physical/structural device parameters. Strained-Si CMOS circuits are studied, showing substantially reduced power consumption due to the unique advantages of strained-Si devices. The trade-off between power and performance in strained-Si devices/circuits is discussed. Further, the analysis and low-power design points are applied and extended to strained-Si-on-SOI-substrate (SSOI) CMOS technology.
{"title":"Strained-Si devices and circuits for low-power applications","authors":"Keunwoo Kim, R. Joshi, C. Chuang","doi":"10.1109/LPE.2003.1231858","DOIUrl":"https://doi.org/10.1109/LPE.2003.1231858","url":null,"abstract":"Static and dynamic power for strained-Si devices are analyzed and compared with conventional bulk-Si technology. Optimum device design points are suggested by controlling physical/structural device parameters. Strained-Si CMOS circuits are studied, showing substantially-reduced power consumption due to the unique advantageous features of strained-Si devices. The trade-off between power and performance in strained-Si devices/circuits is discussed. Further, analysis and low-power design points are applied and extended to strained Si on SOI substrate (SSOI) CMOS technology.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126272692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jie S. Hu, A. Nadgir, N. Vijaykrishnan, M. J. Irwin, M. Kandemir
Leakage energy optimization for caches has been the target of much recent effort. In this work, we focus on instruction caches and tailor two techniques that exploit the two major factors shaping instruction access behavior, namely, hotspot execution and sequentiality. First, we adopt a hotspot detection mechanism that profiles branch behavior at runtime and use it to implement a HotSpot-based Leakage Management (HSLM) mechanism. Second, we exploit code sequentiality in implementing Just-In-Time Activation (JITA), which transitions cache lines to active mode just before they are accessed. We utilize the recently proposed drowsy cache, which dynamically scales voltages for leakage reduction, and implement various schemes that use different combinations of HSLM and JITA. Our experimental evaluation using the SPEC2000 benchmark suite shows that instruction cache leakage energy consumption can be reduced by 63%, 49%, and 29% on average compared to an unoptimized cache, a recently proposed hardware-optimized cache, and a compiler-optimized cache, respectively. Further, we observe that these energy savings can be obtained without a significant impact on performance.
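The simplified sketch below (hypothetical structures, not the paper's implementation) shows the interplay of the two mechanisms: JITA wakes the next sequential line ahead of the fetch, and an HSLM-style sweep periodically returns everything outside the detected hotspot to drowsy mode.

```python
LINES = 256                      # number of I-cache lines; illustrative
ACTIVE, DROWSY = 1, 0

class DrowsyICache:
    def __init__(self):
        self.mode = [DROWSY] * LINES
        self.wakeups = 0         # each drowsy->active transition costs a cycle

    def fetch(self, line):
        if self.mode[line] == DROWSY:
            self.wakeups += 1    # line was not woken in time: pay the penalty
            self.mode[line] = ACTIVE
        # JITA: activate the next sequential line now, so straight-line
        # fetch finds it already awake.
        self.mode[(line + 1) % LINES] = ACTIVE

    def hotspot_sweep(self, hotspot_lines):
        # HSLM-style sweep: put every line outside the detected hotspot
        # back into drowsy (low-voltage, state-preserving) mode.
        for i in range(LINES):
            if i not in hotspot_lines:
                self.mode[i] = DROWSY

cache = DrowsyICache()
for line in range(8):            # straight-line code
    cache.fetch(line)
print(cache.wakeups)             # -> 1: only the first line pays a wakeup
```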
{"title":"Exploiting program hotspots and code sequentiality for instruction cache leakage management","authors":"Jie S. Hu, A. Nadgir, N. Vijaykrishnan, M. J. Irwin, M. Kandemir","doi":"10.1145/871506.871606","DOIUrl":"https://doi.org/10.1145/871506.871606","url":null,"abstract":"Leakage energy optimization for caches has been the target of much recent effort. In this work, we focus on instruction caches and tailor two techniques that exploit the two major factors that shape the instruction access behavior, namely, hotspot execution and sequentiality. First, we adopt a hotspot detection mechanism by profiling the branch behavior at runtime and utilize this to implement a HotSpot based Leakage Management (HSLM) mechanism. Second, we exploit code sequentiality in implementing a Just-InTime Activation (JITA) that transitions cache lines to active mode just before they are accessed.,We utilize the recently proposed drowsy cache that dynamically scales voltages for leakage reduction and implement various schemes that use different combinations of HSLM and JITA. Our experimental evaluation using the SPEC2000 benchmark suite shows that instruction cache leakage energy consumption can be reduced by 63%, 49% and 29%; on the average, as compared to an unoptimized cache, a recently proposed hardware optimized cache, and a cache optimized using compiler, respectively. Further, we observe that these energy savings can be obtained without a significant impact on performance.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128147186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Memik, Glenn D. Reinman, W. Mangione-Smith
In this paper, we investigate methods for improving the hit rates in the first level of the memory hierarchy. In particular, we propose victim cache structures to reduce the number of accesses to more power-consuming structures such as level 2 caches. We compare the proposed victim cache techniques to increasing the associativity or the size of the level 1 data cache and show that the enhanced victim cache technique yields better energy-delay and energy-delay-area products. We also propose techniques that predict the hit/miss behavior of victim cache accesses and bypass the victim cache when a miss can be determined quickly. We report simulation results obtained from SimpleScalar/ARM modeling a representative network processor architecture. The simulations show that the victim cache reduces energy consumption by as much as 17.6% (8.6% on average) while reducing execution time by as much as 8.4% (3.7% on average) for a set of representative applications.
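The rough functional sketch below (the structures and the never-bypass predictor are illustrative assumptions, not the paper's hardware) shows how a small victim cache absorbs L1 conflict misses, and where a bypass prediction slots in to skip the victim-cache probe when a miss there is likely, so the L2 access can start immediately.

```python
from collections import OrderedDict

L1_SETS, VC_ENTRIES = 64, 8      # illustrative sizes

def access(block, l1, victim, bypass_predictor):
    """Return which level served the access: 'L1', 'VC', or 'L2'."""
    idx = block % L1_SETS
    if l1.get(idx) == block:
        return "L1"
    evicted, l1[idx] = l1.get(idx), block
    if evicted is not None:                      # L1 victim moves into the VC
        victim[evicted] = True
        if len(victim) > VC_ENTRIES:
            victim.popitem(last=False)           # drop the oldest victim
    if bypass_predictor(block):                  # predicted VC miss:
        return "L2"                              # skip the probe, start L2 now
    if victim.pop(block, None) is not None:
        return "VC"                              # hit in the VC: cheap refill
    return "L2"

l1, victim = {}, OrderedDict()
never_bypass = lambda block: False               # always probe the victim cache
trace = [0, 64, 0, 64, 0]                        # blocks 0 and 64 conflict in L1
print([access(b, l1, victim, never_bypass) for b in trace])
# -> ['L2', 'L2', 'VC', 'VC', 'VC']
```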
{"title":"Reducing energy and delay using efficient victim caches","authors":"G. Memik, Glenn D. Reinman, W. Mangione-Smith","doi":"10.1109/LPE.2003.1231873","DOIUrl":"https://doi.org/10.1109/LPE.2003.1231873","url":null,"abstract":"In this paper, we investigate methods for improving the hit rates in the first level of memory hierarchy. Particularly, we propose victim cache structures to reduce the number of accesses to more power consuming structures such as level 2 caches. We compare the proposed victim cache techniques to increasing the associativity or the size of the level I data cache and show that the enhanced victim cache technique yield better energy-delay and energy-delay-area products. We also propose techniques that predict the hit/miss behavior of the victim cache accesses and bypass the victim cache when a miss can be determined quickly. We report simulation results obtained from SimpleScalar/ARM modeling a representative Network Processor architecture. The simulations show that the victim cache is able to reduce the energy consumption by as much as 17.6% (8.6% on average) while reducing the execution time by as much as 8.4% (3.7% on average) for a set of representative applications.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123656339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}