Pub Date: 2011-08-01 DOI: 10.1109/ISLPED.2011.5993597
Ming-Hung Chang, Chung-Ying Hsieh, Mei-Wei Chen, W. Hwang
A near-/sub-threshold programmable clock generator is proposed in this paper. The major challenge for ultra-low voltage (ULV) circuits is that the lock-in range of the delay line is easily affected by environmental variations. The proposed clock generator includes a PVT compensation unit, which consists of a set of delay lines and a PVT detector. The unit adjusts the lock-in range of the clock generator to guarantee successful clock lock. In addition, variation-aware logic design is applied in the clock generator, which improves its robustness against process variation. The adoption of a pulse-circulating scheme also suppresses process-induced output clock jitter. Furthermore, the generator can produce an output clock with a frequency from 1/8 to 4 times that of the reference clock. The clock generator has been designed in UMC 65 nm CMOS technology. The reference clock frequencies are 625 kHz at 0.2 V and 5 MHz at 0.5 V. The power consumption is 0.18 μW at 0.2 V and 5.17 μW at 0.5 V. The core area of the clock generator is 0.01 mm².
Title: Near-/sub-threshold DLL-based clock generator with PVT-aware locking range compensation. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
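The figures quoted in the clock generator abstract above can be turned into a quick back-of-the-envelope check. The sketch below computes the output frequency span (1/8 to 4 times the reference) and a derived energy per reference-clock period. The energy figure is not stated in the abstract; it assumes the quoted power applies while the generator runs at the quoted reference frequency. All function and constant names are illustrative.

```python
# Figures quoted in the abstract: (supply voltage in V, reference clock in Hz, power in W).
OPERATING_POINTS = [
    (0.2, 625e3, 0.18e-6),
    (0.5, 5e6, 5.17e-6),
]

# The abstract states that the output clock spans 1/8x to 4x of the reference.
MIN_RATIO = 1 / 8
MAX_RATIO = 4


def output_range_hz(f_ref_hz):
    """Return the (lowest, highest) programmable output frequency in Hz."""
    return f_ref_hz * MIN_RATIO, f_ref_hz * MAX_RATIO


def energy_per_ref_cycle_j(power_w, f_ref_hz):
    """Derived figure (an assumption, not from the abstract): joules per reference period."""
    return power_w / f_ref_hz


for vdd, f_ref, power in OPERATING_POINTS:
    low, high = output_range_hz(f_ref)
    energy = energy_per_ref_cycle_j(power, f_ref)
    print(f"{vdd} V: output {low / 1e3:.3f} kHz to {high / 1e6:.1f} MHz, "
          f"{energy * 1e12:.3f} pJ per reference cycle")
```

Under that assumption, the two operating points work out to roughly 0.29 pJ and 1.03 pJ per reference cycle.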
Conventional cache tag matching uses addresses to identify the correct data in caches. However, this tagging scheme is inefficient because the tags are unnecessarily large. We observe that there are few unique tag values because working sets are typically small, and these working sets are conventionally captured by TLBs. To exploit this fact, we propose a TLB index-based cache tagging scheme. The new scheme reduces the required number of tag bits to one-fourth of that of the conventional scheme. The reduced tag bits decrease the tag array area by 72% and its energy consumption by 58%. In our experiments, the proposed tagging scheme reduces instruction cache energy consumption by 13% for embedded systems.
Title: TLB index-based tagging for cache energy reduction. Authors: Jongmin Lee, Seokin Hong, Soontae Kim. Pub Date: 2011-08-01. DOI: 10.5555/2016802.2016828. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
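The idea in the TLB index-based tagging abstract above can be illustrated with a toy model: instead of storing the full page number as the cache tag, each line stores the number of the TLB slot that holds that page. The sketch below is a minimal illustration and not the authors' design. The 32-bit addresses, 4 KiB pages, 32-entry TLB, direct-mapped 128-set cache, and round-robin replacement are all assumptions, chosen so that the tag shrinks from 20 bits to 5 bits. The class and method names are hypothetical.

```python
ADDRESS_BITS = 32   # assumed address width
PAGE_BITS = 12      # assumed 4 KiB pages
LINE_BITS = 5       # assumed 32-byte cache lines
SET_BITS = 7        # assumed 128 sets, direct-mapped
TLB_ENTRIES = 32    # assumed TLB size

CONVENTIONAL_TAG_BITS = ADDRESS_BITS - LINE_BITS - SET_BITS   # 20 bits
TLB_INDEX_TAG_BITS = (TLB_ENTRIES - 1).bit_length()           # 5 bits


class TlbIndexTaggedCache:
    """Toy direct-mapped cache whose tags are TLB slot numbers."""

    def __init__(self):
        self.tlb = [None] * TLB_ENTRIES        # slot -> virtual page number
        self.next_victim = 0                   # round-robin replacement pointer
        self.tags = [None] * (1 << SET_BITS)   # set -> TLB slot number

    def _tlb_slot(self, vpn):
        """Return the TLB slot holding vpn, installing it on a TLB miss."""
        if vpn in self.tlb:
            return self.tlb.index(vpn)
        slot = self.next_victim
        self.next_victim = (slot + 1) % TLB_ENTRIES
        # Lines tagged with the recycled slot would alias the new page, so drop them.
        self.tags = [None if tag == slot else tag for tag in self.tags]
        self.tlb[slot] = vpn
        return slot

    def access(self, addr):
        """Return True on a cache hit; on a miss, fill the line and return False."""
        vpn = addr >> PAGE_BITS
        set_index = (addr >> LINE_BITS) & ((1 << SET_BITS) - 1)
        slot = self._tlb_slot(vpn)
        hit = self.tags[set_index] == slot
        self.tags[set_index] = slot
        return hit


cache = TlbIndexTaggedCache()
print(CONVENTIONAL_TAG_BITS, TLB_INDEX_TAG_BITS)
print(cache.access(0x1000), cache.access(0x1004))
```

One cost of the shorter tag shows up in `_tlb_slot`: when a TLB slot is recycled, every cache line tagged with that slot has to be invalidated. How the paper handles this case is not described in the abstract.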
Pub Date: 2011-08-01 DOI: 10.1109/ISLPED.2011.5993624
Lei Jiang, Youtao Zhang, Jun Yang
Phase Change Memory (PCM) has recently emerged as a promising memory technology. However, it suffers from limited write endurance. Recent studies have shown that the lifetime of PCM cells depends heavily on the RESET energy. Typically, a larger-than-optimal RESET current is employed to accommodate process variation. This leads to over-programming of cells and a dramatically shortened lifetime. This paper proposes two low-power techniques, Fine-Grained Current Regulation (FGCR) and Voltage Upscaling (VU), to cut down the RESET current, leaving a small number of difficult-to-reset cells unused. We then use error correction codes to rescue those cells. Our experimental results show that FGCR and VU reduce the PCM write power by 33% and prolong the lifetime of a PCM chip by 71%–102%.
Title: Enhancing phase change memory lifetime through fine-grained current regulation and voltage upscaling. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
Pub Date: 2011-08-01 DOI: 10.1109/ISLPED.2011.5993631
Yi-Wei Chiu, Jihi-Yu Lin, Ming-Hsien Tu, S. Jou, C. Chuang
This paper presents a new 8T SRAM cell with a data-aware cross-point Write operation and a series-connected Read buffer for low-power, low-voltage operation. The cell features a shared footer device to control the VGND for the cell pass-gate (Write) transistors and the Read buffer. The row-based VGND control and the column-based data-aware Write Word-Line form a cross-point Write structure, thus eliminating Write Half-Select Disturb and facilitating a bit-interleaving architecture. A replica-based timing tracking circuit controls the pulse width of the Word-Line Enable (WLE) signal to overcome the large timing variation at low voltage and to reduce the Word-Line active power consumption. A 4 Kbit SRAM test chip implemented in 90 nm HVT CMOS technology operates at 120 MHz at 0.6 V and at 6 MHz at 0.38 V, with a measured power consumption of 2.99 μW at 6 MHz and 0.38 V.
Title: 8T Single-ended sub-threshold SRAM with cross-point data-aware write operation. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
Pub Date: 2011-08-01 DOI: 10.1109/ISLPED.2011.5993633
Bushra Ahsan, Lorena Ndreu, I. Sideris, Yiannakis Sazeides, Sachin Idgunji, E. Özer
This work proposes to reduce energy by avoiding access to columns of on-chip SRAM arrays whose cell contents are all 1s or all 0s. We refer to such a column, which arises dynamically, as a Same-Cell-Content-Column (SCC-column). Analysis reveals that SCC-columns occur frequently in several processor arrays, such as the tag arrays of L1 caches, TLBs and predictors. An interval-based scheme that employs one bit per column is proposed to track whether a column is an SCC-column. We explain how an SCC-column can be leveraged to reduce the energy needed for SRAM read and write accesses. Experimental analysis for a specific processor configuration reveals that the proposed scheme detects SCC-columns effectively. The potential energy savings of the proposed approach at 32 nm often exceed 40% for several processor arrays.
Title: Eliminating energy of same-content-cell-columns of on-chip SRAM arrays. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
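The SCC-column definition in the abstract above (a column whose cells are all 1s or all 0s) is easy to state in code. The sketch below is an offline check over a snapshot of the array and is not the interval-based, one-bit-per-column hardware tracker that the paper proposes. Representing each row as an integer bit pattern, and the function name itself, are assumptions made for illustration.

```python
def scc_columns(rows, width):
    """Find same-cell-content columns in a snapshot of an SRAM array.

    rows  -- list of ints, each holding one row of `width` bits
    width -- number of columns (bits per row)

    Returns (all_ones_mask, all_zeros_mask); bit i of a mask is set when
    column i holds the same value in every row.
    """
    mask = (1 << width) - 1
    all_ones = mask   # a column stays set only if every row has a 1 there
    any_ones = 0      # a column becomes set as soon as any row has a 1 there
    for row in rows:
        all_ones &= row
        any_ones |= row
    all_zeros = ~any_ones & mask
    return all_ones, all_zeros


snapshot = [0b1010, 0b1000, 0b1011]
ones, zeros = scc_columns(snapshot, width=4)
print(f"all-1 columns: {ones:04b}, all-0 columns: {zeros:04b}")
```

In this snapshot, column 3 is all 1s and column 2 is all 0s, so accesses to those two columns are the ones such a scheme could avoid.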
Pub Date: 2011-08-01 DOI: 10.1109/ISLPED.2011.5993672
Junyoung Park, J. Abraham
This paper introduces a design scheme that improves the Energy-Delay Product (EDP) of conventional Dynamic Voltage Scaling (DVS) systems by exploiting timing margins. To realize this scheme, we designed a high-speed Critical Path Monitor composed of several Critical Path Replicas, a Timing Checker, and a Toggle Flip-Flop. The replicas are implemented based on our proposed algorithm, which considers two facts: (a) the voltage scaling behaviors of logic and interconnect are fundamentally different; (b) different logic gates show different sensitivities to process, temperature, and voltage changes. Because the replicas are connected in parallel by C-elements, the longest delay among all of the replica delays is selected automatically, improving the system response time. If a usable margin is detected by the Timing Checker, the frequency controller increases the system clock frequency to improve performance at a given voltage level. Using a 45 nm CMOS technology, we implemented a 32-bit MIPS processor and multiple Critical Path Monitors. The simulation results reveal that our scheme can improve the EDP of conventional DVS by up to 62%.
Title: A fast, accurate and simple critical path monitor for improving energy-delay product in DVS systems. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
Pub Date: 2011-08-01 DOI: 10.1109/ISLPED.2011.5993675
Vaibhav Gupta, Debabrata Mohapatra, S. P. Park, A. Raghunathan, K. Roy
Low power consumption is an essential requirement for portable multimedia devices employing various signal processing algorithms and architectures. In most multimedia applications, the final output is interpreted by human senses, which are not perfect. This fact obviates the need to produce exactly correct numerical outputs. Previous research in this context exploits error resiliency primarily through voltage over-scaling, using algorithmic and architectural techniques to mitigate the resulting errors. In this paper, we propose logic complexity reduction as an alternative approach to take advantage of the relaxation of numerical accuracy. We demonstrate this concept by proposing various imprecise or approximate Full Adder (FA) cells with reduced complexity at the transistor level, and use them to design approximate multi-bit adders. In addition to the inherent reduction in switched capacitance, our techniques result in significantly shorter critical paths, enabling voltage scaling. We design architectures for video and image compression algorithms using the proposed approximate arithmetic units, and evaluate them to demonstrate the efficacy of our approach. Post-layout simulations indicate power savings of up to 60% and area savings of up to 37% with an insignificant loss in output quality, compared to existing implementations.
Title: IMPACT: IMPrecise adders for low-power approximate computing. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
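To give a feel for the accuracy trade-off described in the approximate adder abstract above, the sketch below models a different and simpler scheme than the paper's: a lower-part OR adder, in which the least significant bits are combined with a bitwise OR and only the upper bits are added exactly. The paper's transistor-level Full Adder cells are not reproduced here. The function name, the 8-bit width, and the 3 approximated bits are illustrative assumptions.

```python
def loa_add(a, b, width=8, approx_bits=3):
    """Lower-part OR adder: approximate the low bits, add the high bits exactly.

    The low `approx_bits` bits of the result are simply a OR b, which needs no
    carry chain. The carry into the exact upper part is estimated as the AND
    of the most significant bits of the two lower parts.
    """
    low_mask = (1 << approx_bits) - 1
    full_mask = (1 << width) - 1
    low = (a | b) & low_mask
    if approx_bits:
        msb = approx_bits - 1
        carry_in = (a >> msb) & (b >> msb) & 1
    else:
        carry_in = 0
    high = (a >> approx_bits) + (b >> approx_bits) + carry_in
    return ((high << approx_bits) | low) & full_mask


for a, b in [(5, 3), (12, 4), (16, 32)]:
    approx = loa_add(a, b)
    print(f"{a} + {b}: exact {a + b}, approximate {approx}, error {approx - (a + b)}")
```

Setting `approx_bits=0` recovers an exact adder (modulo the word width), which makes it easy to sweep the parameter and see how the error grows as more of the carry chain is removed.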
Pub Date: 2011-08-01 DOI: 10.1109/ISLPED.2011.5993627
Daeyeon Kim, V. Chandra, R. Aitken, D. Blaauw, D. Sylvester
As process technology scales, SRAM robustness is compromised. In addition, lowering the supply voltage to reduce power consumption further reduces the read and write margins. To maintain robustness, a new bitcell topology, the 8-T bitcell, has been proposed, in which the read and write operations can be optimized separately. However, it can aggravate the half-select disturb when write word-line boosting is applied or when the bitcell is sized to enable robust writability. The half-select disturb issue limits the use of the bit-interleaved array configuration required for immunity to soft errors. The opposing requirements of the write operation and half-select disturb create a new constraint that should be carefully considered for robust operation of voltage-scaled bit-interleaved 8-T SRAMs. In this paper, we propose a bit-interleaved writability analysis that captures the double-sided constraints placed on the word-line pulse width and voltage level to ensure writability while avoiding the half-select disturb issue. Using the proposed analysis, we investigate the effectiveness of word-line boosting and device sizing optimization in improving bitcell robustness in the low-voltage region. With 57.7% area overhead and 0.1 V of word-line boosting, we can achieve 4.6σ of VTH mismatch tolerance at 0.6 V, with a 41% energy saving.
Title: Variation-aware static and dynamic writability analysis for voltage-scaled bit-interleaved 8-T SRAMs. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
Pub Date: 2011-08-01 DOI: 10.1109/ISLPED.2011.5993601
Hong-Ting Lin, Yi-Lin Chuang, Tsung-Yi Ho
Minimizing the clock tree is known to be an effective approach to reducing power dissipation in modern circuit designs. However, most existing power-aware clock tree synthesis algorithms still focus on optimizing power in flip-flops, which may yield limited power savings. In this work, we explore the use of pulsed latches in clock tree synthesis for further power savings. To our knowledge, this is the first work in the literature to propose a synthesis algorithm that efficiently migrates a flip-flop-based clock tree into a pulsed-latch-based one. To maintain the performance of the clock tree while simultaneously considering load balance (skew issues), we determine the clock tree topology using a minimum-cost maximum-flow network. Experimental results show that our algorithm can further reduce power consumption by 22% on average compared to approaches without pulsed latches. Categories and Subject Descriptors: B.7.2 [Integrated Circuits]: Design Aids. General Terms: Algorithms, Design.
Title: Pulsed-latch-based clock tree migration for dynamic power reduction. Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.
Pub Date: 2011-08-01 DOI: 10.1109/ISLPED.2011.5993621
David Meisner, T. Wenisch
Data center efficiency has quickly become a first-class design goal. In response, many studies have emerged from the academic community and industry using low-power design to help improve the energy efficiency of server hardware. Generally, these proposals hold the assumption that low-power design is inherently better for energy efficiency; this preconception stems mostly from great success in the mobile space with building low-power, energy-efficient systems. We observe that unlike mobile devices, constraining a data center server to a low power budget is arbitrary and higher power design choices can be more energy efficient. We analyze the energy efficiency design space of past commercial server designs and find that high-power servers are generally more energy efficient than low-power ones. Furthermore, we evaluate building low- or high-power server clusters and find that the small increase in the cost of cooling high-powered servers is easily outweighed by their greater efficiency.
Title: Does low-power design imply energy efficiency for data centers? Venue: IEEE/ACM International Symposium on Low Power Electronics and Design.