Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993636
J. Woo, Taehoon Kim, Hyongmin Lee, Sunkwon Kim, Hyunjoong Lee, Suhwan Kim
In this paper, we describe the first cyclic ADC to adopt the comparator-based switched-capacitor (CBSC) technique, which compensates for the effects of technology scaling and reduces power consumption by eliminating the need for high-gain opamps. A boosted preset voltage is also introduced to improve the conversion rate without consuming more power. The ADC operates at 2.5 MS/s; near the Nyquist rate, a prototype achieves a signal-to-noise-and-distortion ratio (SNDR) of 55.99 dB and a spurious-free dynamic range (SFDR) of 66.85 dB. The chip was fabricated in 0.18 μm CMOS, has an active area of 0.146 mm², and consumes 0.74 mW from a 1.8 V supply.
Title: A comparator-based cyclic analog-to-digital converter with boosted preset voltage
Venue: IEEE/ACM International Symposium on Low Power Electronics and Design
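The figures quoted in the abstract above can be turned into two common ADC summary metrics. The sketch below is not taken from the paper: it assumes the standard effective-number-of-bits relation, ENOB = (SNDR - 1.76) / 6.02, and the widely used Walden figure of merit, P / (2^ENOB * fs). The function names are hypothetical.

```python
def enob(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w: float, sample_rate_hz: float, sndr_db: float) -> float:
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob(sndr_db) * sample_rate_hz)


# Values quoted in the abstract: 55.99 dB SNDR, 0.74 mW, 2.5 MS/s.
bits = enob(55.99)
fom = walden_fom(0.74e-3, 2.5e6, 55.99)

print(f"ENOB = {bits:.2f} bits")
print(f"FoM  = {fom * 1e12:.2f} pJ/conversion-step")
```

With the abstract's numbers this gives roughly 9 effective bits and a figure of merit somewhat below 1 pJ per conversion step; the paper itself may report these metrics under different conditions.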
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993602
Saro Meguerdichian, M. Potkonjak
Hardware-based physically unclonable functions (PUFs) leverage the intrinsic process variation of modern integrated circuits to provide security solutions, but they either impose high storage requirements or demand significant resources from at least one of the parties involved. We use device aging to realize two identical unclonable modules that cannot be matched by any third such module. Each device enables rapid, low-energy computation of functions that are too complex to simulate in any reasonable time. The approach incurs negligible area and energy costs and allows a majority of security protocols to complete in one or a few clock cycles.
Title: Matched public PUF: Ultra low energy security platform
Venue: IEEE/ACM International Symposium on Low Power Electronics and Design
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993625
Chin-Hung Lin, Ing-Chao Lin, Kuan-Hui Li
Negative bias temperature instability (NBTI), which degrades the switching speed of PMOS transistors, has become a major reliability challenge. Meanwhile, reducing leakage power has become a major design goal. In this paper, we propose a novel transmission-gate-based (TG) technique to minimize both NBTI-induced degradation and leakage. This technique provides more flexibility than the gate replacement technique. Simulation results show that the proposed technique improves NBTI-induced degradation by up to 20× and by 2.44× on average, with comparable leakage power reduction. With a 19% area penalty, combining our technique with gate replacement reduces total leakage power by 19.39% and NBTI-induced circuit degradation by 36.56%.
Title: TG-based technique for NBTI degradation and leakage optimization
Venue: IEEE/ACM International Symposium on Low Power Electronics and Design
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993648
Hideki Takase, Gang Zeng, L. Gauthier, Hirotaka Kawashima, Noritoshi Atsumi, T. Tatematsu, Yoshitake Kobayashi, Shunitsu Kohara, T. Koshiro, T. Ishihara, H. Tomiyama, H. Takada
This paper presents a framework for optimizing the energy consumption of embedded real-time systems. We implemented the framework as an optimization toolchain and an energy-aware real-time operating system. The framework is synthetic: multiple techniques optimize the target application together. The main idea is to exploit the trade-off between energy and performance across processor configurations, selecting the optimal configuration at each appropriate point in a task. The framework also employs a memory allocation optimization technique. The framework is also gradual: the target application is optimized step by step. At static time, the toolchain analyzes the characteristics and behavior of the target application and optimizes it at both the intra-task and inter-task levels. Based on the results of this static optimization, the real-time operating system performs runtime energy optimization according to the behavior of the application. A case study shows that, on average, energy minimization is achieved while real-time performance is maintained.
Title: An integrated optimization framework for reducing the energy consumption of embedded real-time applications
Venue: IEEE/ACM International Symposium on Low Power Electronics and Design
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993593
T. Hattori
Renesas Mobile Corporation (RMC), established on 1 December 2010, enters the global chipset market with advanced and innovative products and services for mobile phones, car infotainment solutions, consumer electronics and industrial applications. The modem group in RMC comes with a strong pedigree from Nokia. The group has developed all of Nokia's in-house modems and has formed an essential part of chipset development for Nokia products since the time of NMT and the first generation of GSM. This wireless connectivity expertise is visible today as widely accepted modem technology and IP in billions of handsets. Renesas Mobile continues on this path by combining the modem assets with Renesas's experience in application processors, microprocessors and controllers to form a base for highly integrated single-chip or multichip mobile platforms. This presentation introduces RMC's leading-edge low-power technology for GSM, LTE and WCDMA, as well as for application processors.
Title: Low-power and high-performance technologies for mobile SoC in LTE era
Venue: IEEE/ACM International Symposium on Low Power Electronics and Design
Pub Date : 2011-08-01 DOI: 10.1587/TRANSFUN.E94.A.2609
Xun He, Dajiang Zhou, Xin Jin, Satoshi Goto
This paper presents a high-performance dual-issue 32-core SIMD platform for image and video processing. Eight cores sharing a 4-port L2 cache are connected by a CIB bus to form a cluster, and four clusters are connected by a mesh network. The proposed hierarchical network provides 192 GB/s of inter-core communication bandwidth on average. To reduce coherence operations in large-scale SMP, an application-specified protocol is proposed. Compared with MOESI, it saves 67.8% of L1 cache energy in the 32-core case. The platform achieves a peak performance of 375 GMACs and 98 GMACs/W in 65 nm CMOS.
Title: A 98 GMACs/W 32-core vector processor in 65nm CMOS
Venue: IEEE/ACM International Symposium on Low Power Electronics and Design
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993623
Anurag Nigam, IV ClintonWillsSmullen, Vidyabhushan Mohan, E. Chen, S. Gurumurthi, M. Stan
Spin-Transfer Torque RAM (STT-RAM) has emerged as a potential candidate for universal memory. However, there are two challenges to using STT-RAM in memory system design: (1) the intrinsic variation in the storage element, the Magnetic Tunnel Junction (MTJ), and (2) the high write energy. In this paper, we present a physically based thermal noise model for simulating the statistical variations of MTJs. We have implemented it in HSPICE and validated it against analytical results. We demonstrate its use in setting the write pulse width for a given write error rate. We then propose two write-energy reduction techniques. At the device level, we propose using a low-MS ferromagnetic material that reduces the write energy without sacrificing retention time. At the architecture level, we show that Invert Coding provides a 7% average reduction in total write energy for the SPEC CPU2006 benchmark suite without any performance overhead.
Title: Delivering on the promise of universal memory for spin-transfer torque RAM (STT-RAM)
Venue: IEEE/ACM International Symposium on Low Power Electronics and Design
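The abstract above names Invert Coding but does not describe its details. The sketch below illustrates the general idea under stated assumptions, and is not the authors' scheme: it assumes that only cells whose value changes are written, that write energy is proportional to the number of changed cells, and that one extra flag cell per word records whether the word is stored inverted. All function names and the 8-bit word width are hypothetical.

```python
def popcount(value: int) -> int:
    """Number of 1 bits in a non-negative integer."""
    return bin(value).count("1")


def invert_encode(stored: int, stored_flag: int, data: int, width: int = 8):
    """Choose between writing `data` as-is or inverted, whichever flips fewer cells.

    Returns (cell_contents, flag, cells_flipped). The flag cell counts as a
    flipped cell whenever its value has to change.
    """
    mask = (1 << width) - 1
    data &= mask
    inverted = ~data & mask
    plain_cost = popcount(stored ^ data) + (1 if stored_flag == 1 else 0)
    inverted_cost = popcount(stored ^ inverted) + (1 if stored_flag == 0 else 0)
    if inverted_cost < plain_cost:
        return inverted, 1, inverted_cost
    return data, 0, plain_cost


def invert_decode(cells: int, flag: int, width: int = 8) -> int:
    """Recover the original data word from the stored cells and the flag."""
    mask = (1 << width) - 1
    return (~cells & mask) if flag else (cells & mask)


# Overwriting an all-zero word with 0b11111110 would flip 7 cells if written
# as-is; storing the inverse flips 1 data cell plus the flag cell.
cells, flag, flips = invert_encode(stored=0b00000000, stored_flag=0, data=0b11111110)
print(f"cells={cells:08b} flag={flag} flips={flips}")
print(f"decoded={invert_decode(cells, flag):08b}")
```

Under these assumptions, no word write ever flips more than about half of its cells; the actual saving for a workload depends on its data patterns.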
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993651
M. Yabuuchi, Y. Tsukamoto, H. Fujiwara, Shigeki Tawa, Koji Maekawa, M. Igarashi, K. Nii
In this paper, we propose an SRAM macro that achieves 0.5 V operation by combining a device technique with a simple design architecture. On the device side, we use asymmetric halo implant MOSFETs, which enhance both the static noise margin and the write margin of the SRAM simultaneously. On the design side, a dynamic body-bias scheme is introduced to overcome the speed degradation caused by the lower supply voltage. Using measured data from silicon fabricated in 45 nm CMOS technology, we demonstrate a plausible scenario for achieving SoC products that operate at 0.5 V.
Title: A dynamic body-biased SRAM with asymmetric halo implant MOSFETs
Venue: IEEE/ACM International Symposium on Low Power Electronics and Design
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993646
D. Dondi, P. Zappi, T. Simunic
Getting consistent results when monitoring phenomena is a challenging but critical task for solar-powered, high-power wireless embedded systems. Our algorithm relies on an energy predictor to achieve uniform monitoring over time while maximizing the number of tasks executed. Our approach outperforms state-of-the-art algorithms, increasing the number of daily measurements by 30% and reducing their standard deviation by 5.5×.
Title: A scheduling algorithm for consistent monitoring results with solar powered high-performance wireless embedded systems
Venue: IEEE/ACM International Symposium on Low Power Electronics and Design
Pub Date : 2011-08-01 DOI: 10.1109/ISLPED.2011.5993635
Jun-Hong Weng, Ching-Yuan Yang, Yi-Lin Jhu
A new approach to direct digital frequency synthesis (DDFS) with analogue sine conversion is presented. The proposed DDFS adopts a ROM-less architecture with a linear DAC to achieve higher-speed operation and lower power consumption. Fabricated in a 0.18-μm CMOS process, the DDFS employs a 9-bit pipelined accumulator to provide 8-bit amplitude resolution for the DAC circuit. At a 1-GHz clock frequency, the power consumption is 50 mW from a 1.8-V supply, and the spurious-free dynamic range (SFDR) is 44 dBc at the Nyquist synthesized frequency. The total chip area is 0.52 mm².
Title: A low-power direct digital frequency synthesizer using an analogue-sine-conversion technique
Venue: IEEE/ACM International Symposium on Low Power Electronics and Design
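The abstract above describes a phase accumulator feeding a linear DAC and an analogue sine shaper. The sketch below models only the generic phase-accumulator principle and is not the authors' circuit: it assumes the textbook relation f_out = FCW * f_clk / 2^N for an N-bit accumulator, and it replaces the linear DAC and analogue sine conversion with an ideal sin() call. The names `output_frequency` and `ddfs_samples` and the frequency control word (FCW) values are hypothetical.

```python
import math

ACC_BITS = 9    # accumulator width quoted in the abstract
F_CLK = 1e9     # clock frequency quoted in the abstract (1 GHz)


def output_frequency(fcw: int, f_clk: float = F_CLK, bits: int = ACC_BITS) -> float:
    """Synthesized frequency for a given frequency control word (FCW)."""
    return fcw * f_clk / (1 << bits)


def ddfs_samples(fcw: int, count: int, bits: int = ACC_BITS) -> list:
    """Ideal DDFS output: accumulate phase modulo 2^bits, then map phase to sine.

    The sin() call stands in for the linear DAC plus analogue sine shaper,
    which the abstract does not describe in detail.
    """
    modulus = 1 << bits
    phase = 0
    samples = []
    for _ in range(count):
        samples.append(math.sin(2 * math.pi * phase / modulus))
        phase = (phase + fcw) % modulus
    return samples


print(f"frequency step:   {output_frequency(1) / 1e6:.3f} MHz")
print(f"Nyquist (FCW=256): {output_frequency(256) / 1e6:.1f} MHz")
print([round(s, 3) for s in ddfs_samples(fcw=128, count=4)])
```

With a 9-bit accumulator at 1 GHz, this relation implies a tuning step of about 1.95 MHz and a Nyquist output of 500 MHz.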