We propose a power-scalable digital baseband for a low-IF receiver for IEEE 802.15.4-2006. The digital section’s sampling frequency and bit width serve as knobs to reduce power under favorable signal and interference scenarios, thus recovering the design margins introduced to handle worst-case conditions. We propose tuning these knobs based on measurements of the signal and interference levels. We show that in a 0.13 μm CMOS technology, for an adaptive digital baseband section of the receiver designed to meet the 802.15.4 standard specification, the power saving can reach nearly 85% (0.49 mW against 3.3 mW) in favorable interference and signal conditions.
{"title":"Power Scalable Digital Baseband Architecture for IEEE 802.15.4","authors":"S. Dwivedi, B. Amrutur, N. Bhat","doi":"10.1109/VLSID.2011.64","DOIUrl":"https://doi.org/10.1109/VLSID.2011.64","url":null,"abstract":"We propose a power scalable digital base band for a low-IF receiver for IEEE 802.15.4-2006. The digital section’s sampling frequency and bit width are used as knobs to reduce the power under favorable signal and interference scenarios, thus recovering the design margins introduced to handle worst case conditions. We propose tuning of these knobs based on measurements of Signal and the interference levels. We show that in a 0.13u CMOS technology, for an adaptive digital base band section of the receiver designed to meet the 802.15.4 standard specification, power saving can be up to nearly 85% (0.49mW against 3.3mW) in favorable interference and signal conditions.","PeriodicalId":371062,"journal":{"name":"2011 24th Internatioal Conference on VLSI Design","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133003925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
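As a rough illustration of the abstract above, the headline saving can be recomputed from the two reported power figures, and the knob-tuning idea can be sketched as a simple lookup. In this minimal Python sketch only the 3.3 mW and 0.49 mW figures come from the abstract; the mode table, the thresholds, and the function names are hypothetical placeholders.

```python
def power_saving_percent(worst_case_mw, scaled_mw):
    """Relative power saving of a scaled-down configuration, in percent."""
    return 100.0 * (worst_case_mw - scaled_mw) / worst_case_mw


# Hypothetical knob settings: (sampling frequency in MHz, bit width).
# The abstract names the two knobs but does not list their values.
WORST_CASE_MODE = (16.0, 8)
LOW_POWER_MODE = (4.0, 4)


def select_mode(signal_dbm, interference_dbm,
                min_signal_dbm=-75.0, max_interference_dbm=-80.0):
    """Pick the low-power mode only when the signal is strong and interference weak.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    favorable = (signal_dbm >= min_signal_dbm
                 and interference_dbm <= max_interference_dbm)
    return LOW_POWER_MODE if favorable else WORST_CASE_MODE


if __name__ == "__main__":
    print(f"saving: {power_saving_percent(3.3, 0.49):.1f}%")
    print(select_mode(signal_dbm=-60.0, interference_dbm=-90.0))
```

A real controller would likely have more than two modes and some hysteresis between them; the two-mode table is only meant to show how measured levels could map to knob settings.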
The growth of oscillation in a quadrature oscillator that employs phase shifters in the coupling path between a pair of LC-loaded negative-resistance cores is analyzed. Such an oscillator is known to have two stable modes of oscillation. For a noise-initiated startup from an unstable initial condition, the quasiharmonic assumption and the Method of First Approximation are used to demonstrate that compression mechanisms preferentially enhance one mode while attenuating the other. The magnitude of the phase shift in the coupling path is shown to directly affect the ratio between the temporal rates of mode buildup and decay.
{"title":"Evolution of Oscillation in a Quadrature Oscillator","authors":"Diptendu Ghosh, R. Gharpurey","doi":"10.1109/VLSID.2011.25","DOIUrl":"https://doi.org/10.1109/VLSID.2011.25","url":null,"abstract":"The growth of oscillation in a quadrature oscillator that employs phase shifters in the coupling-path between a pair of LC-loaded negative resistance cores, is analyzed. Such an oscillator is known to have two stable modes of oscillation. Under a noise-initiated startup from an unstable initial condition, quasiharmonic assumption and the Method of First Approximation are used to demonstrate that compression mechanisms lead to preferential enhancement of one mode, while attenuating the other. The magnitude of phase shift in the coupling-path is shown to directly affect the ratio between temporal rates of mode buildup and decay.","PeriodicalId":371062,"journal":{"name":"2011 24th Internatioal Conference on VLSI Design","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127121274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Run-time power gating for aggressive leakage reduction has brought into focus the cost of mode-transition overheads caused by frequent switching between the sleep and active modes of circuit operation. To design circuits for effective power gating, logic circuits must be characterized for the overheads they incur during mode transitions. In this paper, we describe a method to determine the steady-state virtual-supply voltage in active mode and, from it, present a model for the virtual-supply voltage in terms of basic circuit parameters. Using the proposed model, we further derive expressions for estimating two mode-transition overheads, wakeup time and wakeup energy, for a power-gated logic cluster. Finally, we demonstrate its application to four ISCAS benchmark circuits and analyze the accuracy of the approximations used in the model.
{"title":"Wakeup Time and Wakeup Energy Estimation in Power-Gated Logic Clusters","authors":"Vivek D. Tovinakere, O. Sentieys, Steven Derrien","doi":"10.1109/VLSID.2011.18","DOIUrl":"https://doi.org/10.1109/VLSID.2011.18","url":null,"abstract":"Run-time power gating for aggressive leakage reduction has brought into focus the cost of mode transition overheads due to frequent switching between sleep and active modes of circuit operation. In order to design circuits for effective power gating, logic circuits must be characterized for overheads they present during mode transitions. In this paper, we describe a method to determine steady-state virtual-supply voltage in active mode and hence present a model for virtual supply voltage in terms of basic circuit parameters. Further, we derive expressions for estimation of two mode transition overheads: wakeup time and wakeup energy for a power-gated logic cluster using the proposed model. Finally we demonstrate its application to four ISCAS benchmark circuits while also analyzing the accuracy of approximations used in the model.","PeriodicalId":371062,"journal":{"name":"2011 24th Internatioal Conference on VLSI Design","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129567723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
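To give a feel for the two overheads named in the abstract above, the sketch below uses a textbook first-order RC approximation: the sleep transistor is treated as a fixed resistance charging a lumped virtual-rail capacitance from 0 V. This is not the model derived in the paper, which is not reproduced here; the parameter names and example values are assumptions, and leakage and logic switching during wakeup are ignored.

```python
import math


def wakeup_time(r_sleep_ohm, c_virtual_f, settle_fraction=0.9):
    """Time for the virtual rail to reach `settle_fraction` of its final voltage.

    First-order RC charging: V(t) = V_final * (1 - exp(-t / (R * C))).
    """
    return -r_sleep_ohm * c_virtual_f * math.log(1.0 - settle_fraction)


def wakeup_energy(c_virtual_f, vdd_v, v_final_v):
    """Energy drawn from the supply to charge the virtual rail from 0 V to v_final_v.

    The supply delivers charge Q = C * v_final_v at voltage vdd_v.
    """
    return c_virtual_f * vdd_v * v_final_v


if __name__ == "__main__":
    # Hypothetical values: 100 ohm sleep transistor, 1 pF rail, 1.2 V supply.
    t = wakeup_time(100.0, 1e-12)
    e = wakeup_energy(1e-12, 1.2, 1.1)
    print(f"wakeup time   ~ {t * 1e12:.0f} ps")
    print(f"wakeup energy ~ {e * 1e12:.2f} pJ")
```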
Shibaji Banerjee, J. Mathew, D. Pradhan, S. Mohanty, M. Ciesielski
As technology scales down to the nanometer regime, process variations have a profound effect on circuit characteristics. Meeting timing and power constraints under such process variations in nano-CMOS circuit design is increasingly difficult. This is driving a shift from worst-case analysis and optimization to statistical or probability-based analysis and optimization at every level of circuit abstraction. This paper presents a TED (Taylor Expansion Diagram) based multi-Tox technique for use during high-level synthesis (HLS). A variation-aware simultaneous scheduling and resource binding algorithm is proposed that maximizes the power yield under timing yield and performance constraints. For this purpose, a multi-Tox library is characterized under process variation. The delay and power distributions of different functional units are exhaustively studied. The proposed variation-aware algorithm uses those components to generate low-power RTL under given timing yield and performance constraints. The experimental results show improvements of up to 95% in leakage power yield under the given constraints.
{"title":"Variation-Aware TED-Based Approach for Nano-CMOS RTL Leakage Optimization","authors":"Shibaji Banerjee, J. Mathew, D. Pradhan, S. Mohanty, M. Ciesielski","doi":"10.1109/VLSID.2011.40","DOIUrl":"https://doi.org/10.1109/VLSID.2011.40","url":null,"abstract":"As technology scales down to nanometer regime the process variations have profound effect on circuit characteristics. Meeting timing and power constraints under such process variations in nano-CMOS circuit design is increasingly difficult. This causes a shifting from worst-case based analysis and optimization to statistical or probability based analysis and optimization at every level of circuit abstraction. This paper presents a TED (Taylor Expansion Diagram) based multi ? Tox techniques during high-level synthesis (HLS). A variation-aware simultaneous scheduling and resource binding algorithm is proposed which maximizes the power yield under timing yield and performance constraint. For this purpose, a multi ? Tox library is characterized under process variation. The delay and power distribution of different functional units are exhaustively studied. The proposed variation-aware algorithm uses those components for generating low power RTL under a given timing yield and performance constraint. The experimental results show significant improvement as high as 95% on leakage power yield under given constraints.","PeriodicalId":371062,"journal":{"name":"2011 24th Internatioal Conference on VLSI Design","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128428605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Post-silicon validation is an essential part of modern integrated circuit design, capturing bugs and design errors that escape the pre-silicon validation phase. A major problem in post-silicon debug is the limited observability of internal signals, since the chip has already been manufactured. Storage requirements limit the number of signals that can be traced; a major challenge is therefore how to reconstruct the majority of the remaining signals from the traced values. Existing approaches select signals with an emphasis on partial restorability, which does not guarantee good signal restoration. We propose an approach that efficiently selects a set of signals based on total restorability criteria. Our experimental results demonstrate that our signal selection algorithm is computationally more efficient and can restore up to three times more signals than existing methods.
{"title":"Efficient Trace Signal Selection for Post Silicon Validation and Debug","authors":"K. Basu, P. Mishra","doi":"10.1109/VLSID.2011.14","DOIUrl":"https://doi.org/10.1109/VLSID.2011.14","url":null,"abstract":"Post-silicon validation is an essential part of modern integrated circuit design to capture bugs and design errors that escape pre-silicon validation phase. A major problem governing post-silicon debug is the observability of internal signals since the chip has already been manufactured. Storage requirements limit the number of signals that can be traced, therefore, a major challenge is how to reconstruct the majority of the remaining signals based on traced values. Existing approaches focus on selecting signals with an emphasis on partial restorability, which does not guarantee a good signal restoration. We propose an approach that efficiently selects a set of signals based on total restorability criteria. Our experimental results demonstrate that our signal selection algorithm is both computationally more efficient and can restore up to three times more signals compared to existing methods.","PeriodicalId":371062,"journal":{"name":"2011 24th Internatioal Conference on VLSI Design","volume":"80 11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126930993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
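The reconstruction step mentioned in the abstract above can be illustrated on a toy netlist. The Python sketch below propagates known values forward and backward through AND/OR gates until nothing more can be inferred. It is only an illustration of restoring untraced signals from traced ones; it is not the paper's selection algorithm, it handles combinational gates only, and the netlist format and the `restore` function are hypothetical.

```python
def restore(gates, traced):
    """Infer untraced signal values from traced ones.

    gates:  dict mapping an output name to (op, [input names]), op in {"AND", "OR"}.
    traced: dict mapping a signal name to its known value (0 or 1).
    Returns a dict holding the traced values plus every value that could be inferred.
    """
    values = dict(traced)
    changed = True
    while changed:
        changed = False
        for out, (op, ins) in gates.items():
            ctrl = 0 if op == "AND" else 1  # input value that alone fixes the output
            unknown = [i for i in ins if i not in values]
            known = [values[i] for i in ins if i in values]
            if out not in values:
                # Forward: one controlling input, or all inputs known.
                if ctrl in known:
                    values[out] = ctrl
                    changed = True
                elif not unknown:
                    values[out] = 1 - ctrl
                    changed = True
            elif values[out] != ctrl:
                # Backward: a non-controlled output forces every input.
                for i in unknown:
                    values[i] = 1 - ctrl
                    changed = True
            elif len(unknown) == 1 and ctrl not in known:
                # Backward: the single unknown input must be the controlling one.
                values[unknown[0]] = ctrl
                changed = True
    return values


if __name__ == "__main__":
    netlist = {"c": ("AND", ["a", "b"]), "e": ("OR", ["c", "d"])}
    print(restore(netlist, {"a": 0, "e": 0}))
```

In this example, tracing `a` and `e` is enough to recover `c` and `d`, while `b` stays unknown. How many signals a candidate trace set lets one recover is the kind of quantity a selection algorithm would try to maximize.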
This paper presents the development of a micromechanical inverter for signal processing applications in low-frequency MEMS sensors. The inverter consists of two MEMS contact switches connected in a complementary configuration. The working principle of the inverter is explained, and its design and performance analysis are worked out systematically using a top-down approach. The PolyMUMPs surface micromachining process has been used to implement and fabricate the MEMS inverter. The mechanical response and the switching response of the cantilevers have been investigated extensively. Static functional characterization of the inverter has been carried out successfully.
{"title":"Development of a Micro-mechanical Logic Inverter for Low Frequency MEMS Sensor Interfacing","authors":"S. Chakraborty, T. K. Bhattacharyya","doi":"10.1109/VLSID.2011.26","DOIUrl":"https://doi.org/10.1109/VLSID.2011.26","url":null,"abstract":"This paper presents the development of a micro mechanical inverter for signal processing applications in low frequency MEMS sensors. The inverter consists of two MEMS contact switches connected in complementary configuration. The working principle of the inverter has been thoroughly explained and the design and performance analysis of the inverter have been systematically worked out using a top-down approach. PolyMUMPs surface micro machining process has been utilized for implementing and fabricating the MEMS inverter. The mechanical response and the switching response of the cantilevers have been extensively investigated. Static functional characterization of the inverter has been successfully carried out.","PeriodicalId":371062,"journal":{"name":"2011 24th Internatioal Conference on VLSI Design","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128122480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wireless body sensor area networks (WBANs) are one of the key technologies for addressing rising healthcare costs through early detection, point-of-care diagnosis, and health management. However, individual sensor nodes in such networks face a stringent power requirement. The traditional signal chain of amplify-digitize-transmit generates large amounts of data that cannot be sustained with limited energy and bandwidth. In this paper we propose an asynchronous data acquisition platform that provides inherent digitization and compression at the source. The proposed implementation consists of a low-noise analog front-end amplifier (AFE) with tunable bandwidth and an asynchronous, clockless analog-to-digital converter (ADC). Data compression is achieved by the inherent signal-dependent sampling of the asynchronous architecture. The AFE and ADC were fabricated in a 0.18 μm CMOS technology and consume a total of 79 μW. Measured results for asynchronous ECG signal acquisition are presented.
{"title":"Low Power Asynchronous Data Acquisition Front End for Wireless Body Sensor Area Network","authors":"M. Trakimas, Sungkil Hwang, S. Sonkusale","doi":"10.1109/VLSID.2011.89","DOIUrl":"https://doi.org/10.1109/VLSID.2011.89","url":null,"abstract":"Wireless body sensor area networks (WBAN) is one of the key technologies to solve the rising healthcare costs through early detection, and point-of-care diagnosis and health management. However there is a stringent power requirement on individual sensor nodes in such networks. Consequently traditional signal chain of amplify-digitize-transmit generates large amounts of data that cannot be sustained due to limited energy and bandwidth. In this paper we propose an asynchronous data acquisition platform that provides inherent digitization and compression at the source. The proposed implementation consists of low noise front-end amplifier (AFE) with tunable bandwidth and an asynchronous clockless analog-to-digital converter (ADC). Data compression is achieved by the inherent signal dependent sampling of the asynchronous architecture. The AFE and ADC were fabricated in a 0.18μm CMOS technology and consume a total of 79μW. Measured results for asynchronous ECG signal acquisition are presented.","PeriodicalId":371062,"journal":{"name":"2011 24th Internatioal Conference on VLSI Design","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128396502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
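Signal-dependent sampling, as described in the abstract above, can be illustrated with level-crossing sampling, one common way to build a clockless converter. Whether the fabricated ADC uses exactly this scheme is not stated in the abstract, so the Python sketch below is an assumption-laden illustration; the function name and the dense input grid standing in for continuous time are hypothetical.

```python
import math


def level_crossing_sample(samples, step):
    """Emit an (index, level) event each time the input enters a new quantization band.

    `samples` is a densely sampled stand-in for the continuous-time input and
    `step` is the spacing between quantization levels. A jump across several
    bands between two grid points produces a single event.
    """
    events = []
    last_band = math.floor(samples[0] / step)
    for i, x in enumerate(samples[1:], start=1):
        band = math.floor(x / step)
        if band != last_band:
            events.append((i, band * step))
            last_band = band
    return events


if __name__ == "__main__":
    # A flat baseline with one short burst, loosely mimicking a sparse biosignal.
    burst = [math.sin(2 * math.pi * k / 20) for k in range(20)]
    signal = [0.0] * 50 + burst + [0.0] * 50
    events = level_crossing_sample(signal, step=0.25)
    print(f"{len(events)} events from {len(signal)} grid points")
```

No events are produced while the input sits on the flat baseline, which is where the compression comes from for sparse signals such as ECG.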
Srinivasaraman Chandrasekaran, K. Desai, A. Sendhil, William Ng
The magnitude of equalization applied by a receiver equalization circuit varies across silicon process and environmental conditions. We propose a novel method to auto-calibrate a programmable receiver equalization circuit to a target gain equalization value without the use of any external test equipment or channel. The method builds on on-chip eye monitoring and internal loopback capabilities, which are used to measure the gain equalization value. By executing this on-chip gain equalization measurement for various equalizer settings, the setting that produces equalization closest to the target value can be determined. The method has been implemented in 45 nm CMOS for a PCI Express 2.0 transceiver running at 5 Gbps. Lab results with test silicon demonstrate the on-chip eye height measurement capabilities.
{"title":"Self-Calibrating Equalizer for Optimal Jitter Performance Using On-chip Eye Monitoring","authors":"Srinivasaraman Chandrasekaran, K. Desai, A. Sendhil, William Ng","doi":"10.1109/VLSID.2011.38","DOIUrl":"https://doi.org/10.1109/VLSID.2011.38","url":null,"abstract":"Magnitude of equalization applied by a receiver equalization circuit varies across silicon process and environmental conditions. We propose a novel method to auto calibrate a programmable receiver equalization circuit to a target gain equalization value without the use of any external test equipment or channel. This method is built upon on-chip eye monitoring and internal loop back capabilities, which are used to measure the gain equalization value. By executing this on-chip gain equalization measurement for various equalizer settings, the setting which produces equalization that is closest to the target value can be determined. This has been implemented in 45nm CMOS for a PCI Express 2.0 transceiver hardware running at 5Gbps. Lab results with test silicon demonstrate the on-chip eye height measurement capabilities.","PeriodicalId":371062,"journal":{"name":"2011 24th Internatioal Conference on VLSI Design","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133374288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
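The calibration loop described in the abstract above amounts to a sweep followed by a nearest-to-target selection. A minimal Python sketch follows; `measure_gain_db` is a hypothetical callback standing in for the on-chip eye-monitor and loopback measurement, and the example gain curve is made up.

```python
def calibrate_equalizer(settings, measure_gain_db, target_db):
    """Return the equalizer setting whose measured gain is closest to the target.

    settings:        iterable of programmable equalizer codes to sweep.
    measure_gain_db: callable(setting) -> measured equalization in dB
                     (hypothetical stand-in for the on-chip measurement).
    target_db:       desired equalization in dB.
    """
    return min(settings, key=lambda s: abs(measure_gain_db(s) - target_db))


if __name__ == "__main__":
    # Made-up part whose gain per code is skewed by process variation.
    def skewed_part(code):
        return 0.8 * code + 0.3

    best = calibrate_equalizer(range(8), skewed_part, target_db=3.0)
    print(f"chosen equalizer code: {best}")
```

Each setting is measured once per sweep. On real hardware the measurement is noisy, so averaging several eye-height readings per setting would be a natural extension.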
Parameter variations introduced during semiconductor fabrication cause the characteristics of deep sub-micron integrated circuits to deviate from their nominal values. These variations are dominant in memory systems such as caches, and the delay spread due to process variation significantly impacts the performance of a cache-based system. In this paper, we propose two schemes to reduce the performance impact of variations in caches: i) a Latency-Aware Least Recently Used (LA-LRU) replacement policy, which ensures that cache blocks affected by process variation are accessed less frequently, and ii) a Block Rearrangement scheme that distributes cache blocks with high latencies uniformly across all sets. We implemented our schemes on the Wattch SimpleScalar toolset for XScale, PowerPC, and Alpha 21264-like processor configurations. Our experiments on SPEC 2000 benchmarks show that our schemes improve the average memory access time of caches by 11% to 22%, almost eliminating any performance degradation due to variations. We also synthesized the LA-LRU logic and found that this benefit comes at a negligible increase in the power consumption of the cache.
{"title":"LA-LRU: A Latency-Aware Replacement Policy for Variation Tolerant Caches","authors":"Aarul Jain, Aviral Shrivastava, C. Chakrabarti","doi":"10.1109/VLSID.2011.24","DOIUrl":"https://doi.org/10.1109/VLSID.2011.24","url":null,"abstract":"Parameter variations in deep sub-micron integrated circuits cause chip characteristics to deviate during semiconductor fabrication process. These variations are dominant in memory systems such as caches and the delay spread due to process variation impacts the performance of a cache based system significantly. In this paper, we propose two schemes to reduce the performance impact of variations in caches: i) Latency-Aware Least Recently Used (LA-LRU) replacement policy which ensures that cache blocks that are affected by process variation are accessed less frequently, and ii) Block Rearrangement scheme that distributes cache blocks with high latencies to all sets uniformly. We implemented our schemes on the Wattch Simple Scalar toolset for Xscale, PowerPC and Alpha21264-like processor configurations. Our experiments on SPEC 2000 benchmarks show that our scheme improves the average memory access time of caches by 11% to 22%, almost eliminating any performance degradation due to variations. We also synthesized the LA-LRU logic, to find out that we can obtain this benefit at negligible increase in the power consumption of the cache.","PeriodicalId":371062,"journal":{"name":"2011 24th Internatioal Conference on VLSI Design","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133656032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
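One way to obtain the behavior described in the abstract above is to keep the most recently used blocks of a set in its fastest ways, so that slow ways hold the blocks least likely to be touched again. The Python sketch below models a single cache set under that assumption. It is not the paper's hardware design; the class, its methods, and the latency values are hypothetical.

```python
class LatencyAwareSet:
    """One cache set in which the i-th most recent block sits in the i-th fastest way."""

    def __init__(self, way_latencies):
        self.way_latencies = list(way_latencies)
        # Physical way indices, fastest first.
        self.fast_order = sorted(range(len(self.way_latencies)),
                                 key=lambda w: self.way_latencies[w])
        self.recency = []  # block tags, most recently used first

    def way_of(self, tag):
        """Physical way currently holding `tag`."""
        return self.fast_order[self.recency.index(tag)]

    def access(self, tag):
        """Return the hit latency in cycles, or None on a miss."""
        if tag in self.recency:
            latency = self.way_latencies[self.way_of(tag)]
            self.recency.remove(tag)
        else:
            latency = None
            if len(self.recency) == len(self.way_latencies):
                self.recency.pop()  # evict the least recently used block
        self.recency.insert(0, tag)  # the accessed block moves to the fastest way
        return latency


if __name__ == "__main__":
    cache_set = LatencyAwareSet([3, 2, 2, 2])  # way 0 is slowed by variation
    for tag in "ABCD":
        cache_set.access(tag)
    print(cache_set.access("D"), cache_set.access("A"), cache_set.access("A"))
```

In the example, the recently used block `D` hits in a 2-cycle way, the stale block `A` first hits in the slow 3-cycle way, and a repeated access to `A` then hits at 2 cycles because it was promoted. The sketch ignores the cost of moving blocks between ways, which a real design would have to bound.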
One of the factors now beginning to seriously limit clock rates in large synchronous designs is manufacturing variation in device parameters. Moreover, such random process variations are increasing significantly with device scaling as technology approaches the end of the silicon roadmap. In a large design containing several million transistors, virtually every manufactured part will have a few hundred transistors that are significant performance outliers. Any one such device in a critical path can greatly limit the highest clock rate that the chip can achieve. In this paper we propose and analyze a new design approach that allows post-manufacture tuning and speed-up of exceptionally slow circuit paths to recover much of the performance lost to such outlier devices. We show that such tuning of exceptionally slow paths can significantly increase the average clock speed attainable by the manufactured parts. We also show this method to be defect tolerant, implying an additional benefit of increased semiconductor yield.
{"title":"Path Delay Tuning for Performance Gain in the Face of Random Manufacturing Variations","authors":"Kautalya Mishra, Ahmed Faraz, A. Singh","doi":"10.1109/VLSID.2011.35","DOIUrl":"https://doi.org/10.1109/VLSID.2011.35","url":null,"abstract":"One of the factors now beginning to seriously limit clock rates in large synchronous designs is manufacturing variations in device parameters. Moreover, such random process variations are increasing significantly with device scaling as technology approaches the end of the silicon roadmap. In a large design containing several millions of transistors, virtually every manufactured part will have a few hundreds of transistors that are significant performance outliers. Any one such device in a critical path can greatly limit the highest clock rate that can be achieved by the chip. In this paper we propose and analyze a new design approach that allows for the post manufacture tuning and speed-up of exceptionally slow circuit paths to recover much of the performance lost due to such outlier devices. We show that such tuning of exceptionally slow paths can result in a significant increase in the average clock speed attainable by the manufactured parts. We also show this method to be defect tolerant, implying an additional benefit of increasing the semiconductor yield.","PeriodicalId":371062,"journal":{"name":"2011 24th Internatioal Conference on VLSI Design","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122372139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
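The argument in the abstract above rests on the fact that the slowest path sets the clock rate, so speeding up only the outlier paths lifts the achievable frequency. The Python sketch below shows that arithmetic; the tuning mechanism is abstracted into a single hypothetical speed-up factor, and the path delays are made-up numbers.

```python
def max_clock_mhz(path_delays_ns):
    """Highest clock rate in MHz supported by the slowest path."""
    return 1000.0 / max(path_delays_ns)


def tune_slow_paths(path_delays_ns, threshold_ns, speedup):
    """Divide the delay of every path slower than `threshold_ns` by `speedup`.

    `speedup` is a hypothetical factor standing in for whatever post-manufacture
    tuning mechanism the chip provides; faster paths are left untouched.
    """
    return [d / speedup if d > threshold_ns else d for d in path_delays_ns]


if __name__ == "__main__":
    delays = [0.9, 1.0, 1.6, 0.95]  # one outlier path at 1.6 ns
    tuned = tune_slow_paths(delays, threshold_ns=1.2, speedup=1.5)
    print(f"before tuning: {max_clock_mhz(delays):.1f} MHz")
    print(f"after tuning:  {max_clock_mhz(tuned):.1f} MHz")
```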