Kriging-Assisted Ultra-Fast Simulated-Annealing Optimization of a Clamped Bitline Sense Amplifier
Oghenekarho Okobiah, S. Mohanty, E. Kougianos, Oleg Garitselov
Simulations using SPICE provide accurate design exploration but consume a considerable amount of time and can be infeasible for large circuits. Continued technology scaling requires that more circuit parameters be accounted for, along with process variation effects. Regression models have been widely researched and, while they offer acceptable accuracy for simulation purposes, they fail to account for the strong correlation effects between design parameters. This paper presents an ultra-fast design-optimization flow that combines correlation-aware Kriging metamodels and a simulated annealing algorithm that operates on them. The Kriging-based method generates metamodels of a clamped bit line sense amplifier circuit which take into account the effects of correlation among the design and process parameters. A simulated-annealing-based optimization algorithm is used to optimize the circuit through the Kriging metamodel. The results show that the Kriging metamodels are highly accurate, with very low error. The optimization algorithm finds an optimized precharge time, with power consumption as a constraint, in an average execution time of 2.78 ms, compared to 45 minutes for an exhaustive search of the design space, i.e., close to 10⁶× faster. To the best of the authors' knowledge, this is the first paper that uses Kriging and simulated annealing for nano-CMOS design.
2012 25th International Conference on VLSI Design. DOI: 10.1109/VLSID.2012.89
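To make the Kriging-plus-simulated-annealing flow concrete, here is a minimal self-contained sketch of the idea: fit an ordinary Kriging (Gaussian-correlation) surrogate to a handful of sampled design points, then run simulated annealing on the cheap surrogate instead of the simulator. The training response, parameter ranges, correlation hyperparameter, and cooling schedule below are hypothetical placeholders, not the paper's sense amplifier data or its exact algorithm.

```python
# Minimal sketch (not the authors' implementation): ordinary Kriging surrogate
# with a Gaussian correlation model, optimized by simulated annealing.
import numpy as np

rng = np.random.default_rng(0)

def corr(A, B, theta=5.0):
    # Gaussian (squared-exponential) correlation between sample sets A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-theta * d2)

# Training data: normalized design parameters -> a stand-in for SPICE responses
X = rng.uniform(0, 1, size=(30, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] ** 2          # synthetic response

R = corr(X, X) + 1e-6 * np.eye(len(X))                # small nugget for stability
ones = np.ones(len(X))
mu = (ones @ np.linalg.solve(R, y)) / (ones @ np.linalg.solve(R, ones))
w = np.linalg.solve(R, y - mu)                        # ordinary-Kriging weights

def krige(x):
    """Predict the response at a new design point x (shape (2,))."""
    return mu + corr(x[None, :], X)[0] @ w

# Simulated annealing over the metamodel instead of the simulator
x = rng.uniform(0, 1, 2)
best, T = x.copy(), 1.0
for it in range(2000):
    cand = np.clip(x + rng.normal(scale=0.1, size=2), 0, 1)
    delta = krige(cand) - krige(x)
    if delta < 0 or rng.random() < np.exp(-delta / T):
        x = cand
    if krige(x) < krige(best):
        best = x.copy()
    T *= 0.995                                        # geometric cooling schedule

print("best design point:", best, "predicted response:", krige(best))
```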
Efficient Online RTL Debugging Methodology for Logic Emulation Systems
Somnath Banerjee, T. Gupta
The offline debugging model provided by logic emulation systems has some specific disadvantages. Since the analysis of signal traces and bug fixing is decoupled from the emulation run, validation of a potential fix requires a costly iteration through the design recompilation and mapping process, followed by a fresh emulation run. This slows down the overall verification process. This paper presents an online debugging methodology to achieve rapid verification closure, with the capability to execute the design backward and forward for debugging. On encountering an error, the design under test (DUT) can be reverse executed step by step to locate the source of the error. A two-pass emulation technique is used to generate the checkpoints and traces needed to support reverse execution. Easy and efficient reverse-execution-based debugging is supported using an innovative technique called optimized design slicing, which confines debugging to a meaningful portion of the design likely to have caused the error being investigated. Once the source of the error is located, potential bug fixes can be evaluated online by forcing a set of signals to desired values, without going through the design recompilation process and restarting emulation from time 0. Benchmarks on several customer designs have shown that the methodology enhances verification performance significantly.
2012 25th International Conference on VLSI Design. DOI: 10.1109/VLSID.2012.87
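The reverse execution described above rests on checkpoints plus forward replay. The toy sketch below illustrates just that generic mechanism in software, using a hypothetical DUT model, stimulus, and checkpoint interval; the paper's two-pass emulation and optimized design slicing are not reproduced here.

```python
# Toy illustration (not the paper's emulator flow): checkpoint-and-replay,
# the general mechanism that makes step-by-step reverse execution possible.

CHECKPOINT_EVERY = 100  # cycles between checkpoints (assumed value)

class ToyDUT:
    def __init__(self):
        self.state = 0
    def step(self, stimulus):
        # stand-in for one emulation cycle
        self.state = (self.state * 31 + stimulus) & 0xFFFF
    def snapshot(self):
        return self.state
    def restore(self, snap):
        self.state = snap

def run_with_checkpoints(dut, stimuli):
    """First pass: run forward, keeping periodic checkpoints."""
    checkpoints = {0: dut.snapshot()}
    for cycle, s in enumerate(stimuli, start=1):
        dut.step(s)
        if cycle % CHECKPOINT_EVERY == 0:
            checkpoints[cycle] = dut.snapshot()
    return checkpoints

def state_at(dut, stimuli, checkpoints, cycle):
    """'Reverse execute' to any earlier cycle: restore the nearest
    checkpoint at or before it and replay forward."""
    base = max(c for c in checkpoints if c <= cycle)
    dut.restore(checkpoints[base])
    for s in stimuli[base:cycle]:
        dut.step(s)
    return dut.snapshot()

dut = ToyDUT()
stimuli = list(range(1000))
cps = run_with_checkpoints(dut, stimuli)
print("state at cycle 457:", state_at(dut, stimuli, cps, 457))
print("state at cycle 456:", state_at(dut, stimuli, cps, 456))  # one step 'back'
```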
Tutorial T6: Variability-resistant Software and Hardware for Nano-Scale Computing
N. Dutt, M. Srivastava, Rajesh K. Gupta, S. Mitra
As semiconductor manufacturers build ever smaller components, circuits and chips at the nano scale become less reliable and more expensive to produce, no longer behaving like precisely chiseled machines with tight tolerances. Modern computing tends to ignore the variability in behavior of the underlying system components from device to device, their wear-out over time, and the environment in which the computing system is placed. This makes systems expensive, fragile, and vulnerable to even the smallest changes in the environment or component failures. This tutorial presents an approach to tame and exploit variability through a strategy where system components -- led by proactive software -- routinely monitor, predict, and adapt to the variability of manufactured systems. Unlike conventional system design, where variability is hidden behind the conservative specifications of over-designed hardware, we describe strategies that expose spatiotemporal variations in hardware to the highest layers of software. After presenting the background and positioning the new approach, the tutorial proceeds in a bottom-up fashion. Causes of variability at the circuit and hardware levels are presented first, along with the classical approaches used to hide such variability. The tutorial then presents a number of strategies at successively higher levels of abstraction, covering the circuit, microarchitecture, compiler, operating systems, and software applications, to monitor, detect, adapt to, and exploit the exposed variability. Adaptable software will use online statistical modeling to learn and predict actual hardware characteristics, opportunistically adjust to variability, and proactively conform to deliberately underdesigned hardware with relaxed design and manufacturing constraints. The resulting class of UnO (Underdesigned and Opportunistic) computing machines is adaptive yet highly energy efficient. They will continue working while using components that vary in performance or grow less reliable over time and across technology generations. A fluid software-hardware interface will mitigate the variability of manufactured systems and make machines robust, reliable, and responsive to changing operating conditions, offering the best hope for perpetuating the past 40 years' fundamental gains in computing performance at lower cost.
2012 25th International Conference on VLSI Design. DOI: 10.1109/VLSID.2012.33
Externally Tested Scan Circuit with Built-In Activity Monitor and Adaptive Test Clock
Priyadharshini Shanmugasundaram, V. Agrawal
We reduce the test time of an external test applied from automatic test equipment (ATE) by speeding up low-activity cycles without exceeding the specified peak power budget. An activity monitor is implemented for this purpose, either as hardware or as pre-simulated and stored test data. The achieved test time reduction depends upon the input and output activity factors, αin and αout, of the scan chain. When on-circuit built-in hardware control is used, test time reductions of about 50% and 25% are possible for vectors with low input activity (αin ≈ 0) and moderate input activity (αin = 0.5), respectively, in ITC'02 benchmark circuits. When stored pre-simulated test data is used, a test time reduction of up to 99% is shown for vectors with low input and output activities.
2012 25th International Conference on VLSI Design. DOI: 10.1109/VLSID.2012.112
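A back-of-the-envelope model of why speeding up low-activity cycles saves test time: if dynamic power is roughly proportional to toggle activity, a cycle with little activity can be clocked faster without exceeding the power permitted at the rated (peak-activity) clock. The sketch below uses that simplified proportionality and made-up toggle counts; it is not the paper's activity-monitor hardware or its exact timing model.

```python
# Simplified model (assumptions, not the paper's): shrink the clock period of
# a scan cycle in proportion to its toggle activity, within the power budget
# that the nominal, worst-case clock period already satisfies.

def scaled_period(nominal_period_ns, cycle_activity, peak_activity):
    """Period proportional to activity, with a floor to avoid a zero period."""
    if peak_activity == 0:
        return nominal_period_ns
    scale = max(cycle_activity / peak_activity, 0.01)
    return nominal_period_ns * scale

# Hypothetical per-cycle toggle counts observed by an activity monitor
toggles = [12, 3, 0, 7, 40, 5, 2, 40, 1, 0]
nominal = 10.0          # ns, rated for worst-case (peak) activity
peak = max(toggles)

adaptive_time = sum(scaled_period(nominal, t, peak) for t in toggles)
fixed_time = nominal * len(toggles)
print(f"fixed clock: {fixed_time:.1f} ns, adaptive clock: {adaptive_time:.1f} ns "
      f"({100 * (1 - adaptive_time / fixed_time):.0f}% reduction)")
```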
Fast-Accurate Non-Polynomial Metamodeling for Nano-CMOS PLL Design Optimization
Oleg Garitselov, S. Mohanty, E. Kougianos
In the nanoscale domain, the simulation, design, and optimization times of circuits have increased significantly due to high integration density, increasing technology constraints, and complex device models. This necessitates fast design space exploration techniques to meet the shorter time to market driven by consumer electronics. This paper presents non-polynomial metamodels (surrogate models) using neural networks to reduce the design optimization time of complex nano-CMOS circuits with no sacrifice in accuracy. Physical-design-aware neural networks are trained and used as metamodels to predict the frequency, locking time, and power of a PLL circuit. Different neural network architectures are compared with traditional polynomial functions generated for the same circuit characteristics. Thorough experimental results show that only 100 sample points are sufficient for the neural networks to predict the output of a circuit with 21 design parameters within 3% accuracy, which improves the accuracy by 56% over polynomial metamodels. The generated metamodels are used to perform optimization of the PLL using a bee colony algorithm. It is observed that the non-polynomial (neural network) metamodels achieve more accurate results than polynomial metamodels in a shorter optimization time.
2012 25th International Conference on VLSI Design. DOI: 10.1109/VLSID.2012.90
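As an illustration of the metamodeling step (not the paper's tool flow), the sketch below trains a small one-hidden-layer neural network on about 100 samples of a synthetic 21-parameter response and then uses it as a fast surrogate. The response function, network size, and training settings are assumptions, and the bee colony optimization stage is omitted.

```python
# Minimal neural-network surrogate ("metamodel") for an expensive simulation.
# The 21-parameter response below is a synthetic stand-in for SPICE data.
import numpy as np

rng = np.random.default_rng(1)
n_params = 21

# ~100 "simulation" samples, matching the paper's sampling budget
X = rng.uniform(-1, 1, size=(100, n_params))
y = np.tanh(X @ rng.normal(size=n_params)) + 0.1 * X[:, 0] * X[:, 1]

# One hidden layer, trained by plain gradient descent on mean-squared error
H = 16
W1 = rng.normal(scale=0.3, size=(n_params, H))
b1 = np.zeros(H)
W2 = rng.normal(scale=0.3, size=H)
b2 = 0.0
lr = 0.05

for epoch in range(3000):
    h = np.tanh(X @ W1 + b1)                 # hidden activations
    pred = h @ W2 + b2
    err = pred - y
    # backpropagation of the mean-squared-error gradient
    gW2 = h.T @ err / len(X)
    gb2 = err.mean()
    dh = np.outer(err, W2) * (1 - h ** 2)    # tanh derivative
    gW1 = X.T @ dh / len(X)
    gb1 = dh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

def metamodel(x):
    """Fast surrogate evaluation, microseconds instead of a full simulation."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

x_test = rng.uniform(-1, 1, n_params)
print("surrogate prediction:", metamodel(x_test))
```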
Real-time Melodic Accompaniment System for Indian Music Using TMS320C6713
Prateek Verma, P. Rao
An instrumental accompaniment system for Indian classical vocal music is designed and implemented on a Texas Instruments TMS320C6713 digital signal processor. It acts as a virtual accompanist following the main artist, typically a vocalist. The melodic pitch information drives an instrument synthesis system, which allows any pitched musical instrument to virtually follow the singing voice in real time with a small delay. Additive synthesis is used to generate the desired tones of the instrument, with the necessary instrument constraints incorporated. The performance of the system is optimized with respect to the computational complexity and memory requirements of the algorithm. The system's performance is studied for different combinations of singers and songs. The proposed system complements the automatic accompaniment already available for Indian classical music, namely the sruti and taala boxes.
2012 25th International Conference on VLSI Design. DOI: 10.1109/VLSID.2012.57
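A minimal additive-synthesis sketch of the tone-generation step: sum a few harmonics whose fundamental tracks a pitch contour, with phase accumulated so the pitch can glide without clicks. The harmonic amplitudes, sample rate, and pitch contour below are invented for illustration and are not the instrument models or real-time DSP code used on the TMS320C6713.

```python
# Additive synthesis sketch (made-up harmonic envelope and pitch contour).
import numpy as np

FS = 16000  # sample rate, Hz (assumed)

def additive_tone(f0_track, harmonic_amps):
    """Sum of harmonics whose fundamental follows f0_track (one value per sample).
    Phase is accumulated so the pitch can glide smoothly."""
    out = np.zeros(len(f0_track))
    for k, amp in enumerate(harmonic_amps, start=1):
        phase = 2 * np.pi * np.cumsum(k * f0_track) / FS
        out += amp * np.sin(phase)
    return out / np.max(np.abs(out))

# A toy pitch contour: a half-second glide from 220 Hz to 247 Hz
n = FS // 2
f0 = np.linspace(220.0, 247.0, n)
amps = [1.0, 0.5, 0.3, 0.15, 0.08]   # hypothetical flute-like harmonic envelope
signal = additive_tone(f0, amps)
print(signal.shape, signal.min(), signal.max())
```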
Tutorial T7A: New Modeling Methodologies for Thermal Analysis of 3D ICs and Advanced Cooling Technologies of the Future
David Atienza Alonso, A. Sridhar
Increasing circuit densities, the proliferation of Multi-Processor Systems-on-Chip (MPSoCs), and high-performance computing systems have resulted in an alarming rise in electronic heat dissipation levels, making conventional thermal management strategies, including air-cooled heat sinks, obsolete. The latest advancements in 3D integration of IC dies have only aggravated this problem, creating strong worldwide research interest in the development of advanced cooling technologies, such as interlayer microchannel liquid-cooled heat sinks, to maintain ICs at safe operating temperatures. While this research has helped create a substantial knowledge base pertaining to the heat transfer mechanisms in advanced liquid cooling systems as applied to electronic circuits, this knowledge is yet to be transferred to the EDA community so that it can be incorporated in the IC thermal simulators of the future. Such tools become absolutely essential when IC designers are faced with the challenge of ascertaining the thermal reliability of their designs in the presence of liquid cooling systems. This tutorial aims to introduce attendees to the key concepts needed to compute IC temperatures with and without microchannel liquid cooling, and to the principles behind compact modeling of forced convective heat transfer in advanced IC cooling technologies. A major part of this tutorial is based on the 3D-ICE thermal simulator, built by the Embedded Systems Laboratory at EPFL, Switzerland (URL: http://esl.epfl.ch/3D-ICE). This simulator is based on the compact transient thermal modeling of forced convective cooling advanced by our research group. Since its release in 2010, more than 50 research groups across the world have downloaded it and are actively using it in their research.
2012 25th International Conference on VLSI Design. DOI: 10.1109/VLSID.2012.34
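For readers unfamiliar with compact thermal modeling, the sketch below integrates a toy lumped RC network with a convective term representing coolant in a channel, using forward-Euler time stepping. All material constants and power numbers are made up, and the model is far simpler than 3D-ICE's; it only illustrates the kind of equations such simulators solve.

```python
# Illustrative compact thermal model (not 3D-ICE itself): a few thermal cells,
# each with a heat capacity, conduction to neighbours, and a convective term
# for coolant flowing through a microchannel. All parameter values are made up.
import numpy as np

dt = 1e-3            # s, time step
C = 2e-3             # J/K, heat capacity per cell
G_cond = 0.05        # W/K, conduction between adjacent cells
G_conv = 0.08        # W/K, cell-to-coolant convective conductance
T_coolant = 300.0    # K, coolant temperature
power = np.array([0.8, 1.5, 0.6, 0.3])   # W dissipated in each cell

T = np.full(4, 300.0)                    # initial temperatures
for step in range(5000):
    dT = np.zeros_like(T)
    for i in range(len(T)):
        q = power[i] - G_conv * (T[i] - T_coolant)       # source minus convection
        if i > 0:
            q += G_cond * (T[i - 1] - T[i])              # conduction from left cell
        if i < len(T) - 1:
            q += G_cond * (T[i + 1] - T[i])              # conduction from right cell
        dT[i] = dt * q / C                               # forward-Euler update
    T += dT

print("steady-state cell temperatures (K):", np.round(T, 2))
```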
Analog Processing Based Equalizer for 40 Gbps Coherent Optical Links in 90 nm CMOS
Pawan Kumar Moyade, N. Nambath, Allmin Ansari, Shalabh Gupta
Inter-symbol interference introduced by fiber non-idealities such as polarization mode dispersion and chromatic dispersion will be one of the major limiting factors in achieving higher data rates in existing gigabit fiber-optic links. Receivers based on high-speed ADCs followed by DSPs will be limited by the need for massive parallelization and interconnects. We propose an analog-signal-processing-based coherent optical link receiver to drastically reduce power consumption, size, and cost. A 40 Gbps analog-processing adaptive DP-QPSK (dual-polarization quadrature phase shift keying) equalizer in 90 nm CMOS technology, which dissipates 450 mW of power, is demonstrated using simulations. A complete analog processing receiver is expected to consume less than one-tenth of the power consumed by a chip using ADCs followed by signal processing in a DSP.
2012 25th International Conference on VLSI Design. DOI: 10.1109/VLSID.2012.54
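The paper's contribution is an analog hardware realization; purely to illustrate what an adaptive equalizer computes, the sketch below runs a discrete-time, LMS-adapted FIR equalizer on a QPSK stream passed through a hypothetical dispersive channel. The channel taps, filter length, and step size are assumptions, and a real DP-QPSK receiver would adapt blindly across both polarizations rather than with a training sequence.

```python
# Discrete-time sketch of adaptive FIR equalization of a QPSK stream using
# training-directed LMS. Illustrative only; channel and parameters are made up.
import numpy as np

rng = np.random.default_rng(2)
n_sym = 5000
qpsk = (rng.integers(0, 2, n_sym) * 2 - 1
        + 1j * (rng.integers(0, 2, n_sym) * 2 - 1)) / np.sqrt(2)

# Simple dispersive channel: a little inter-symbol interference plus noise
channel = np.array([0.1, 1.0, 0.25 - 0.1j])
rx = np.convolve(qpsk, channel, mode="same") \
     + 0.02 * (rng.normal(size=n_sym) + 1j * rng.normal(size=n_sym))

taps = np.zeros(7, dtype=complex)
taps[3] = 1.0        # start from a pass-through equalizer
mu = 0.01            # LMS step size

for k in range(3, n_sym - 3):
    window = rx[k - 3:k + 4][::-1]        # most recent sample first
    y = np.dot(taps, window)              # equalizer output
    e = qpsk[k] - y                       # training-directed error
    taps += mu * e * np.conj(window)      # LMS tap update

err = qpsk[3:n_sym - 3] - np.array(
    [np.dot(taps, rx[k - 3:k + 4][::-1]) for k in range(3, n_sym - 3)])
print("residual mean-square error:", np.mean(np.abs(err) ** 2))
```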
Buffer Design and Eye-Diagram Based Characterization of a 20 GS/s CMOS DAC
M. Singh, Shalabh Gupta
High-speed digital-to-analog converters (DACs) are inevitable due to the advent of multi-level modulation formats to meet the increasing demand for high data rates in communication systems. In this paper, a 4-bit 20 GS/s DAC has been designed in 90 nm CMOS technology. CMOS-based DACs provide a low-cost single-IC solution compared to their compound-semiconductor counterparts by fully integrating the digital and RF blocks. An on-chip linear feedback shift register (LFSR) is used to generate the required high-speed broadband data, and the eye diagram of the DAC output is used for characterization. In order to drive the high capacitive load from routing, electrostatic discharge (ESD) protection, and pad capacitance (≈800 fF) at a speed of 20 GS/s (13.1 GHz bandwidth), a new buffer architecture has also been implemented.
2012 25th International Conference on VLSI Design. DOI: 10.1109/VLSID.2012.53
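Since the broadband test data comes from an on-chip LFSR, the following software sketch shows the kind of pseudo-random bit sequence such a register produces, here a PRBS-7 (x^7 + x^6 + 1) Fibonacci LFSR. The paper does not specify the LFSR length or taps, so this particular choice is an assumption.

```python
# PRBS-7 generator sketch (x^7 + x^6 + 1); the on-chip LFSR's actual length
# and taps are not given in the abstract, so this polynomial is assumed.

def prbs7(seed=0x7F, n_bits=200):
    """Fibonacci LFSR: feedback is the XOR of bits 7 and 6 (1-indexed, MSB first)."""
    state = seed & 0x7F
    out = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        out.append(state & 1)
        state = ((state << 1) | new_bit) & 0x7F
    return out

bits = prbs7()
print("first 32 PRBS-7 bits:", bits[:32])
# A maximal-length 7-bit LFSR repeats every 2**7 - 1 = 127 bits:
print("period check:", bits[:127] == prbs7(n_bits=254)[127:254])
```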
Temperature-aware Task Partitioning for Real-Time Scheduling in Embedded Systems
Zhe Wang, S. Ranka, P. Mishra
Both the power and heat density of on-chip systems are increasing exponentially with Moore's law. High temperature negatively affects reliability as well as the costs of cooling and packaging. In this paper, we propose task partitioning as an effective way to reduce the peak temperature in embedded systems running either a set of periodic heterogeneous tasks with a common period or periodic heterogeneous tasks with individual periods. For task sets with a common period, experimental results show that our task partitioning algorithm is able to reduce the peak temperature by as much as 5.8°C compared to algorithms that use task sequencing alone. For task sets with individual periods, EDF scheduling with task partitioning can also lower the peak temperature, compared to simple EDF scheduling, by as much as 6°C. Our analysis indicates that the number of additional context switches (overhead) is less than 2 per task, which is tolerable in many practical scenarios.
2012 25th International Conference on VLSI Design. DOI: 10.1109/VLSID.2012.64
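To see why partitioning helps, consider a first-order (lumped RC) thermal model: running a hot task in one long burst lets the temperature climb toward its high steady state, whereas interleaving slices of it with cooler tasks keeps the peak lower for the same total work. The sketch below demonstrates this with made-up thermal constants and task powers; it is not the paper's partitioning or EDF scheduling algorithm.

```python
# Toy illustration (first-order thermal model with made-up constants, not the
# paper's algorithm): splitting a hot task into slices and interleaving them
# with a cooler task lowers the peak temperature versus running it whole.

T_AMB, R, C, DT = 45.0, 0.8, 3.0, 0.1   # ambient (C), K/W, J/K, time step (s)

def simulate(schedule):
    """schedule: list of (power_in_watts, duration_in_seconds) slices."""
    temp, peak = T_AMB, T_AMB
    for power, duration in schedule:
        t = 0.0
        while t < duration:
            # dT/dt = (P - (T - T_amb)/R) / C   (lumped RC thermal model)
            temp += DT * (power - (temp - T_AMB) / R) / C
            peak = max(peak, temp)
            t += DT
    return peak

hot, cool = (25.0, 4.0), (8.0, 4.0)           # (power, duration) of two tasks

sequencing_only = [hot, cool]                  # hot task runs in one burst
partitioned = [(25.0, 1.0), (8.0, 1.0)] * 4    # same total work, interleaved slices

print("peak temp, hot task run whole :", round(simulate(sequencing_only), 1), "C")
print("peak temp, partitioned slices :", round(simulate(partitioned), 1), "C")
```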