Pub Date: 2009-01-05 DOI: 10.1109/VLSI.Design.2009.68
Weihuang Wang, G. Choi, K. Gunnam
This paper presents a low-power LDPC decoder design for additive white Gaussian noise (AWGN) channels. The proposed decoding scheme provides constant-time decoding and thus facilitates real-time applications where a guaranteed data rate is required. It analyzes each received data frame to estimate the maximum number of iterations necessary for frame convergence. The results are then used to dynamically adjust the decoder frequency and switch between multiple voltage levels, thereby minimizing energy use. This differs from recent publications on speculative LDPC decoding for block-fading channels: our approach addresses the more difficult problem of predicting decoding requirements for data frames in AWGN channels, and it is also directly applicable to fading channels. A decoder architecture utilizing the offset min-sum layered decoding algorithm is presented. Up to 30% savings in decoding energy consumption are achieved with negligible degradation in coding performance.
{"title":"Low-Power VLSI Design of LDPC Decoder Using DVFS for AWGN Channels","authors":"Weihuang Wang, G. Choi, K. Gunnam","doi":"10.1109/VLSI.Design.2009.68","DOIUrl":"https://doi.org/10.1109/VLSI.Design.2009.68","journal":{"name":"2009 22nd International Conference on VLSI Design"},"publicationDate":"2009-01-05"}
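The frequency and voltage selection described in the abstract above can be illustrated with a minimal sketch. It is not the authors' predictor or hardware: the operating points in `LEVELS`, the function names, and the `cycles * Vdd^2` energy proxy are all assumptions made for illustration. Given a predicted iteration count and a fixed per-frame deadline, it picks the slowest operating point that still meets the deadline, which is the general idea behind constant-time decoding with DVFS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    freq_mhz: float  # clock frequency in MHz
    vdd: float       # supply voltage in volts


# Hypothetical operating points, ordered from lowest to highest energy.
LEVELS = [
    OperatingPoint(100.0, 0.8),
    OperatingPoint(200.0, 1.0),
    OperatingPoint(300.0, 1.2),
]


def pick_level(pred_iters, cycles_per_iter, deadline_us, levels=LEVELS):
    """Return the slowest level that finishes pred_iters within the deadline."""
    cycles = pred_iters * cycles_per_iter
    for lvl in levels:
        # cycles divided by MHz gives microseconds
        if cycles / lvl.freq_mhz <= deadline_us:
            return lvl
    # No level meets the deadline: fall back to the fastest one.
    return levels[-1]


def relative_energy(pred_iters, cycles_per_iter, lvl):
    """Dynamic energy up to a constant factor: cycles * Vdd^2."""
    return pred_iters * cycles_per_iter * lvl.vdd ** 2
```

A frame predicted to need few iterations runs at the low-voltage point and finishes at the same deadline as a hard frame run at the high-voltage point, so the data rate stays fixed while easy frames cost less energy.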
Pub Date: 2009-01-05 DOI: 10.1109/VLSI.Design.2009.77
Fan Wang, V. Agrawal
We analyze the neutron-induced soft error rate (SER). An induced error pulse is modeled by two parameters: the probability of occurrence and the probability density function of the pulse width. We calculate failures in time (FIT) rates for ISCAS85 benchmark circuits. A comparison with measured SER for SRAMs shows that our results are closer to measurement than other published work. Our CPU times are reasonable; benchmark circuit C1908 with 880 gates requires only 1.14 seconds. Further, we study the influence of circuit topology on SER. We find that for some circuits with many levels of logic there exists a critical single event transient (SET) width. For induced pulses narrower than this width, the SER depends not on the size of the circuit but only on the gates near the output, and only those gates need to be protected. For an inverter chain in TSMC035 technology, the critical width is between 25 ps and 50 ps. For a shallow circuit, e.g., a ripple-carry adder, a critical SET width may not exist.
{"title":"Soft Error Rates with Inertial and Logical Masking","authors":"Fan Wang, V. Agrawal","doi":"10.1109/VLSI.Design.2009.77","DOIUrl":"https://doi.org/10.1109/VLSI.Design.2009.77","journal":{"name":"2009 22nd International Conference on VLSI Design"},"publicationDate":"2009-01-05"}
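The notion of a critical SET width in a deep circuit can be illustrated with a toy inertial-masking model. This is a simplified textbook-style model assumed here for illustration, not necessarily the one used in the paper: a pulse narrower than the gate delay is filtered, a pulse between one and two gate delays is narrowed, and a wider pulse passes unchanged. The function names and the 20 ps gate delay used in the usage note are hypothetical.

```python
def filter_pulse(width_ps, gate_delay_ps):
    """Pulse width at a gate output under a simple inertial-delay model."""
    if width_ps < gate_delay_ps:
        return 0.0                                  # fully filtered
    if width_ps < 2 * gate_delay_ps:
        return 2.0 * (width_ps - gate_delay_ps)     # attenuated
    return float(width_ps)                          # passes unchanged


def propagate(width_ps, gate_delay_ps, stages):
    """Push a pulse through a chain of identical gates."""
    for _ in range(stages):
        width_ps = filter_pulse(width_ps, gate_delay_ps)
        if width_ps == 0.0:
            break                                   # pulse has died out
    return width_ps
```

Under this model, with a 20 ps gate delay, `propagate(39, 20, 10)` dies out within the chain while `propagate(50, 20, 10)` reaches the output unchanged. Pulses below twice the gate delay can therefore only cause errors when injected close to the output, which matches the qualitative behavior the abstract describes for deep circuits.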
Pub Date: 2009-01-05 DOI: 10.1109/VLSI.Design.2009.41
R. K. Baruah, S. Mahapatra
As conventional MOSFET scaling approaches the limit imposed by short channel effects, Double Gate (DG) MOS transistors are emerging as the most feasible candidate for sub-45nm technology nodes. Because the short channel effect in a DG transistor is controlled by the device geometry, an undoped or lightly doped body is used to sustain the channel. There exists a disparity in the threshold voltage calculation criteria for undoped-body symmetric double gate transistors, for which two definitions are in use: one potential-based and the other charge-based. In this paper, a novel concept of a "crossover point" is introduced, which shows that the charge-based definition is more accurate than the potential-based definition. The change in threshold voltage with body thickness variation for a fixed channel length is anomalous as predicted by the potential-based definition, while it is monotonic for the charge-based definition. The threshold voltage is then extracted from drain current versus gate voltage characteristics using linear extrapolation and the "Third Derivative of Drain-Source Current" method, or simply the "TD" method. The trend of threshold voltage variation is found to be the same in both cases, which supports the charge-based definition.
{"title":"Concept of \"Crossover Point\" and its Application on Threshold Voltage Definition for Undoped-Body Transistors","authors":"R. K. Baruah, S. Mahapatra","doi":"10.1109/VLSI.Design.2009.41","DOIUrl":"https://doi.org/10.1109/VLSI.Design.2009.41","journal":{"name":"2009 22nd International Conference on VLSI Design"},"publicationDate":"2009-01-05"}
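The linear extrapolation method mentioned in the abstract above is commonly implemented by finding the point of maximum transconductance on the drain current versus gate voltage curve and extending the tangent at that point to zero current. The sketch below follows that common recipe under stated assumptions: the function name is hypothetical, the data are plain Python lists, and the drain-voltage correction that some variants subtract is omitted. It is not taken from the paper.

```python
def vth_linear_extrapolation(vg, i_d):
    """Estimate Vth as the Vg-axis intercept of the tangent at maximum gm.

    vg and i_d are equal-length sequences of gate voltages and drain currents.
    """
    # Central-difference transconductance gm = dId/dVg at interior points.
    gm = [
        (i_d[k + 1] - i_d[k - 1]) / (vg[k + 1] - vg[k - 1])
        for k in range(1, len(vg) - 1)
    ]
    # Index (into vg) of the interior point with the largest gm.
    k_max = max(range(len(gm)), key=gm.__getitem__) + 1
    slope = gm[k_max - 1]
    # Tangent: Id = slope * (Vg - Vth)  =>  Vth = Vg - Id / slope.
    return vg[k_max] - i_d[k_max] / slope
```

For an idealized characteristic that is zero below 0.4 V and rises linearly above it, the function returns approximately 0.4 V. On measured or simulated curves the result depends on the sweep resolution and on noise in the numerical derivative.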
Pub Date: 2009-01-05 DOI: 10.1109/VLSI.Design.2009.89
SangHoon Han, SeongHoon Woo, Mun-Ho Jeong, Bum-Jae You
This paper presents a stereo vision processor in the form of an ASIC that achieves enhanced-quality depth maps and real-time performance. Our vision processor can be used broadly in practical applications. To improve depth map quality, pre- and post-processing units are adopted, and SFRs (Special Function Registers) are assigned to vision parameters so that quality is controllable. To meet real-time requirements, the stereo vision system is implemented in hardware. We integrate image rectification, bilateral filtering, depth estimation and left-right consistency check blocks on a single silicon chip. The processor is fabricated in a 0.18-µm standard CMOS technology and can operate at a 120 MHz clock frequency, achieving over 140 frames/s depth maps with a 320 by 240 image size and 64 disparity levels. The system exploits 8-bit sub-pixel disparities for depth accuracy and shows a throughput of over 707 million PDS, which is better than the results of any published work. Unrectified and unfiltered images taken in a real environment are used as test inputs for performance and quality evaluation. Comparisons with previous ASIC implementations are presented to verify the improvement.
{"title":"Improved-Quality Real-Time Stereo Vision Processor","authors":"SangHoon Han, SeongHoon Woo, Mun-Ho Jeong, Bum-Jae You","doi":"10.1109/VLSI.Design.2009.89","DOIUrl":"https://doi.org/10.1109/VLSI.Design.2009.89","journal":{"name":"2009 22nd International Conference on VLSI Design"},"publicationDate":"2009-01-05"}
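The throughput figure quoted in the abstract above can be cross-checked with a short calculation. The sketch assumes that PDS means points times disparity levels per second, the usual definition in the stereo matching literature; the abstract itself does not spell the acronym out, and the function names are hypothetical.

```python
def pds(width, height, disparities, fps):
    """Points x disparities per second for a given frame rate."""
    return width * height * disparities * fps


def fps_for_pds(target_pds, width, height, disparities):
    """Frame rate needed to reach a target PDS figure."""
    return target_pds / (width * height * disparities)
```

Under that assumption, `pds(320, 240, 64, 140)` is 688,128,000, and reaching 707 million PDS at this resolution and disparity range corresponds to roughly 143.8 frames/s, which is consistent with the "over 140 frames/s" stated in the abstract.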
Pub Date: 2009-01-05 DOI: 10.1109/VLSI.Design.2009.30
T. Samanta, H. Rahaman, P. Ghosal, P. Dasgupta
Interconnects are vital in deep sub-micron VLSI design, as they impose constraints such as delay, congestion, crosstalk and power dissipation, and they consume resources. These parameters affect the effort needed to obtain a feasible solution for the global routing of multiple nets. In addition, efforts are under way to explore and use non-Manhattan routing architectures. In this work, we focus on the specific problem of multi-net multi-pin global Y-routing for custom-built design styles with several available routing layers. The problem is formulated as a minimum-crossing Y-Steiner minimal tree problem with multi-layer assignment. Experimental results are quite encouraging.
{"title":"A Method for the Multi-Net Multi-Pin Routing Problem with Layer Assignment","authors":"T. Samanta, H. Rahaman, P. Ghosal, P. Dasgupta","doi":"10.1109/VLSI.Design.2009.30","DOIUrl":"https://doi.org/10.1109/VLSI.Design.2009.30","journal":{"name":"2009 22nd International Conference on VLSI Design"},"publicationDate":"2009-01-05"}
Pub Date: 2009-01-05 DOI: 10.1109/VLSI.Design.2009.63
K. Usami, T. Shirai, T. Hashida, H. Masuda, S. Takeda, M. Nakata, N. Seki, H. Amano, M. Namiki, Masashi Imai, Masaaki Kondo, Hiroshi Nakamura
This paper describes a design and implementation methodology for fine-grain power gating. Because entering sleep and waking up are controlled at fine granularity at run time, the transition time between the sleep and active states must be short. In particular, shortening the wakeup time is essential because it affects the execution time and hence the performance. However, this requirement makes suppression of the ground bounce more difficult. We propose a novel technique that skews the wakeup timings of fine-grain local power domains to suppress the ground bounce. The delays of the buffers driving the power switches are skewed within the buffer tree by selectively downsizing the buffers. We designed a MIPS R3000 based CPU core in a 90nm CMOS technology and applied our technique to its internal function units. Simulation results showed that our technique reduces the rush current to 47% of that in the case where the power switches are turned on simultaneously. This suppressed the ground bounce to 53 mV with a 3.3 ns wakeup time. Simulation results from running benchmark programs showed that the total power dissipation of the function units was reduced by up to 15% at 25°C and by 62% at 100°C. The effectiveness of the power savings is discussed from the viewpoint of the temperature-dependent break-even points and the consecutive idle time in the program.
{"title":"Design and Implementation of Fine-Grain Power Gating with Ground Bounce Suppression","authors":"K. Usami, T. Shirai, T. Hashida, H. Masuda, S. Takeda, M. Nakata, N. Seki, H. Amano, M. Namiki, Masashi Imai, Masaaki Kondo, Hiroshi Nakamura","doi":"10.1109/VLSI.Design.2009.63","DOIUrl":"https://doi.org/10.1109/VLSI.Design.2009.63","journal":{"name":"2009 22nd International Conference on VLSI Design"},"publicationDate":"2009-01-05"}
Pub Date: 2009-01-05 DOI: 10.1109/VLSI.Design.2009.78
V. Krishnan, S. Katkoori
With continuous CMOS scaling and increasing operating frequencies, power and thermal concerns have become critical design issues in current and future high-performance integrated circuits. Elevated chip temperatures adversely impact circuit performance and reliability. On-chip thermal gradients can lead to unpredictable clock skew variations and timing failures. Chip temperatures are influenced by design decisions at the behavioral and physical-synthesis levels. Existing low-power design techniques cannot adequately address thermal issues since their optimization objectives fail to capture the spatial nature of on-chip thermal gradients. We present an algorithm for thermally-aware low-power behavioral synthesis that concurrently minimizes average power and peak chip temperature. Our algorithm uses accurate floorplan-based temperature estimates to guide behavioral synthesis. Compared to traditional low-power synthesis, our method reduces peak temperatures by as much as 23%, with less than 10% overhead in chip area.
{"title":"Simultaneous Peak Temperature and Average Power Minimization during Behavioral Synthesis","authors":"V. Krishnan, S. Katkoori","doi":"10.1109/VLSI.Design.2009.78","DOIUrl":"https://doi.org/10.1109/VLSI.Design.2009.78","journal":{"name":"2009 22nd International Conference on VLSI Design"},"publicationDate":"2009-01-05"}
Pub Date: 2009-01-05 DOI: 10.1109/VLSI.Design.2009.46
V. Vireen, N. Venugopalachary, G. Seetharaman, B. Venkataramani
Wave-pipelining enables digital systems to operate at higher frequencies by properly selecting the clock periods and clock skews so that the outputs of combinational logic circuits are latched while they are stable. In the literature, only trial-and-error and manual procedures are adopted for these selections. The major contribution of this paper is a proposal for automating this procedure for the ASIC implementation of wave-pipelined circuits using a built-in self-test approach. For the purpose of verification, a coordinate rotation digital computer (CORDIC) and filters using the distributed arithmetic algorithm are implemented. To test the efficacy, these circuits are implemented using three schemes: wave-pipelining, pipelining and non-pipelining. The implementation results show that the wave-pipelined circuits are 21-29% faster than the non-pipelined circuits. The pipelined circuits are 22-48% faster than the wave-pipelined circuits, but at the cost of an increase in area of about 18-28%.
{"title":"Built in Self Test Based Design of Wave-Pipelined Circuits in ASICs","authors":"V. Vireen, N. Venugopalachary, G. Seetharaman, B. Venkataramani","doi":"10.1109/VLSI.Design.2009.46","DOIUrl":"https://doi.org/10.1109/VLSI.Design.2009.46","journal":{"name":"2009 22nd International Conference on VLSI Design"},"publicationDate":"2009-01-05"}
Pub Date: 2009-01-05 DOI: 10.1109/VLSI.DESIGN.2009.40
R. Wille, Daniel Große, G. Dueck, R. Drechsler
Synthesis of reversible logic has become a very important research area. In recent years several algorithms, heuristic as well as exact, have been introduced in this area. Typically, they take as input the specification of a reversible function in terms of a truth table, in which the positions of the outputs are fixed. In general, however, it is irrelevant how the respective outputs are ordered. Thus, a synthesis methodology is proposed that determines, for a given reversible function, an equivalent circuit realization modulo output permutation. More precisely, the result of the synthesis process is a circuit realization whose output functions have been permuted in comparison to the original specification, together with the respective permutation vector. We show that this synthesis methodology may lead to significantly smaller realizations. We apply Synthesis with Output Permutation (SWOP) to both an exact and a heuristic synthesis algorithm. As our experiments show, using the new synthesis paradigm leads to multiple-control Toffoli networks that are smaller than the currently best known realizations.
{"title":"Reversible Logic Synthesis with Output Permutation","authors":"R. Wille, Daniel Große, G. Dueck, R. Drechsler","doi":"10.1109/VLSI.DESIGN.2009.40","DOIUrl":"https://doi.org/10.1109/VLSI.DESIGN.2009.40","journal":{"name":"2009 22nd International Conference on VLSI Design"},"publicationDate":"2009-01-05"}
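The idea of realizing a function modulo output permutation can be made concrete with a small sketch. It does not run a synthesis algorithm: as a stand-in for circuit cost it counts the lines whose output already equals the input on the same line, which is a proxy assumed here purely for illustration. All function names are hypothetical, and truth tables are represented as dictionaries from input tuples to output tuples.

```python
from itertools import permutations, product


def permute_outputs(table, perm):
    """Reorder each output tuple so that new output i is old output perm[i]."""
    return {x: tuple(y[p] for p in perm) for x, y in table.items()}


def identity_lines(table):
    """Count lines whose output always equals the input on the same line."""
    n = len(next(iter(table)))
    return sum(
        all(y[i] == x[i] for x, y in table.items()) for i in range(n)
    )


def best_permutation(table):
    """Pick the output order that maximizes the identity-line proxy."""
    n = len(next(iter(table)))
    return max(
        permutations(range(n)),
        key=lambda p: identity_lines(permute_outputs(table, p)),
    )


# Example: a two-line function that swaps its inputs.
swap = {x: (x[1], x[0]) for x in product((0, 1), repeat=2)}
```

With the outputs fixed, the swap function needs gates to exchange the two lines. `best_permutation(swap)` returns `(1, 0)`, under which the specification becomes the identity and no gates are needed at all, provided the permutation vector is reported alongside the circuit. An actual SWOP flow would replace the proxy with a call to an exact or heuristic synthesis routine for each candidate ordering.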
Pub Date: 2009-01-05 DOI: 10.1109/VLSI.Design.2009.96
N. Banerjee, Saumya Chandra, Swaroop Ghosh, S. Dey, A. Raghunathan, K. Roy
Manufacturing and operation-induced variations have emerged as a critical challenge in designing integrated circuits (ICs) under the nanometer technology regime. Most work on addressing variations has focused on device, circuit, and logic-level solutions. As the magnitude of parameter variations increases with technology scaling, these techniques are not sufficient to address the negative impact that variations have on IC performance, power, yield, and design time. Therefore, in recent years, the research community has shown great interest in techniques to address variations starting from the other end of the design process, i.e., at the system level. In this paper, we provide an overview of various techniques that we have developed for coping with variations through system-level design. The presented techniques include a paradigm for designing variation-tolerant systems through critical path isolation for timing adaptiveness, application-specific techniques to achieve variation-tolerance by trading off quality of the result, variation-aware system-level power analysis, and system-level power management under variations. These techniques demonstrate that addressing variations during system-level design can greatly mitigate the effects of variations, enabling the design of integrated circuits in scaled technologies.
{"title":"Coping with Variations through System-Level Design","authors":"N. Banerjee, Saumya Chandra, Swaroop Ghosh, S. Dey, A. Raghunathan, K. Roy","doi":"10.1109/VLSI.Design.2009.96","DOIUrl":"https://doi.org/10.1109/VLSI.Design.2009.96","journal":{"name":"2009 22nd International Conference on VLSI Design"},"publicationDate":"2009-01-05"}