
2009 22nd International Conference on VLSI Design: Latest Papers

Low-Power VLSI Design of LDPC Decoder Using DVFS for AWGN Channels
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.68
Weihuang Wang, G. Choi, K. Gunnam
This paper presents a low-power LDPC decoder design for additive white Gaussian noise (AWGN) channels. The proposed decoding scheme provides constant-time decoding and thus facilitates real-time applications where a guaranteed data rate is required. It analyzes each received data frame to estimate the maximum number of iterations necessary for frame convergence. The results are then used to dynamically adjust the decoder frequency and switch between multiple voltage levels, thereby minimizing energy use. It differs from recent publications on speculative LDPC decoding for block-fading channels: our approach addresses the more difficult problem of predicting the decoding requirement for data frames in AWGN channels, and it is also directly applicable to fading channels. A decoder architecture utilizing the offset min-sum layered decoding algorithm is presented. Up to 30% saving in decoding energy consumption is achieved with negligible degradation in coding performance.
Citations: 21
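The offset min-sum check-node update named in the LDPC abstract above can be sketched as follows. This is a minimal illustration of the textbook rule only, not the authors' decoder architecture or their iteration-prediction scheme; the function name, the offset value of 0.5, and the use of floating-point messages are assumptions made for the example.

```python
def offset_min_sum_check_update(llrs, offset=0.5):
    """Return the extrinsic check-to-variable messages for one check node.

    llrs   : incoming variable-to-check log-likelihood ratios (hypothetical input)
    offset : correction subtracted from each magnitude (assumed value)
    """
    if len(llrs) < 2:
        raise ValueError("a check node needs at least two inputs")

    mags = [abs(v) for v in llrs]
    # Smallest magnitude, its position, and the second-smallest magnitude.
    min_idx = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[min_idx]
    min2 = min(m for i, m in enumerate(mags) if i != min_idx)

    # Product of the signs of all incoming messages.
    total_sign = 1
    for v in llrs:
        if v < 0:
            total_sign = -total_sign

    out = []
    for i, v in enumerate(llrs):
        # Extrinsic magnitude: minimum over all *other* edges, minus the offset,
        # clamped at zero.
        mag = min2 if i == min_idx else min1
        mag = max(mag - offset, 0.0)
        # Extrinsic sign: divide this edge's own sign out of the total.
        sign = total_sign if v >= 0 else -total_sign
        out.append(sign * mag)
    return out


print(offset_min_sum_check_update([2.0, -1.0, 3.0]))  # [-0.5, 1.5, -0.5]
```

Tracking only the two smallest magnitudes and one sign product is what makes this rule attractive for hardware: each outgoing message is then a selection rather than a fresh minimum search.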
Soft Error Rates with Inertial and Logical Masking
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.77
Fan Wang, V. Agrawal
We analyze the neutron-induced soft error rate (SER). An induced error pulse is modeled by two parameters: the probability of occurrence and the probability density function of the pulse width. We calculate failures in time (FIT) rates for ISCAS85 benchmark circuits. A comparison with measured SER for SRAMs shows that our results are more relevant than those of other published work. Our CPU times are reasonable; benchmark circuit C1908 with 880 gates requires only 1.14 seconds. Further, we study the influence of circuit topology on SER. We find that for some circuits with many levels of logic there exists a critical single event transient (SET) width. For smaller induced pulse widths, the SER depends not on the size of the circuit but only on the gates near the output, and only those gates need to be protected. For an inverter chain in TSMC035 technology, the critical width is between 25 ps and 50 ps. For a shallow circuit, e.g., a ripple-carry adder, the critical SET width may not exist.
Citations: 14
Concept of "Crossover Point" and its Application on Threshold Voltage Definition for Undoped-Body Transistors
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.41
R. K. Baruah, S. Mahapatra
As the scaling of the conventional MOSFET approaches the limit imposed by short-channel effects, double-gate (DG) MOS transistors are emerging as the most feasible candidates for sub-45nm technology nodes. Because the short-channel effect in a DG transistor is controlled by the device geometry, an undoped or lightly doped body is used to sustain the channel. There exists a disparity in the threshold voltage calculation criteria for undoped-body symmetric double-gate transistors, for which two definitions are in use: one potential-based and the other charge-based. In this paper, a novel concept of the "crossover point" is introduced, which proves that the charge-based definition is more accurate than the potential-based definition. For a fixed channel length, the change in threshold voltage with body thickness is anomalous as predicted by the potential-based definition, whereas it is monotonic for the charge-based definition. The threshold voltage is then extracted from drain current versus gate voltage characteristics using linear extrapolation and the "Third Derivative of Drain-Source Current" method, or simply the "TD" method. The trend of threshold voltage variation is found to be the same in both cases, which supports the charge-based definition.
Citations: 2
Improved-Quality Real-Time Stereo Vision Processor
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.89
SangHoon Han, SeongHoon Woo, Mun-Ho Jeong, Bum-Jae You
This paper presents a stereo vision processor, implemented as an ASIC, that achieves enhanced-quality depth maps and real-time performance. Our vision processor can be used broadly in practical applications. To improve depth map quality, pre- and post-processing units are adopted, and SFRs (Special Function Registers) are assigned to vision parameters for controllable quality. To meet real-time requirements, the stereo vision system is implemented in hardware using sophisticated design. We integrate image rectification, bilateral filtering, depth estimation and left-right consistency check blocks on a single silicon chip. The processor is fabricated in a 0.18-µm standard CMOS technology and can operate at a 120 MHz clock frequency, producing depth maps at over 140 frames/s for a 320 by 240 image size and 64 disparity levels. The system exploits 8-bit sub-pixel disparities for depth accuracy and shows a throughput of over 707 million PDS, which is better than the results of any published work. Unrectified and unfiltered images taken in a real environment are used as test inputs for performance and quality evaluation. Comparisons with previous ASIC implementations are presented to verify the improvement.
Citations: 16
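The throughput figure in the stereo vision abstract above can be cross-checked with a little arithmetic. The sketch below assumes that PDS stands for points times disparities per second (the abstract does not expand the acronym) and that throughput is simply width x height x disparity levels x frame rate; the function names are hypothetical.

```python
def pds_throughput(width, height, disparities, fps):
    """Points x disparities evaluated per second (assumed meaning of PDS)."""
    return width * height * disparities * fps


def fps_for_throughput(width, height, disparities, pds):
    """Frame rate implied by a given PDS figure at a fixed image geometry."""
    return pds / (width * height * disparities)


# Figures quoted in the abstract: 320 x 240 pixels, 64 disparity levels.
print(pds_throughput(320, 240, 64, 140))                # 688128000
print(round(fps_for_throughput(320, 240, 64, 707e6), 1))  # 143.8
```

Under these assumptions, exactly 140 frames/s corresponds to about 688 million PDS, so the quoted 707 million PDS implies a frame rate of roughly 144 frames/s, consistent with "over 140 frames/s".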
A Method for the Multi-Net Multi-Pin Routing Problem with Layer Assignment
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.30
T. Samanta, H. Rahaman, P. Ghosal, P. Dasgupta
Interconnects are vital in deep sub-micron VLSI design, as they impose constraints such as delay, congestion, crosstalk and power dissipation, and they consume resources. These parameters affect the effort required to obtain a feasible solution for the global routing of multiple nets. In addition, efforts are under way to explore and use non-Manhattan routing architectures. In this work, we focus on the specific problem of multi-net multi-pin global Y-routing for custom-built design styles with several available routing layers. The problem is formulated as a minimum-crossing Y-Steiner minimal tree problem with multi-layer assignment. Experimental results are quite encouraging.
Citations: 1
Design and Implementation of Fine-Grain Power Gating with Ground Bounce Suppression
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.63
K. Usami, T. Shirai, T. Hashida, H. Masuda, S. Takeda, M. Nakata, N. Seki, H. Amano, M. Namiki, Masashi Imai, Masaaki Kondo, Hiroshi Nakamura
This paper describes a design and implementation methodology for fine-grain power gating. Since sleep-in and wakeup are controlled at a fine granularity at run time, shortening the transition time between the sleep and active states is strongly required. In particular, shortening the wakeup time is essential because it affects the execution time and hence the performance. However, this requirement makes suppression of the ground bounce more difficult. We propose a novel technique that skews the wakeup timings of fine-grain local power domains to suppress the ground bounce. The delay of the buffers driving the power switches is skewed in the buffer tree by selectively downsizing them. We designed a MIPS R3000 based CPU core in a 90nm CMOS technology and applied our technique to internal function units. Simulation results showed that our technique reduces the rush current to 47% of that in the case where the power switches are turned on simultaneously. This resulted in suppressing the ground bounce to 53 mV with a 3.3 ns wakeup time. Simulation results from running benchmark programs showed that the total power dissipation for the function units was reduced by up to 15% at 25°C and by 62% at 100°C. Effectiveness in power savings is discussed from the viewpoint of the temperature-dependent break-even points and the consecutive idle time in the program.
Citations: 39
Simultaneous Peak Temperature and Average Power Minimization during Behavioral Synthesis
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.78
V. Krishnan, S. Katkoori
With continuous CMOS scaling and increasing operating frequencies, power and thermal concerns have become critical design issues in current and future high-performance integrated circuits. Elevated chip temperatures adversely impact circuit performance and reliability. On-chip thermal gradients can lead to unpredictable clock skew variations and timing failures. Chip temperatures are influenced by design decisions at the behavioral and physical-synthesis levels. Existing low-power design techniques cannot adequately address thermal issues since their optimization objectives fail to capture the spatial nature of on-chip thermal gradients. We present an algorithm for thermally-aware low-power behavioral synthesis that concurrently minimizes average power and peak chip temperature. Our algorithm uses accurate floorplan-based temperature estimates to guide behavioral synthesis. Compared to traditional low-power synthesis, our method reduces peak temperatures by as much as 23%, with less than 10% overhead in chip area.
Citations: 6
Built in Self Test Based Design of Wave-Pipelined Circuits in ASICs
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.46
V. Vireen, N. Venugopalachary, G. Seetharaman, B. Venkataramani
Wave-pipelining enables digital systems to be operated at higher frequencies by properly selecting the clock periods and clock skews so as to latch the output of combinational logic circuits at stable periods. In the literature, only trial-and-error and manual procedures are adopted for these selections. The major contribution of this paper is a proposal for automating the above procedure for the ASIC implementation of wave-pipelined circuits using a built-in self-test approach. For the purpose of verification, a coordinate rotation digital computer (CORDIC) and filters using the distributed arithmetic algorithm are implemented. To test the efficacy, these circuits are implemented by adopting three schemes: wave-pipelining, pipelining and non-pipelining. From the implementation results, it is observed that the wave-pipelined circuits are 21-29% faster than non-pipelined circuits. The pipelined circuits are 22-48% faster than wave-pipelined circuits, but at the cost of about an 18-28% increase in area.
Citations: 0
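For readers unfamiliar with the coordinate rotation digital computer named as a verification circuit in the abstract above, the rotation-mode iteration can be sketched in software. This is a floating-point illustration of the standard algorithm under stated assumptions, not the authors' wave-pipelined hardware: a real implementation would use fixed-point shifts and adds, and the function name and iteration count here are hypothetical.

```python
import math


def cordic_sin_cos(theta, iterations=24):
    """Approximate (sin(theta), cos(theta)) with rotation-mode CORDIC.

    Valid for |theta| up to roughly 1.74 rad, the sum of the elementary angles.
    """
    # Elementary rotation angles atan(2^-i) and the accumulated gain correction.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))

    # Start on the x axis, pre-scaled so the final vector has unit length.
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate toward zero residual angle
        # In hardware the multiplications by 2^-i are plain right shifts.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x


s, c = cordic_sin_cos(0.5)
print(abs(s - math.sin(0.5)) < 1e-5, abs(c - math.cos(0.5)) < 1e-5)
```

Each iteration depends only on shifts, adds and a sign decision, which is why CORDIC is a common test vehicle for pipelining experiments of the kind the abstract describes.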
Reversible Logic Synthesis with Output Permutation
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.DESIGN.2009.40
R. Wille, Daniel Große, G. Dueck, R. Drechsler
Synthesis of reversible logic has become a very important research area. In recent years several algorithms, heuristic as well as exact ones, have been introduced in this area. Typically, they take as input the specification of a reversible function in terms of a truth table. Here, the positions of the outputs are fixed. However, in general it is irrelevant how the respective outputs are ordered. Thus, a synthesis methodology is proposed that determines, for a given reversible function, an equivalent circuit realization modulo output permutation. More precisely, the result of the synthesis process is a circuit realization whose output functions have been permuted in comparison to the original specification, together with the respective permutation vector. We show that this synthesis methodology may lead to significantly smaller realizations. We apply Synthesis with Output Permutation (SWOP) to both an exact and a heuristic synthesis algorithm. As our experiments show, using the new synthesis paradigm leads to multiple control Toffoli networks that are smaller than the currently best known realizations.
Citations: 54
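The idea of treating a reversible specification "modulo output permutation" in the abstract above can be illustrated with a small sketch. It is not the SWOP algorithm itself: it only shows how reordering the output bits of a truth table yields alternative specifications that a synthesizer could then target. The truth-table encoding (a list where entry x is the output word for input x) and all function names are assumptions made for the example.

```python
from itertools import permutations


def is_reversible(table):
    """A function on n bits is reversible iff its truth table is a bijection."""
    return sorted(table) == list(range(len(table)))


def permute_outputs(table, perm):
    """Reorder output bits: bit i of each new output is bit perm[i] of the old."""
    result = []
    for y in table:
        z = 0
        for i, src in enumerate(perm):
            z |= ((y >> src) & 1) << i
        result.append(z)
    return result


def output_permutations(table, n_bits):
    """Yield every (permutation vector, permuted truth table) pair."""
    for perm in permutations(range(n_bits)):
        yield perm, permute_outputs(table, perm)


# A 2-bit specification that merely swaps its two lines.
swap_spec = [0b00, 0b10, 0b01, 0b11]
print(is_reversible(swap_spec))            # True
print(permute_outputs(swap_spec, (1, 0)))  # [0, 1, 2, 3]
```

In this toy case the permuted specification is the identity, which needs no gates at all, whereas realizing the swap with fixed output positions requires gates. An approach along the lines described would keep the permutation vector, here (1, 0), alongside the smaller circuit.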
Coping with Variations through System-Level Design
Pub Date : 2009-01-05 DOI: 10.1109/VLSI.Design.2009.96
N. Banerjee, Saumya Chandra, Swaroop Ghosh, S. Dey, A. Raghunathan, K. Roy
Manufacturing and operation-induced variations have emerged as a critical challenge in designing integrated circuits (ICs) under the nanometer technology regime. Most work on addressing variations has focused on device, circuit, and logic-level solutions. As the magnitude of parameter variations increases with technology scaling, these techniques are not sufficient to address the negative impact that variations have on IC performance, power, yield, and design time. Therefore, in recent years, the research community has shown great interest in techniques to address variations starting from the other end of the design process, i.e., at the system level. In this paper, we provide an overview of various techniques that we have developed for coping with variations through system-level design. The presented techniques include a paradigm for designing variation-tolerant systems through critical path isolation for timing adaptiveness, application-specific techniques to achieve variation-tolerance by trading off quality of the result, variation-aware system-level power analysis, and system-level power management under variations. These techniques demonstrate that addressing variations during system-level design can greatly mitigate the effects of variations, enabling the design of integrated circuits in scaled technologies.
Citations: 3
Journal
2009 22nd International Conference on VLSI Design