
2012 25th International Conference on VLSI Design: Latest Papers

Kriging-Assisted Ultra-Fast Simulated-Annealing Optimization of a Clamped Bitline Sense Amplifier
Pub Date : 2012-01-07 DOI: 10.1109/VLSID.2012.89
Oghenekarho Okobiah, S. Mohanty, E. Kougianos, Oleg Garitselov
SPICE simulation provides accurate design exploration but is time-consuming and can be infeasible for large circuits. Continued technology scaling requires that more circuit parameters be accounted for, along with process variation effects. Regression models have been widely researched, and while they offer acceptable accuracy for simulation purposes, they fail to account for the strong correlation between parameters in the design. This paper presents an ultra-fast design-optimization flow that combines correlation-aware Kriging metamodels with a simulated annealing algorithm that operates on them. The Kriging-based method generates metamodels of a clamped bitline sense amplifier circuit that take into account the effects of correlation among the design and process parameters. A simulated-annealing-based algorithm then optimizes the circuit through the Kriging metamodel. The results show that the Kriging metamodels are very accurate, with very low error. The optimization algorithm finds an optimized precharge time, with power consumption as a constraint, in an average execution time of 2.78 ms, compared with 45 minutes for an exhaustive search of the design space, i.e., close to 10^6× faster. To the best of the authors' knowledge, this is the first paper that uses Kriging and simulated annealing for nano-CMOS design.
Citations: 14
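The flow described in the abstract above, simulated annealing running on a cheap metamodel instead of SPICE, can be sketched roughly as follows. This is only an illustration under stated assumptions: `precharge_time` and `power` are hypothetical closed-form stand-ins for the Kriging metamodels, the two normalized design parameters `w1` and `w2`, the power budget, and the cooling schedule are all invented for the example, and none of it is taken from the paper.

```python
import math
import random


def precharge_time(w1, w2):
    # Hypothetical surrogate: wider devices precharge faster (not from the paper).
    return 1.0 / (0.2 + w1) + 1.0 / (0.2 + w2)


def power(w1, w2):
    # Hypothetical surrogate: wider devices burn more power (not from the paper).
    return 0.5 + w1 + w2


def anneal(budget=1.5, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Minimize precharge_time subject to power <= budget on the surrogate."""
    rng = random.Random(seed)

    def cost(x):
        # Constraint handled as a penalty on the amount of budget overshoot.
        over = max(0.0, power(*x) - budget)
        return precharge_time(*x) + 100.0 * over

    current = (0.1, 0.1)              # feasible starting point
    cur_cost = cost(current)
    best, best_cost = current, cur_cost
    t = t0
    for _ in range(steps):
        # Perturb each parameter and clip to the normalized range [0, 1].
        cand = tuple(min(1.0, max(0.0, v + rng.gauss(0.0, 0.05))) for v in current)
        c = cost(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if c < cur_cost or rng.random() < math.exp((cur_cost - c) / t):
            current, cur_cost = cand, c
            # Only feasible points are remembered as the best solution.
            if power(*cand) <= budget and c < best_cost:
                best, best_cost = cand, c
        t *= cooling                  # geometric cooling schedule
    return best, best_cost


best_point, best_precharge = anneal()
```

Because each cost evaluation is a couple of arithmetic operations rather than a circuit simulation, thousands of annealing steps stay cheap, which is the effect the abstract reports: 45 minutes divided by 2.78 ms is roughly 9.7 × 10^5.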
Efficient Online RTL Debugging Methodology for Logic Emulation Systems
Pub Date : 2012-01-07 DOI: 10.1109/VLSID.2012.87
Somnath Banerjee, T. Gupta
The offline debugging model provided by logic emulation systems has some specific disadvantages. Since analysis of signal traces and bug fixing are decoupled from the emulation run, validating a potential fix requires a costly iteration through the design recompilation and mapping process, followed by a fresh emulation run. This slows down the overall verification process. This paper presents an online debugging methodology that achieves rapid verification closure, with the capability to execute the design backward and forward for debug. On encountering an error, the design under test (DUT) can be reverse-executed step by step to locate the source of the error. A two-pass emulation technique is used to generate the checkpoints and traces needed to support reverse execution. Easy and efficient reverse-execution-based debug is supported by an innovative technique called optimized design slicing, which allows debug along a meaningful design portion likely to cause the error being investigated. Once the source of the error is located, potential bug fixes can be evaluated online by forcing a set of signals to desired values, without going through the design recompilation process and restarting emulation from time 0. Benchmarks on several customer designs have shown that the methodology enhances verification performance significantly.
Citations: 11
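The abstract above mentions checkpoints and traces as the basis for reverse execution. A common way to get a "step back" operation from periodic checkpoints is to restore the nearest earlier checkpoint and replay forward; the sketch below shows that general idea on a toy deterministic design. The class `ReversibleSim`, the function `toy_step`, and the checkpoint interval are hypothetical names chosen for illustration, and the paper's two-pass technique and design slicing are not reproduced here.

```python
def toy_step(state, cycle):
    # Hypothetical 16-bit design: next state depends on state and cycle index.
    # In a real emulator, per-cycle inputs would come from a recorded trace.
    return (state * 5 + cycle + 1) & 0xFFFF


class ReversibleSim:
    """Forward stepping with periodic checkpoints; reverse via restore + replay."""

    def __init__(self, step_fn, init_state, checkpoint_every=8):
        self.step_fn = step_fn
        self.interval = checkpoint_every
        self.checkpoints = {0: init_state}   # cycle number -> saved state
        self.cycle = 0
        self.state = init_state

    def step_forward(self):
        self.state = self.step_fn(self.state, self.cycle)
        self.cycle += 1
        if self.cycle % self.interval == 0:
            self.checkpoints[self.cycle] = self.state

    def step_back(self):
        if self.cycle == 0:
            raise ValueError("already at cycle 0")
        target = self.cycle - 1
        # Nearest checkpoint at or before the target cycle.
        base = (target // self.interval) * self.interval
        state = self.checkpoints[base]
        # Replay deterministically from the checkpoint up to the target.
        for c in range(base, target):
            state = self.step_fn(state, c)
        self.state, self.cycle = state, target


sim = ReversibleSim(toy_step, init_state=0, checkpoint_every=4)
for _ in range(10):
    sim.step_forward()
sim.step_back()   # now at cycle 9, recomputed from the checkpoint at cycle 8
```

The checkpoint interval trades storage against replay cost: a single backward step never replays more than `checkpoint_every - 1` cycles.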
Tutorial T6: Variability-resistant Software and Hardware for Nano-Scale Computing
Pub Date : 2012-01-07 DOI: 10.1109/VLSID.2012.33
N. Dutt, M. Srivastava, Rajesh K. Gupta, S. Mitra
As semiconductor manufacturers build ever smaller components, circuits and chips at the nano scale become less reliable and more expensive to produce, no longer behaving like precisely chiseled machines with tight tolerances. Modern computing tends to ignore the variability in behavior of underlying system components from device to device, their wear-out over time, or the environment in which the computing system is placed. This makes them expensive, fragile and vulnerable to even the smallest changes in the environment or component failures. This tutorial presents an approach to tame and exploit variability through a strategy where system components -- led by proactive software -- routinely monitor, predict and adapt to the variability of manufactured systems. Unlike conventional system design, where variability is hidden behind the conservative specifications of "over-designed" hardware, we describe strategies that expose spatiotemporal variations in hardware to the highest layers of software. After presenting the background and positioning the new approach, the tutorial will proceed in a bottom-up fashion. Causes of variability at the circuit and hardware levels are presented first, followed by classical approaches to hiding such variability. The tutorial then presents a number of strategies at successively higher levels of abstraction, covering the circuit, microarchitecture, compiler, operating systems and software applications, to monitor, detect, adapt to, and exploit the exposed variability. Adaptable software will use online statistical modeling to learn and predict actual hardware characteristics, opportunistically adjust to variability, and proactively conform to deliberately underdesigned hardware with relaxed design and manufacturing constraints. The resulting class of UnO (Underdesigned and Opportunistic) computing machines is adaptive but highly energy efficient. They will continue working while using components that vary in performance or grow less reliable over time and across technology generations. A fluid software-hardware interface will mitigate the variability of manufactured systems and make machines robust, reliable and responsive to changing operating conditions, offering the best hope for perpetuating the fundamental gains in computing performance at lower cost seen over the past 40 years.
Citations: 0
Externally Tested Scan Circuit with Built-In Activity Monitor and Adaptive Test Clock
Pub Date : 2012-01-07 DOI: 10.1109/VLSID.2012.112
Priyadharshini Shanmugasundaram, V. Agrawal
We reduce the time of external tests applied from automatic test equipment (ATE) by speeding up low-activity cycles without exceeding the specified peak power budget. For this purpose, an activity monitor is implemented either as hardware or as pre-simulated and stored test data. The achieved test time reduction depends on the input and output activity factors, α_in and α_out, of the scan chain. When on-circuit built-in hardware control is used, test time reductions of about 50% and 25% are possible for vectors with low input activity (α_in ≈ 0) and moderate input activity (α_in = 0.5), respectively, in ITC02 benchmark circuits. When stored pre-simulated test data is used, test time reductions of up to 99% are shown for vectors with low input and output activities.
Citations: 17
Fast-Accurate Non-Polynomial Metamodeling for Nano-CMOS PLL Design Optimization
Pub Date : 2012-01-07 DOI: 10.1109/VLSID.2012.90
Oleg Garitselov, S. Mohanty, E. Kougianos
In the nanoscale domain, the simulation, design, and optimization time of circuits has increased significantly due to high integration density, increasing technology constraints, and complex device models. This necessitates fast design space exploration techniques to meet the shorter time to market driven by consumer electronics. This paper presents non-polynomial metamodels (surrogate models) using neural networks to reduce the design optimization time of complex nano-CMOS circuits without sacrificing accuracy. The physical-design-aware neural networks are trained and used as metamodels to predict the frequency, locking time, and power of a PLL circuit. Different neural network architectures are compared with traditional polynomial functions generated for the same circuit characteristics. Experimental results show that only 100 sample points are sufficient for neural networks to predict the output of circuits with 21 design parameters within 3% accuracy, a 56% improvement in accuracy over polynomial metamodels. The generated metamodels are used to optimize the PLL using a bee colony algorithm. The non-polynomial (neural network) metamodels achieve more accurate results than polynomial metamodels in a shorter optimization time.
Citations: 18
Real-time Melodic Accompaniment System for Indian Music Using TMS320C6713
Pub Date : 2012-01-07 DOI: 10.1109/VLSID.2012.57
Prateek Verma, P. Rao
An instrumental accompaniment system for Indian classical vocal music is designed and implemented on a Texas Instruments TMS320C6713 digital signal processor. The system acts as a virtual accompanist that follows the main artist, possibly a vocalist. The melodic pitch information drives an instrument synthesis system, which makes it possible to play any pitched musical instrument virtually, following the singing voice in real time with a small delay. Additive synthesis is used to generate the desired tones of the instrument, with the necessary instrument constraints incorporated. The system is optimized with respect to the computational complexity and memory requirements of the algorithm, and its performance is studied for different combinations of singers and songs. The proposed system complements the automatic accompaniment already available for Indian classical music, namely the sruti and taala boxes.
Citations: 8
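Additive synthesis, named in the abstract above, builds an instrument tone by summing sinusoids at integer multiples of the tracked pitch. The sketch below shows the basic idea offline in Python; the function name `additive_tone`, the harmonic amplitudes, and the sample rate are assumptions made for illustration and do not reflect the paper's DSP implementation or its instrument constraints.

```python
import math


def additive_tone(f0, harmonic_amps, duration_s, sample_rate=8000):
    """Sum harmonics of pitch f0 (Hz); harmonic_amps[h] scales harmonic h + 1."""
    n = int(duration_s * sample_rate)
    nyquist = sample_rate / 2.0
    # Keep only harmonics below Nyquist to avoid aliasing.
    partials = [(h + 1, a) for h, a in enumerate(harmonic_amps)
                if (h + 1) * f0 < nyquist]
    # Normalize so the output stays within [-1, 1].
    norm = sum(abs(a) for _, a in partials) or 1.0
    samples = []
    for i in range(n):
        t = i / sample_rate
        value = sum(a * math.sin(2.0 * math.pi * k * f0 * t) for k, a in partials)
        samples.append(value / norm)
    return samples


# Hypothetical timbre: fundamental plus three weaker harmonics at 220 Hz.
tone = additive_tone(220.0, [1.0, 0.5, 0.25, 0.125], duration_s=0.5)
```

In a real-time accompanist, `f0` would be updated frame by frame from the pitch tracker, and phase continuity between frames would need to be maintained, which this sketch does not attempt.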
Tutorial T7A: New Modeling Methodologies for Thermal Analysis of 3D ICs and Advanced Cooling Technologies of the Future
Pub Date : 2012-01-07 DOI: 10.1109/VLSID.2012.34
David Atienza Alonso, A. Sridhar
Increasing circuit densities and the proliferation of Multi-Processor Systems-on-Chip (MPSoCs) and high-performance computing systems have resulted in an alarming rise in electronic heat dissipation levels, making conventional thermal management strategies, including air-cooled heat sinks, obsolete. The latest advancements in 3D integration of IC dies have only aggravated this problem, creating strong worldwide research interest in the development of advanced cooling technologies, such as interlayer microchannel liquid-cooled heat sinks, to keep ICs within safe operating temperatures. While this research has helped create a substantial knowledge base on the heat transfer mechanisms of advanced liquid cooling systems as applied to electronic circuits, this knowledge has yet to be transferred to the EDA community for incorporation into the IC thermal simulators of the future. Such tools become absolutely essential when IC designers face the challenge of ascertaining the thermal reliability of their designs in the presence of liquid cooling systems. This tutorial aims to introduce attendees to the key concepts needed to compute IC temperatures with and without microchannel liquid cooling, and to the principles behind compact modeling of forced convective heat transfer in advanced IC cooling technologies. A major part of this tutorial is based on the 3D-ICE thermal simulator, which has been built by the Embedded Systems Laboratory at EPFL, Switzerland (URL: http://esl.epfl.ch/3D-ICE). This simulator is based on the Compact Transient Thermal Modeling for forced convective cooling advanced by our research group. Since its release in 2010, more than 50 research groups across the world have downloaded it and are actively using it for their research.
Citations: 0
Analog Processing Based Equalizer for 40 Gbps Coherent Optical Links in 90 nm CMOS
Pub Date : 2012-01-07 DOI: 10.1109/VLSID.2012.54
Pawan Kumar Moyade, N. Nambath, Allmin Ansari, Shalabh Gupta
Intersymbol interference introduced by fiber non-idealities such as polarization mode dispersion and chromatic dispersion is expected to be one of the major limiting factors in achieving higher data rates over existing gigabit fiber-optic links. Receivers based on high-speed ADCs followed by DSPs will be limited by the need for massive parallelization and interconnects. We propose a coherent optical link receiver based on analog signal processing to drastically reduce power consumption, size, and cost. A 40 Gbps analog-processing adaptive DP-QPSK (dual-polarization quadrature phase shift keying) equalizer in 90 nm CMOS technology, dissipating 450 mW of power, is demonstrated using simulations. A complete analog processing receiver is expected to consume less than one-tenth of the power consumed by a chip that uses ADCs followed by signal processing in a DSP.
Citations: 7
Buffer Design and Eye-Diagram Based Characterization of a 20 GS/s CMOS DAC
Pub Date : 2012-01-07 DOI: 10.1109/VLSID.2012.53
M. Singh, Shalabh Gupta
High-speed digital-to-analog converters (DACs) have become indispensable with the advent of multilevel modulation formats, which are needed to meet the increasing demand for high data rates in communication systems. In this paper, a 4-bit 20 GS/s DAC is designed in 90 nm CMOS technology. By fully integrating digital and RF blocks, CMOS-based DACs provide a low-cost single-IC solution compared with their compound-semiconductor counterparts. An on-chip linear feedback shift register (LFSR) generates the required high-speed broadband data, and the eye diagram of the DAC output is used for characterization. To drive the high capacitive loads, together with routing, electrostatic discharge (ESD), and pad capacitance (≈800 fF), at a speed of 20 GS/s (13.1 GHz bandwidth), a new buffer architecture has also been implemented.
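The LFSR-based stimulus mentioned in this abstract can be illustrated in software. The sketch below is only an assumption-laden model: the abstract does not state the LFSR length or polynomial, so the common PRBS7 polynomial (x^7 + x^6 + 1) is used as a stand-in, and the names `prbs7` and `dac_codes` are hypothetical helpers, not taken from the paper.

```python
def prbs7(seed=0x7F):
    """Yield bits of a PRBS7 sequence (x^7 + x^6 + 1).

    The polynomial is an assumed example; the paper's LFSR is not specified.
    """
    if not 0 < seed < 128:
        raise ValueError("seed must be a non-zero 7-bit value")
    state = seed
    while True:
        # Feedback bit is the XOR of the two tapped stages (stages 7 and 6).
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit


def dac_codes(n_codes, bits_per_code=4, seed=0x7F):
    """Pack consecutive LFSR bits into n_codes input words for a 4-bit DAC."""
    source = prbs7(seed)
    codes = []
    for _ in range(n_codes):
        word = 0
        for _ in range(bits_per_code):
            word = (word << 1) | next(source)
        codes.append(word)
    return codes
```

Calling `dac_codes(32)` returns 32 pseudo-random 4-bit words; overlaying the DAC's response to such a stream, one symbol period at a time, is what produces an eye diagram.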
Citations: 1
Temperature-aware Task Partitioning for Real-Time Scheduling in Embedded Systems
Pub Date : 2012-01-07 DOI: 10.1109/VLSID.2012.64
Zhe Wang, S. Ranka, P. Mishra
Both the power and heat density of on-chip systems are increasing exponentially with Moore's Law. High temperature negatively affects reliability as well as the costs of cooling and packaging. In this paper, we propose task partitioning as an effective way to reduce the peak temperature in embedded systems running either a set of periodic heterogeneous tasks with a common period or periodic heterogeneous tasks with individual periods. For task sets with a common period, experimental results show that our task partitioning algorithms are able to reduce the peak temperature by as much as 5.8°C compared to algorithms that only use task sequencing. For task sets with individual periods, EDF scheduling with task partitioning can also lower the peak temperature by as much as 6°C compared to simple EDF scheduling. Our analysis indicates that the number of additional context switches (overhead) is less than 2 per task, which is tolerable in many practical scenarios.
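A toy calculation can show why splitting tasks into pieces and interleaving them may lower the peak temperature. The sketch below is not the authors' algorithm or thermal model: it assumes a single lumped RC thermal node, invented power and timing values, equal-sized pieces, and zero context-switch overhead. The names `peak_temperature` and `partition` are hypothetical.

```python
import math


def peak_temperature(schedule, periods=50, r_th=2.0, c_th=0.005, t_amb=45.0):
    """Peak temperature (deg C) of an assumed lumped RC thermal model.

    schedule: list of (power_watts, duration_seconds) slices forming one period.
    r_th (K/W), c_th (J/K) and t_amb (deg C) are illustrative values only.
    """
    temp = t_amb
    peak = temp
    for _ in range(periods):
        for power, duration in schedule:
            # Temperature moves exponentially towards the slice's steady state.
            steady = t_amb + power * r_th
            decay = math.exp(-duration / (r_th * c_th))
            temp = steady + (temp - steady) * decay
            # Within a slice the temperature is monotonic, so the peak
            # can only occur at a slice boundary.
            peak = max(peak, temp)
    return peak


def partition(schedule, pieces):
    """Split every slice into equal pieces and interleave them round-robin."""
    return [(power, duration / pieces)
            for _ in range(pieces)
            for power, duration in schedule]


# One hot and one cool task sharing a common 20 ms period (made-up numbers).
sequenced = [(10.0, 0.010), (2.0, 0.010)]
partitioned = partition(sequenced, 2)
print(peak_temperature(sequenced), peak_temperature(partitioned))
```

In this model the partitioned schedule peaks lower because the hot task runs for shorter stretches before a cooler slice lets the node recover. Each extra piece adds a context switch, which is the overhead the abstract quantifies.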
Citations: 9