
2008 Asia and South Pacific Design Automation Conference: Latest Papers

An optimal algorithm for sizing sequential circuits for industrial library based designs
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483929
Sanghamitra Roy, Y. Hu, C. C. Chen, Shih-Pin Hung, Tse-Yu Chiang, Jiuan-Guei Tseng
In this paper, we propose an optimal gate sizing and clock skew optimization algorithm for globally sizing synchronous sequential circuits. The number of constraints and variables in our formulation is linear in the number of circuit components, so our algorithm can efficiently find the optimal solution for industrial-scale designs. To the best of our knowledge, our method is the first exact gate sizing algorithm that can handle cyclic sequential circuits. Experimental results on industrial cell libraries demonstrate that our algorithm can yield an average 12.6% improvement in the optimal clock period by combining clock skew optimization with gate sizing. For an identical clock period, our algorithm can achieve an average of 11.3% area savings over a popular commercial synthesis tool.
Citations: 6
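The abstract above combines clock skew optimization with gate sizing. As a rough illustration of the skew-optimization half only, the sketch below uses the classical difference-constraint formulation of clock skew scheduling, which is not the paper's algorithm (the abstract does not give it). For a trial period T, each register-to-register path yields one setup and one hold constraint, feasibility is checked with Bellman-Ford, and the smallest feasible T is found by bisection. The function names, the two-flip-flop example, and the simplification of zero setup and hold times are assumptions made for illustration.

```python
def skews_feasible(num_ffs, paths, period):
    """Check whether clock arrival times exist that meet all timing constraints.

    paths: list of (src_ff, dst_ff, min_delay, max_delay).
    Assumes zero setup/hold times (an illustrative simplification).
    """
    # An edge (u, v, w) encodes the difference constraint s_v <= s_u + w.
    edges = []
    for i, j, dmin, dmax in paths:
        edges.append((j, i, period - dmax))  # setup: s_i + dmax <= s_j + period
        edges.append((i, j, dmin))           # hold:  s_j <= s_i + dmin
    # Bellman-Ford from an implicit source connected to every node with weight 0.
    dist = [0.0] * num_ffs
    for _ in range(num_ffs):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return True   # converged: no negative cycle, constraints are feasible
    return False          # still relaxing after num_ffs passes: negative cycle


def min_period_with_skew(num_ffs, paths, tol=1e-6):
    """Bisect on the clock period; assumes all min delays are non-negative."""
    lo, hi = 0.0, max(p[3] for p in paths)  # zero skew is feasible at the longest delay
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if skews_feasible(num_ffs, paths, mid):
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical two-flip-flop loop: FF0 -> FF1 is slow, FF1 -> FF0 is fast.
example_paths = [(0, 1, 2.0, 6.0), (1, 0, 1.0, 2.0)]
zero_skew_period = max(p[3] for p in example_paths)
skewed_period = min_period_with_skew(2, example_paths)
print(zero_skew_period, round(skewed_period, 3))
```

In this toy loop the zero-skew period is 6 (the longest path), while skewing the two clocks permits a period of about 4. The numbers are illustrative only and are unrelated to the 12.6% figure reported in the abstract.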
A low-leakage current power 180-nm CMOS SRAM
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483914
T. Enomoto, Yuki Higuchi
A low-leakage-power, 180-nm, 1-Kbit SRAM was fabricated. The standby leakage power of a 1-Kbit memory cell array incorporating a newly developed leakage current reduction circuit, called a "self-controllable voltage level (SVL)" circuit, was only 3.7 nW, which is 5.4% of that of an equivalent conventional memory cell array at a VDD of 1.8 V. The speed remained almost constant, with minimal overhead in memory cell array area.
Citations: 10
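The two figures quoted in the abstract above imply the leakage power of the conventional array, which the abstract does not state directly. The short calculation below derives it. The variable names are illustrative, and the result is only as precise as the rounded 3.7 nW and 5.4% values.

```python
# Figures taken from the abstract (1-Kbit array, VDD = 1.8 V).
svl_leakage_nw = 3.7   # standby leakage with the SVL circuit, in nW
svl_fraction = 0.054   # SVL leakage as a fraction of the conventional array's

# Implied values (derived here, not reported in the abstract).
conventional_leakage_nw = svl_leakage_nw / svl_fraction
reduction_factor = 1.0 / svl_fraction

print(f"{conventional_leakage_nw:.1f} nW, {reduction_factor:.1f}x")
```

Under these rounded inputs, the conventional array would leak roughly 68.5 nW, so the SVL circuit corresponds to a reduction of about 18.5 times.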
Floating-point reconfiguration array processor for 3D graphics physics engine
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483956
Hoonmo Yang
We implemented an RTL model of the proposed reconfiguration array (RA) and performed simulation in the RealView co-verification environment by executing examples that use a physics engine. We found that if the physics engine part is accelerated by the RA, the workloads run over 20 times faster than pure software without an FPU and over 4 times faster than pure software with an FPU. If the code is well partitioned and optimized for the proposed RA, which remains for future study, even more improvement can be expected.
Citations: 3
Timing-power optimization for mixed-radix Ling adders by integer linear programming
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483926
Yi Zhu, Jianhua Liu, Haikun Zhu, Chung-Kuan Cheng
This paper optimizes the timing and power consumption of mixed-radix Ling adders under physical area constraints using an integer linear programming formulation. Each cell in the prefix network can have a different radix and size, and Ling carries are incorporated. Optimal solutions are obtained by solving the proposed formulation. The experiments show that the resulting optimal structures save a large amount of power compared with traditional designs. The ASIC implementation results are superior to those produced by Synopsys Module Compiler.
Citations: 6
MeshWorks: An efficient framework for planning, synthesis and optimization of clock mesh networks
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483951
A. Rajaram, D. Pan
A leaf-level clock mesh is known to be very tolerant to variations (Restle et al., 2001). However, its use is limited to a few high-end designs because of its high power and resource requirements and the lack of automatic mesh synthesis tools. Most existing work on clock meshes (Restle et al., 2001) either deals with semi-custom design or performs optimizations on a given clock mesh. However, the problem of obtaining a good initial clock mesh has not been addressed. Similarly, the problem of achieving a smooth tradeoff between skew and power/resources has not been addressed adequately. In this work, we present MeshWorks, the first comprehensive automated framework for planning, synthesis and optimization of clock mesh networks, with the objective of addressing the above issues. Experimental results suggest that our algorithms can achieve an additional reduction of 26% in buffer area, 19% in wirelength and 18% in power compared to the recent work of Venkataraman et al. (2006), with similar worst-case maximum frequency under variation.
Citations: 55
Robust on-chip bus architecture synthesis for MPSoCs under random tasks arrival
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4484022
S. Pandey, R. Drechsler
A major trend in modern system-on-chip design is growing system complexity, which results in a sharp increase in communication traffic on on-chip communication bus architectures. In a real-time embedded system, the task arrival rate, the inter-task arrival time, and the size of the data to be transferred are not uniform over time. This is due to the partial reconfiguration of an embedded system to cope with a dynamic workload. In this context, traditional application-specific bus architectures may fail to meet the real-time constraints. Thus, to incorporate the random behavior of on-chip communication, this work proposes an approach to synthesize an on-chip bus architecture that is robust for given distributions of random tasks. The randomness of communication tasks is characterized by three main parameters: the average task arrival rate, the average inter-task arrival time, and the data size. For synthesis, the on-chip bus requirement is guided by the worst-case performance need, while dynamic voltage scaling is used to save energy when the workload is low or timing slack is high. This, in turn, results in effective utilization of communication resources under a variable workload.
Citations: 5
Webpage-based benchmarks for mobile device design
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4484060
Marc Somers, J. M. Paul
Computers are currently designed using benchmarks and specification styles that are decades old, even as computers are being used in fundamentally different ways. By investigating the content, structure and usage of webpages, we observe that webpages represent a fundamentally different standard for performance evaluation of computers. We gathered data and modeled typical webpage content in order to characterize what is becoming a uniquely important design space. We then included this data in a set of simulations that also included models of a variety of scheduler types and heterogeneous multiprocessor architectures. In addition, we proposed usage patterns that we believe typify the way people access the Internet on mobile devices. Considering only modern-day content in webpages, we found that specialized architectures can improve performance by up to 70% over a homogeneous multiprocessor composed of general-purpose processors, with a 25% additional improvement over the next best architecture when individual user preferences are also considered. This trend will increase as webpages become more differentiated in purpose and more complex in content. A new model of performance evaluation of computing must be developed, based upon webpage content and webpage access patterns.
Citations: 5
Experiences of low power design implementation and verification
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4484050
Shi-Hao Chen, Jiing-Yuan Lin
In this paper, we present our experience with several low-power solutions that have been successfully implemented in 90 nm/65 nm production tape-outs. We also focus on power gating design, an effective low-leakage solution, and present our experience with power switch planning, optimization, and verification. Dynamic IR drop is an important issue in low-power design, as it may reduce logic gate noise margins and result in functional or timing failures. We present a low-cost but effective methodology for preventing and fixing dynamic IR drop.
Citations: 15
Within-die process variations: How accurately can they be statistically modeled?
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4484007
Brendan Hargreaves, Henrik Hult, S. Reda
Within-die process variations arise during integrated circuit (IC) fabrication in the sub-100 nm regime. These variations are of paramount concern because they cause the performance of ICs to deviate from their designers' original intent. These deviations reduce the parametric yield and the revenue from integrated circuit fabrication. In this paper we provide a complete treatment of the subject of within-die variations. We propose a scan-chain based system, vMeter, to extract within-die variations in an automated fashion. We implement our system in a sample of 90 nm chips and collect the within-die variation data. Then we propose a number of novel statistical analysis techniques that accurately model the within-die variation trends and capture the spatial correlations. We propose the use of maximum-likelihood techniques to find the parameters required to fit the model to the data. The accuracy of our models is statistically verified through residual analysis and variograms. Using our successful modeling technique, we propose a procedure to generate synthetic within-die variation patterns that mimic real silicon data.
Citations: 59
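The abstract above mentions variograms as one way to verify spatial models of within-die variation. As a minimal sketch of that idea, the code below computes a classical (Matheron) empirical semivariogram, which bins pairs of measurement sites by distance and averages half the squared difference of their values. It is not the paper's procedure; the function name, the isotropic distance binning, and the three synthetic measurements are assumptions made for illustration.

```python
import math
from collections import defaultdict


def empirical_semivariogram(points, values, bin_width):
    """Return {bin_index: semivariance}, where bin_index = floor(distance / bin_width).

    points: list of (x, y) site coordinates on the die (hypothetical units).
    values: measured quantity at each site, e.g. a delay (hypothetical data).
    """
    squared_diff_sums = defaultdict(float)
    pair_counts = defaultdict(int)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            distance = math.dist(points[i], points[j])
            bin_index = int(distance // bin_width)
            squared_diff_sums[bin_index] += (values[i] - values[j]) ** 2
            pair_counts[bin_index] += 1
    # Classical estimator: gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)).
    return {b: squared_diff_sums[b] / (2 * pair_counts[b]) for b in sorted(pair_counts)}


# Synthetic example: three sites on a line with increasing measured values.
sites = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
measurements = [1.0, 2.0, 4.0]
gamma = empirical_semivariogram(sites, measurements, bin_width=1.0)
print(gamma)  # {1: 1.25, 2: 4.5}
```

A semivariance that grows with distance, as in this toy data, indicates that nearby sites are more alike than distant ones, which is the kind of spatial correlation the abstract refers to.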
A multicycle communication architecture and synthesis flow for Global interconnect Resource Sharing
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483933
Wei-Sheng Huang, Y. Hong, Juinn-Dar Huang, Ya-Shih Huang
In deep submicron technology, wire delay is no longer negligible and is gradually dominating system latency. Some state-of-the-art architectural synthesis flows adopt the distributed register (DR) architecture to cope with this increasing latency. The DR architecture, though it allows multicycle communication, introduces extra overhead on interconnect resources. In this paper, we propose the regular distributed register - global resource sharing (RDR-GRS) architecture to enable global sharing of interconnects and registers. Based on the RDR-GRS architecture, we further define the channel and register allocation problem as a path scheduling problem of data transfers. A formal and flexible formulation of this problem is then presented and optimally solved by Integer Linear Programming (ILP). Experimental results show that RDR-GRS/ILP reduces wires by 58% and registers by 35% on average compared to the previous work.
Citations: 8