Proceedings. 41st Design Automation Conference, 2004: Latest Articles

Area-efficient instruction set synthesis for reconfigurable system-on-chip designs
Pub Date : 2004-06-07 DOI: 10.1145/996566.996679
P. Brisk, A. Kaplan, M. Sarrafzadeh
Silicon compilers are often used in conjunction with Field Programmable Gate Arrays (FPGAs) to deliver flexibility, fast prototyping, and accelerated time-to-market. Many of these compilers produce hardware that is larger than necessary, as they do not allow instructions to share hardware resources. This study presents an efficient heuristic which transforms a set of custom instructions into a single hardware datapath on which they can execute. Our approach is based on the classic problems of finding the longest common subsequence and substring of two (or more) sequences. This heuristic produces circuits which are as much as 85.33% smaller than those synthesized by integer linear programming (ILP) approaches which do not explore resource sharing. On average, we obtained 55.41% area reduction for pipelined datapaths, and 66.92% area reduction for VLIW datapaths. Our solution is simple and effective, and can easily be integrated into an existing silicon compiler.
Citations: 103
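As a rough illustration of the longest-common-subsequence idea that the Brisk et al. abstract above builds on, the sketch below treats two custom instructions as linear sequences of operation names and merges them so that every operation in the common subsequence is instantiated only once. The operation names, the `merge_datapaths` helper, and the linear-sequence view of a datapath are assumptions made for illustration; the paper's actual heuristic is not reproduced here.

```python
def lcs(a, b):
    """Return one longest common subsequence of sequences a and b."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the front to recover one optimal subsequence.
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def merge_datapaths(a, b):
    """Hypothetical helper: build a common supersequence of a and b in
    which each operation of the LCS appears once and is shared."""
    merged, i, j = [], 0, 0
    for op in lcs(a, b):
        # Emit the unshared operations that precede the next shared one.
        while a[i] != op:
            merged.append(a[i])
            i += 1
        while b[j] != op:
            merged.append(b[j])
            j += 1
        merged.append(op)  # shared unit, instantiated once
        i += 1
        j += 1
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


instr_a = ["add", "mul", "shift", "add"]  # made-up custom instruction
instr_b = ["mul", "add", "shift"]         # made-up custom instruction
merged = merge_datapaths(instr_a, instr_b)
print(len(instr_a) + len(instr_b), "units unshared,", len(merged), "units shared")
```

For these two made-up instructions the shared sequence needs five functional units instead of seven. A real datapath merge would also have to account for interconnect and multiplexer cost, which this sketch ignores.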
Compact thermal modeling for temperature-aware design
Pub Date : 2004-06-07 DOI: 10.1145/996566.996800
Wei Huang, M. Stan, K. Skadron, K. Sankaranarayanan, S. Ghosh, S. Velusamy
Thermal design in sub-100nm technologies is one of the major challenges to the CAD community. In this paper, we first introduce the idea of temperature-aware design. We then propose a compact thermal model which can be integrated with modern CAD tools to achieve a temperature-aware design methodology. Finally, we use the compact thermal model in a case study of microprocessor design to show the importance of using temperature as a guideline for the design. Results from our thermal model show that a temperature-aware design approach can provide more accurate estimations, and therefore better decisions and faster design convergence.
Citations: 348
On the generation of scan-based test sets with reachable states for testing under functional operation conditions
Pub Date : 2004-06-07 DOI: 10.1145/996566.996813
I. Pomeranz
Design-for-testability (DFT) for synchronous sequential circuits allows the generation and application of tests that rely on non-functional operation of the circuit. This can result in unnecessary yield loss due to the detection of faults that do not affect normal circuit operation. Considering single stuck-at faults in full-scan circuits, a test vector consists of a primary input vector U and a state S. We say that the test vector consisting of U and S relies on non-functional operation if S is an unreachable state, i.e., a state that cannot be reached from all the circuit states. Our goal is to obtain test sets with states S that are reachable states. Given a test set C, the solution we explore is based on a simulation-based procedure to identify reachable states that can replace unreachable states in C. No modifications are required to the test generation procedure and no sequential test generation is needed. Our results demonstrate that, by using test vectors with reachable states, the proposed procedure is able to produce test sets that detect many of the circuit faults that are detectable using scan, and practically all the sequentially irredundant faults. The procedure is applicable to any type of scan-based test set, including test sets for delay faults.
Citations: 100
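The simulation-based idea in the Pomeranz abstract above can be sketched in a few lines: simulate the circuit from a known state to collect reachable states, then swap each unreachable scan-in state in a test set for a reachable one. Everything concrete here is an assumption for illustration: the toy `next_state` function (a 3-bit counter that only visits states 0 to 4), the random input sequence, and the choice of the reachable state at minimum Hamming distance as the replacement. The paper's actual procedure and selection criteria are not reproduced.

```python
import random


def next_state(state, inp):
    """Hypothetical circuit: a 3-bit register that counts 0..4 and wraps
    when inp is 1, and holds its value when inp is 0."""
    if inp == 1:
        return (state + 1) % 5
    return state


def collect_reachable(reset_state, num_cycles, seed=0):
    """Apply random primary inputs and record every state visited."""
    rng = random.Random(seed)
    state = reset_state
    seen = {state}
    for _ in range(num_cycles):
        state = next_state(state, rng.randint(0, 1))
        seen.add(state)
    return seen


def hamming(a, b):
    """Number of bit positions in which two state encodings differ."""
    return bin(a ^ b).count("1")


def replace_unreachable(test_set, reachable):
    """Replace the state S of each (U, S) test whose S was never observed
    in simulation by the closest observed state (an assumed policy)."""
    fixed = []
    for u, s in test_set:
        if s not in reachable:
            s = min(sorted(reachable), key=lambda r: hamming(r, s))
        fixed.append((u, s))
    return fixed


reachable = collect_reachable(reset_state=0, num_cycles=200)
tests = [(1, 7), (0, 2), (1, 5)]  # (primary input U, scan-in state S)
print(replace_unreachable(tests, reachable))
```

States that simulation never visits are only presumed unreachable, so a sketch like this errs on the conservative side. After replacement the test set would normally be fault-simulated again to measure how much coverage is retained.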
An algorithm for converting floating-point computations to fixed-point in MATLAB based FPGA design
Pub Date : 2004-06-07 DOI: 10.1145/996566.996701
Sanghamitra Roy, P. Banerjee
Most practical FPGA designs of digital signal processing applications are limited to fixed-point arithmetic owing to the cost and complexity of floating-point hardware. While mapping DSP applications onto FPGAs, a DSP algorithm designer, who often develops applications in MATLAB, must determine the dynamic range and desired precision of input, intermediate and output signals in a design implementation to ensure that the algorithm fidelity criteria are met. The first step in a flow to map MATLAB applications into hardware is the conversion of the floating-point MATLAB algorithm into a fixed-point version. This paper describes an approach to automate this conversion for mapping to FPGAs, by profiling the expected inputs to estimate errors. Our algorithm attempts to minimize the hardware resources while constraining the quantization error within a specified limit.
Citations: 29
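The trade-off described in the Roy and Banerjee abstract above (fewer bits versus bounded quantization error) can be illustrated with a small profiling loop. This is a minimal sketch under stated assumptions: it looks only at the fractional word length of a single signal, uses round-to-nearest quantization, and takes the worst absolute error over profiled samples as the error measure. The function names and sample values are made up, and the paper's algorithm is not reproduced.

```python
def quantize(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def min_frac_bits(samples, max_error, limit=32):
    """Smallest number of fractional bits whose worst-case quantization
    error over the profiled samples stays within max_error."""
    for frac_bits in range(limit + 1):
        worst = max(abs(x - quantize(x, frac_bits)) for x in samples)
        if worst <= max_error:
            return frac_bits
    return None  # no word length up to `limit` meets the error bound


profiled_inputs = [0.1, 0.25, -0.7, 0.333]  # made-up profiling data
print(min_frac_bits(profiled_inputs, max_error=0.01))  # prints 6
```

With five fractional bits the sample -0.7 is rounded to -0.6875, an error of 0.0125, so six bits is the first width that meets the 0.01 bound. A full conversion flow would also have to size the integer part from the dynamic range and propagate errors through intermediate signals.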
A fast hardware/software co-verification method for system-on-a-chip by using a C/C++ simulator and FPGA emulator with shared register communication
Pub Date : 2004-06-07 DOI: 10.1145/996566.996655
Yuichi Nakamura, Koh Hosokawa, I. Kuroda, Ko Yoshikawa, T. Yoshimura
This paper describes a new hardware/software co-verification method for System-On-a-Chip, based on the integration of a C/C++ simulator and an inexpensive FPGA emulator. Communication between the simulator and emulator occurs via a flexible interface based on shared communication registers. This method enables easy debugging, rich portability, and high verification speed, at a low cost. We describe the application of this environment to the verification of three different complex commercial SoCs, supporting concurrent hardware and embedded software development. In these projects, our verification methodology was used to perform complete system verification at 0.2-1.1 MHz, while supporting full graphical interface functions such as "waveform" or "signal dump" viewers, and debugging functions such as "step" or "break".
Citations: 68
Trends in the use of re-configurable platforms
Pub Date : 2004-06-07 DOI: 10.1145/996566.996685
M. Baron
Designers can create completely new processors with custom instruction set architectures (ISAs), using various methods involving configurable logic. Configurable technologies also enable designers to enhance the basic ISA of a standard processor, or the ISA of a proprietary processor, to execute at speed workloads for which the processor was not initially conceived. Contrary to some early beliefs, the idea behind creating a custom instruction is not to compress several existing ISA instructions into one cycle; it is to execute loops requiring hundreds or thousands of iterations faster than a single machine could, even if it were clocked at the top frequency afforded by state-of-the-art semiconductor speeds and temperature limitations. To achieve high performance, most configurable platforms execute loop iterations in parallel; operating on multiple data in one cycle can make up for engine frequency and power limitations. Aimed at implementations in ASIC technologies, configurable platforms can be defined as designer-created, mostly hardwired logic interfaced via ISA instruction enhancements. Re-configurable platforms were introduced only recently. Architectures employing FPGA-like structures instead of hardwired logic offer flexibility useful in addressing a broader range of applications and tracking evolving standards. The presentation surveys configurable and re-configurable structures including fabrics of processors, evolving trends, and the impact of soft-hardware development tools. Fabrics of processors were initially aimed at very high performance tasks in communications. This type of architecture is also beginning to be employed in low power applications where it can offer a ratio of performance-to-power exceeding that of an implementation using one or more general-purpose processors. Several emerging fabric configurations will be described and compared: base cores using a processor element (PE) and private memory for instructions and data, PEs using local instruction memory and communicating data, PEs that can change processing capabilities depending on the function to be executed, heterogeneous PEs, and others. Issues with software development tools have kept processor fabrics from being adopted by more designers: iterative optimal routing between PEs and assignment of functions have become additional burdens on the C/C++ language programmer. None of the proposed products has acquired enough traction to justify acceptance as a standard architecture. The key to a wider adoption of re-configurable engines will be found in the soft-hardware tools offered to the programmer: two types of soft-hardware tools will be described, one using program and explicit routing, the other employing hints that can generate program and routing.
Citations: 9
A methodology to improve timing yield in the presence of process variations
Pub Date : 2004-06-07 DOI: 10.1145/996566.996694
Sreeja Raj, S. Vrudhula, Janet Roveda
The ability to control the variations in IC fabrication process is rapidly diminishing as feature sizes continue towards the sub-100 nm regime. As a result, there is an increasing uncertainty in the performance of CMOS circuits. Accounting for the worst case values of all parameters will result in an unacceptably low timing yield. Design for Variability, which involves designing to achieve a given level of confidence in the performance of ICs, is fast becoming an indispensable part of IC design methodology. This paper describes a method to identify certain paths in the circuit that are responsible for the spread of timing performance. The method is based on defining a disutility function of the gate and path delays, which includes both the means and variances of the delay random variables. Based on the moments of this disutility function, an algorithm is presented which selects a subset of paths (called undominated paths) as being most responsible for the variation in timing performance. Next, a statistical gate sizing algorithm is presented, which is aimed at minimizing the delay variability of the nodes in the selected paths subject to constraints on the critical path delay and the area penalty. Monte-Carlo simulations with ISCAS '85 benchmark circuits show that our statistical optimization approach results in significant improvements in timing yield over traditional deterministic sizing methods.
Citations: 77
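The notion of timing yield evaluated by Monte-Carlo simulation in the Raj, Vrudhula and Roveda abstract above can be illustrated as follows. This is only a sketch: it assumes each gate delay is an independent Gaussian random variable, and the path list, delay values, and clock period are made up. The disutility function, undominated-path selection, and gate sizing algorithm from the paper are not reproduced.

```python
import random


def timing_yield(paths, clock_period, trials=10000, seed=0):
    """Estimate the fraction of sampled circuits whose slowest path
    meets the clock period.

    paths: list of paths, each a list of (mean, sigma) gate delays.
    Gate delays are drawn independently per path (an assumption that
    ignores gates shared between paths and spatial correlation).
    """
    rng = random.Random(seed)
    passed = 0
    for _ in range(trials):
        # Sample every gate delay and take the slowest path in this trial.
        worst = max(
            sum(rng.gauss(mean, sigma) for mean, sigma in path)
            for path in paths
        )
        if worst <= clock_period:
            passed += 1
    return passed / trials


# Two made-up paths with equal mean delay (3.0) but different spread.
paths = [
    [(1.0, 0.05), (2.0, 0.05)],
    [(1.5, 0.20), (1.5, 0.20)],
]
print(timing_yield(paths, clock_period=3.2))
```

The second path has the same mean delay as the first but a larger variance, so it accounts for most of the timing failures in this toy setup. That is the kind of path a variance-aware sizing step would want to target.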
Leakage in nano-scale technologies: mechanisms, impact and design considerations
Pub Date : 2004-06-07 DOI: 10.1145/996566.996571
A. Agarwal, C. Kim, S. Mukhopadhyay, K. Roy
The high leakage current in nanometer regimes is becoming a significant portion of power dissipation in CMOS circuits as threshold voltage, channel length, and gate oxide thickness are scaled. Consequently, the identification of different leakage components is very important for estimation and reduction of leakage. Moreover, the increasing statistical variation in the process parameters has led to significant variation in the transistor leakage current across and within different dies. Designing with the worst case leakage may cause excessive guard-banding, resulting in lower performance. This paper explores various intrinsic leakage mechanisms, including weak inversion, gate oxide tunneling, and junction leakage. Various circuit level techniques to reduce leakage energy and their design trade-offs are discussed. We also explore process variation compensating techniques to reduce delay and leakage spread, while meeting power and yield constraints.
Citations: 66
Modeling repeaters explicitly within analytical placement
Pub Date : 2004-06-07 DOI: 10.1145/996566.996759
Prashant Saxena, Bill Halpin
Recent works have shown that scaling causes the number of repeaters to grow rapidly. We demonstrate that this growth leads to massive placement perturbations that break the convergence of today's interleaved placement and repeater insertion flows. We then present two new force models for repeaters targeted towards analytical placement algorithms. Our experiments demonstrate the effectiveness of our repeater modeling technique in preserving placement convergence (often also accompanied by wirelength improvement) at the 45 and 32 nm technology nodes.
Citations: 16
Pre-layout wire length and congestion estimation
Pub Date : 2004-06-07 DOI: 10.1145/996566.996726
Qinghua Liu, M. Marek-Sadowska
In this paper, we study pre-layout wire length and congestion estimation. We find that two structural metrics, mutual contraction and net range, can be used to predict wire lengths. These metrics have different application ranges and complement each other. We also propose a new metric, the structural pin density, to capture the peak routing congestion of designs. Larger maximum pin densities usually lead to larger peak congestion in circuits with similar average congestion. We demonstrate experimentally a very good correlation of our pre-layout measures with post-layout interconnect lengths. We also isolate the structural netlist properties that cause the peak congestion.
Citations: 31