1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051): Latest Papers

Modeling and simulation of the interference due to digital switching in mixed-signal ICs
A. Demir, P. Feldmann
Introduces a methodology for the evaluation of the interference noise caused by digital switching activity in sensitive circuits of a mixed digital-analog chip. The digital switching activity is modeled stochastically as functions defined on Markov chains. The actual interference signal is obtained through the modulation of this discrete stochastic signal with real current injection patterns stored a priori in a pre-characterized library. The interference noise results from the propagation of these continuous stochastic signals through the linear network that models the chip power grid, substrate and relevant package parasitics. The interference noise power spectral density is computed by linear frequency-domain analysis. The methodology is implemented using advanced numerical techniques that are capable of tackling very large problems.
DOI: 10.1109/ICCAD.1999.810624 (https://doi.org/10.1109/ICCAD.1999.810624) | Pages: 70-74 | Published: 1999-11-07
Citations: 18
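The stochastic model in the Demir and Feldmann abstract above (switching activity as a function on a Markov chain, modulated by stored current injection patterns) can be sketched roughly as follows. This is only an illustration under assumptions: the two-state chain, the transition matrix `P`, and the per-cycle patterns are hypothetical values, and the sketch does not perform the paper's frequency-domain propagation through the power grid and substrate network.

```python
import random


def stationary(P, iters=1000):
    """Approximate the stationary distribution of transition matrix P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def interference_waveform(P, patterns, cycles, seed=0):
    """Emit one stored current pattern per clock cycle, chosen by the chain's current state."""
    rng = random.Random(seed)
    state = 0
    wave = []
    for _ in range(cycles):
        wave.extend(patterns[state])  # modulate: the state selects the injection pattern
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return wave


# Hypothetical example: state 0 = idle, state 1 = switching.
P = [[0.9, 0.1],
     [0.5, 0.5]]
patterns = [[0.0, 0.0, 0.0, 0.0],   # idle cycle injects no current
            [0.0, 1.0, 0.5, 0.1]]   # assumed pre-characterized current pulse
pi = stationary(P)
wave = interference_waveform(P, patterns, cycles=100)
```

The stationary probability of the switching state gives the long-run fraction of cycles that inject current; a spectral estimate of `wave` would be the next step toward a noise power spectral density.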
Cell replication and redundancy elimination during placement for cycle time optimization
I. Neumann, D. Stoffel, H. Hartje, W. Kunz
Presents a new timing-driven approach for cell replication tailored to the practical needs of standard cell layout design. Cell replication methods have been studied extensively in the context of generic partitioning problems. However, until now, it has remained unclear what practical benefit can be obtained from this concept in a realistic environment for timing-driven layout synthesis. Therefore, this paper presents a timing-driven cell replication procedure, demonstrates its incorporation into a standard cell placement and routing tool, and examines its benefit on the final circuit performance in comparison with conventional gate or transistor sizing techniques. Furthermore, we demonstrate that cell replication can deteriorate the stuck-at fault testability of circuits and show that stuck-at redundancy elimination must be integrated into the placement procedure. Experimental results demonstrate the usefulness of the proposed methodology and suggest that cell replication should be an integral part of the physical design flow complementing traditional gate sizing techniques.
DOI: 10.1109/ICCAD.1999.810614 (https://doi.org/10.1109/ICCAD.1999.810614) | Pages: 25-30 | Published: 1999-11-07
Citations: 16
A framework for testing core-based systems-on-a-chip
S. Ravi, G. Lakshminarayana, N. Jha
Available techniques for testing core-based systems-on-a-chip (SOCs) do not provide a systematic means for synthesising low-overhead test architectures and compact test solutions. In this paper, we provide a comprehensive framework that generates low-overhead compact test solutions for SOCs. First, we develop a common ground for addressing issues such as core test requirements, core access and test hardware additions. For this purpose, we introduce finite-state automata for modeling tests, transparency modes and test hardware behavior. In many cases, the tests repeat a basic set of test actions for different test data, which can again be modeled using finite-state automata. While earlier work can derive a single symbolic test for a module in a register-transfer level (RTL) circuit as a finite-state automaton, this work extends the methodology to the system level and additionally contributes a satisfiability-based solution to the problem of applying a sequence of tests phased in time. This problem is known to be a bottleneck in testability analysis not only at the system level, but also at the RTL. Experimental results show that the system-level average area overhead for making SOCs testable with our method is only 4.4%, while achieving an average test application time reduction of 78.5% over recent approaches. At the same time, it provides 100% test coverage of the precomputed test sets/sequences of the embedded cores.
DOI: 10.1109/ICCAD.1999.810680 (https://doi.org/10.1109/ICCAD.1999.810680) | Pages: 385-390 | Published: 1999-11-07
Citations: 8
Model reduction for DC solution of large nonlinear circuits
E. Gad, M. Nakhla
A new algorithm based on model reduction using the Krylov subspace technique is proposed to compute the DC solution of large nonlinear circuits. The proposed method combines continuation methods with model reduction techniques. Thus it enables the application of the continuation methods to an equivalent reduced-order set of nonlinear equations instead of the original system. This results in a significant reduction in the computational expense as the size of the reduced equations is much less than that of the original system. The reduced order system is obtained by projecting the set of nonlinear equations, whose solution represents the DC operating point, into a subspace of a much lower dimension. It is also shown that both the reduced-order system and the original system share the first q derivatives w.r.t. the circuit variable used to parameterize the family of the solution trajectories generated by the continuation method.
DOI: 10.1109/ICCAD.1999.810678 (https://doi.org/10.1109/ICCAD.1999.810678) | Pages: 376-379 | Published: 1999-11-07
Citations: 8
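To make the idea of a continuation method in the Gad and Nakhla abstract above more concrete, here is a minimal sketch of source stepping on a single resistor-diode circuit. It shows only the continuation part: the source is ramped by a parameter `lam` from 0 to 1 and Newton's method tracks the solution along the way. The Krylov-subspace projection that the paper combines with continuation is not reproduced, and the circuit, component values, and function name are assumptions chosen for illustration.

```python
import math


def dc_by_source_stepping(v_src=5.0, r=1e3, i_sat=1e-14, v_t=0.025,
                          steps=50, max_newton=100, tol=1e-12):
    """Solve (v - lam*v_src)/r + i_sat*(exp(v/v_t) - 1) = 0 at lam = 1 by ramping lam."""
    v = 0.0  # exact solution at lam = 0 (source switched off)
    for k in range(1, steps + 1):
        lam = k / steps
        # Newton iteration, warm-started from the previous point on the trajectory
        for _ in range(max_newton):
            f = (v - lam * v_src) / r + i_sat * (math.exp(v / v_t) - 1.0)
            df = 1.0 / r + (i_sat / v_t) * math.exp(v / v_t)
            dv = -f / df
            v += dv
            if abs(dv) < tol:
                break
    return v


def residual(v, v_src=5.0, r=1e3, i_sat=1e-14, v_t=0.025):
    """KCL residual at the diode node for the full source value."""
    return (v - v_src) / r + i_sat * (math.exp(v / v_t) - 1.0)


v_diode = dc_by_source_stepping()
```

Because each Newton solve starts from the solution at the previous `lam`, the iterate never moves far from the trajectory, which is what makes continuation robust on strongly nonlinear devices.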
Fast performance analysis of bus-based system-on-chip communication architectures
K. Lahiri, A. Raghunathan, S. Dey
This paper addresses the problem of efficient and accurate performance analysis to drive the exploration and design of bus-based system-on-chip (SOC) communication architectures. Our technique fills a gap in existing techniques for system-level performance analysis, which are either too slow to use in an iterative communication architecture design framework (e.g., simulation of the complete system), or are not accurate enough to drive the design of the communication architecture (e.g., techniques that perform a static analysis of the system performance). The proposed system-level performance analysis technique consists of: initial co-simulation performed after HW/SW partitioning and mapping, with the communication between components modeled in an abstract manner (e.g., as events or data transfers); extraction of abstracted symbolic traces, represented as a bus and synchronization event (BSE) graph, that captures the activity of the various system components and their communication over time; and manipulation of the BSE graph using the bus parameters, to derive the behavior of the system accounting for effects of the bus architecture. We present experimental results on several example systems, including a TCP/IP network interface card sub-system. The results indicate that our performance estimation technique is over two orders of magnitude faster than performing a complete system simulation, while being very accurate (within 2.2% of performance estimates derived from accurate HW/SW co-simulation).
DOI: 10.1109/ICCAD.1999.810712 (https://doi.org/10.1109/ICCAD.1999.810712) | Pages: 566-572 | Published: 1999-11-07
Citations: 59
Parameterized RTL power models for combinational soft macros
A. Bogliolo, Roberto Corgnati, E. Macii, M. Poncino
We propose a new RTL power macromodel that is suitable for re-configurable, synthesizable soft-macros. The model is parameterized with respect to the input data size (i.e., bit-width), and can be automatically scaled with respect to different technology libraries and/or synthesis options. Scalability is obtained through a single additional characterization run, and does not require the disclosure of any intellectual property. The model is derived from empirical analysis of the sensitivity of power on input statistics, input data size and technology. The experiments prove that, with limited approximation, it is possible to de-couple the effects on power of these three factors. The proposed solution is innovative, since no previous macromodel supports automatic technology scaling, and yields estimation errors within 15%.
DOI: 10.1109/ICCAD.1999.810663 (https://doi.org/10.1109/ICCAD.1999.810663) | Pages: 284-287 | Published: 1999-11-07
Citations: 15
Localized watermarking: methodology and application to operation scheduling
D. Kirovski, M. Potkonjak
Recently, a number of techniques for IP protection have been introduced that rely on a selection of a global solution to an optimization problem according to a unique user-specific digital signature. Although such techniques may provide convincing proof of authorship with low hardware overhead, they fail to protect parts of the design, do not provide an easy procedure for watermark detection, and are not capable of detecting the watermark when the design or its part is augmented in another larger design. Since these demands are of the highest interest for the IP business, we introduce localized watermarking as an IP protection technique that enables these features while satisfying the demand for low cost and transparency. We propose a set of protocols that implement the new watermarking methodology at the operation scheduling design level. We have demonstrated that erasing or finding another signature in the synthesized design can be made arbitrarily computationally difficult. The watermarking method has been tested on a set of real-life benchmarks where high likelihood of authorship has been achieved with negligible overhead in solution quality.
DOI: 10.1109/ICCAD.1999.810717 (https://doi.org/10.1109/ICCAD.1999.810717) | Pages: 596-599 | Published: 1999-11-07
Citations: 7
Bit-level arithmetic optimization for carry-save additions
Kei-Yong Khoo, Zhan Yu, A. Willson
Addresses the bit-level optimization of carry-save adder (CSA) arrays when the operands are of unequal wordlength (such as in some datapaths in digital signal processing circuits). We first show that by relaxing the carry-save representation to allow for more than two signals per bit position, we gain flexibility in the bit-level implementation of CSA arrays that can be exploited to achieve a more efficient design. We then propose algorithms to optimize a single adder array at the bit-level. In addition, we propose a heuristic to optimize a series of adder arrays that might occur in a datapath. We have applied our algorithms to the optimization of high-speed digital FIR filters and have achieved 15% to 30% savings (weighted cost) in the overall filter implementation array in comparison to the standard carry-save implementation.
DOI: 10.1109/ICCAD.1999.810611 (https://doi.org/10.1109/ICCAD.1999.810611) | Pages: 14-18 | Published: 1999-11-07
Citations: 8
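For readers unfamiliar with the standard carry-save representation that the Khoo, Yu, and Willson abstract above takes as its baseline, the following sketch models a word-level 3:2 compressor on non-negative Python integers. It illustrates the conventional two-signals-per-bit form only; the relaxed representation and the bit-level optimization algorithms from the paper are not shown, and the function names are hypothetical.

```python
def csa(a, b, c):
    """3:2 compressor: return (sum_word, carry_word) with a + b + c == sum_word + carry_word."""
    sum_word = a ^ b ^ c                               # per-bit sum without carry propagation
    carry_word = ((a & b) | (a & c) | (b & c)) << 1    # per-bit majority, shifted to its weight
    return sum_word, carry_word


def csa_array_sum(operands):
    """Reduce non-negative operands with 3:2 compressors, then do one carry-propagate add."""
    ops = list(operands)
    while len(ops) > 2:
        s, c = csa(ops[0], ops[1], ops[2])
        ops = ops[3:] + [s, c]
    return sum(ops)  # final carry-propagate addition of at most two words
```

Each `csa` call removes one operand from the list without propagating carries across bit positions, which is why carry-save arrays are attractive for multi-operand sums such as FIR filter taps.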
Is wire tapering worthwhile?
C. Alpert, A. Devgan, Stephen T. Quay
Wire sizing and buffer insertion/sizing are critical optimizations in deep submicron design. The past years have seen several studies of buffer insertion, wire sizing, and their simultaneous optimization. When wiring long interconnect, tapering, i.e., reducing the wire width as the distance from the driver increases, has proven effective. However, tapering is not widely utilized in industry since it is difficult to integrate into a complete routing methodology. The article examines the benefits of wire sizing with tapering when combined with buffer insertion. We perform several experiments with actual IBM technologies. Results indicate that wire tapering reduces delay typically by less than 5% compared to uniform wire sizing, when buffers can be inserted. Consequently, we suggest that it may not be worthwhile to maintain a routing methodology that supports wire tapering.
DOI: 10.1109/ICCAD.1999.810689 (https://doi.org/10.1109/ICCAD.1999.810689) | Pages: 430-435 | Published: 1999-11-07
Citations: 11
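The comparison between tapered and uniform wires in the Alpert, Devgan, and Quay abstract above can be explored with a small delay calculator. The sketch below assumes the Elmore delay model and a wire split into equal-length segments; the abstract does not say which delay model the authors used, and all parameter names and units here are hypothetical. It does not model buffer insertion and makes no attempt to reproduce the paper's results.

```python
def elmore_delay(widths, seg_len, r_drv, c_load, r_sheet, c_area, c_fringe):
    """Elmore delay from driver to sink for a segmented wire.

    widths are listed from the driver end to the sink end, one per segment.
    Segment resistance is r_sheet * seg_len / width; segment capacitance is
    (c_area * width + c_fringe) * seg_len.
    """
    res = [r_sheet * seg_len / w for w in widths]
    cap = [(c_area * w + c_fringe) * seg_len for w in widths]
    downstream = c_load   # capacitance seen beyond the current segment
    delay = 0.0
    # Walk from the sink back toward the driver, accumulating downstream capacitance.
    for r, c in zip(reversed(res), reversed(cap)):
        delay += r * (c / 2.0 + downstream)
        downstream += c
    delay += r_drv * downstream  # the driver charges the whole wire plus the load
    return delay
```

Calling `elmore_delay` once with a constant width list and once with a decreasing one, under the same remaining parameters, gives a quick feel for how much tapering changes the estimate in this simplified model.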
Interface and cache power exploration for core-based embedded system design
T. Givargis, Jörg Henkel, F. Vahid
Minimizing power consumption is of paramount importance during the design of embedded (mobile computing) systems that come as systems-on-a-chip, since interdependencies between design characteristics like power, performance and area for various system parts (cores) are becoming increasingly influential. In this scenario, interfaces play a key role, since they allow one to control/exploit these interdependencies with the aim of meeting design constraints like power. In this paper, we present a comprehensive approach to explore this impact. We consider a whole system comprising a CPU, caches, a main memory and interfaces between those cores, and we demonstrate the high impact that an adequate adaptation between core parameters and interface parameters has in terms of power consumption. We find in particular that cache parameters and the configurations of cache buses have a significant impact in this respect. In addition, we make the important observation that optimizing for performance no longer implies that power is optimized as well in deep submicron technologies. Instead, we find that, especially for newer technologies, the relative interface power contribution increases, leading to scenarios where we obtain a real power/performance tradeoff. In summary, our explorations have revealed as yet uninvestigated interdependencies that represent the first step towards future efforts to optimize/adapt interfaces and caches in core-based systems for low-power designs.
DOI: 10.5555/339492.340025 (https://doi.org/10.5555/339492.340025) · Pages: 270-273 · Published: 1999-11-07
Citations: 8