
1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD): Latest Papers

Large scale circuit partitioning with loose/stable net removal and signal flow based clustering
Pub Date : 1997-11-13 DOI: 10.1109/ICCAD.1997.643573
J. Cong, H. Li, S. Lim, Toshiyuki Shibuya, D. Xu
In this paper, we present an efficient Iterative Improvement based Partitioning (IIP) algorithm, called LSR/MFFS, that combines the signal-flow-based Maximum Fanout Free Subgraph (MFFS) clustering algorithm with the Loose and Stable net Removal (LSR) partitioning algorithm. The MFFS algorithm generalizes the existing MFFC decomposition method from combinational circuits to general sequential circuits in order to handle cycles naturally. We also carefully study the properties of the nets that straddle the cutline, and introduce the concepts of loose and stable nets as well as effective ways to remove them from the cutset. The LSR/MFFS algorithm first applies the LSR algorithm to the clustered netlist generated by the MFFS algorithm for global-level cutsize optimization, and then declusters the netlist for further cutsize refinement. As a result, the LSR/MFFS algorithm achieves the best cutsize results among all the bipartitioning algorithms published in the literature, with very promising runtime performance. In particular, it outperforms the recent state-of-the-art IIP algorithms LA3-CDIP, CLIP-PROP_f, Strawman, hMetis-FM, and MLc by 17.4%, 12.1%, 5.9%, 3.1%, and 1.9%, respectively. It also outperforms the state-of-the-art non-IIP algorithms Paraboli, FEB, and PANZA by 32.0%, 21.4%, and 1.4%, respectively.
Citations: 80
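The cutsize objective that the abstract above optimizes can be illustrated with a minimal Python sketch. This is not the LSR/MFFS algorithm; it only counts the nets that straddle the cutline for a given bipartition. The netlist format and the names `cut_size`, `nets`, and `side` are assumptions made for illustration.

```python
from typing import Dict, List


def cut_size(nets: Dict[str, List[str]], side: Dict[str, int]) -> int:
    """Count nets whose cells lie on both sides of a bipartition.

    nets maps a net name to the cells it connects (hypothetical format);
    side maps each cell to block 0 or block 1.
    """
    cut = 0
    for cells in nets.values():
        blocks = {side[c] for c in cells}
        if len(blocks) > 1:  # the net straddles the cutline
            cut += 1
    return cut


# Toy netlist, invented for this example.
nets = {"n1": ["a", "b"], "n2": ["b", "c", "d"], "n3": ["c", "d"]}
side = {"a": 0, "b": 0, "c": 1, "d": 1}
print(cut_size(nets, side))
```

In this toy case only net `n2` crosses the cut. An iterative improvement partitioner would repeatedly move cells between blocks to reduce this count while keeping the blocks balanced.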
Approximate timing analysis of combinational circuits under the XBDO model
Pub Date : 1997-11-13 DOI: 10.1109/ICCAD.1997.643404
Y. Kukimoto, W. Gosti, A. Saldanha, R. Brayton
This paper is concerned with approximate delay computation algorithms for combinational circuits. As a result of intensive research in the early 1990s, efficient tools exist that can analyze circuits of thousands of gates in a few minutes, or even in seconds in many cases. However, the computation time of these tools is not very predictable, since the internal engine of the analysis is either a SAT solver or a modified ATPG algorithm, both of which are heuristic algorithms for an NP-complete problem. Although they are highly tuned for CAD applications, there exists a class of problem instances that exhibits worst-case exponential CPU time behavior. In the context of timing analysis, circuits with a high amount of reconvergence, e.g., C6288 of the ISCAS benchmark suite, are known to be difficult to analyze under sophisticated delay models even with state-of-the-art techniques. To make timing analysis of such corner-case circuits feasible, we propose an approximate computation scheme for the timing analysis problem as an extension of the exact analysis method proposed previously. Sensitization conditions are conservatively approximated in a selective fashion so that the size of the SAT problems solved during analysis is controlled. Experimental results show that the approximation technique is effective in reducing the total analysis time without losing accuracy in cases where the exact approach takes a long time or cannot complete.
Citations: 15
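For context on what "conservative" means in the abstract above: the most conservative delay estimate is the purely topological longest path, which ignores sensitization altogether and therefore can only overestimate the true delay. The sketch below computes it for a small gate-level DAG. It is not the XBDO-based method of the paper, and the circuit, the gate delays, and the name `topological_delay` are made up for illustration.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def topological_delay(fanin: Dict[str, List[str]],
                      delay: Dict[str, float]) -> Dict[str, float]:
    """Latest arrival time at every node, assuming every path is sensitizable."""
    arrival: Dict[str, float] = {}
    # TopologicalSorter expects node -> predecessors, which is exactly `fanin`.
    for node in TopologicalSorter(fanin).static_order():
        latest_input = max((arrival[p] for p in fanin.get(node, [])), default=0.0)
        arrival[node] = latest_input + delay.get(node, 0.0)
    return arrival


# Hypothetical circuit: primary inputs a, b, c have zero delay.
fanin = {"g1": ["a", "b"], "g2": ["b", "c"], "out": ["g1", "g2"]}
delay = {"g1": 2.0, "g2": 3.0, "out": 1.0}
arrival = topological_delay(fanin, delay)
print(arrival["out"])
```

An exact analysis would additionally check, for example with a SAT solver, whether the longest path can actually be sensitized. A selective approximation sits between these two extremes.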
A test synthesis technique using redundant register transfers
Pub Date : 1997-11-13 DOI: 10.1109/ICCAD.1997.643569
C. Papachristou, M. Baklashov
This paper presents a test synthesis technique for behavioral descriptions. The technique is guided by two testability metrics which quantify the controllability and observability of behavioral variables and structural signals. The method is based on utilizing redundant register transfers in the data path to produce a test behavior with better controllability and observability properties. This approach can avoid unnecessary insertions of test structures in the data path. A test scheme for conditional statements has been developed involving minimal changes in the controller. Our experimental results show improvements in fault coverage at modest hardware overhead.
Citations: 6
Library-less synthesis for static CMOS combinational logic circuits
Pub Date : 1997-11-13 DOI: 10.1109/ICCAD.1997.643608
S. Gavrilov, A. Glebov, S. Pullela, S. C. Moore, A. Dharchoudhury, R. Panda, G. Vijayan, D. Blaauw
Traditional synthesis techniques optimize CMOS circuits in two phases: i) logic minimization and ii) library mapping. Typically, the structures and the sizes of the gates in the library are chosen to yield good synthesis results over many blocks or even for an entire chip. Consequently, this approach precludes an optimal design of individual blocks that may need custom structures. The authors present a new transistor-level technique that optimizes CMOS circuits in both structure and size. The technique is independent of a library and hence can explore a design space much larger than is possible with gate-level optimization. Results demonstrate a significant improvement in the performance of the resynthesized circuits.
Citations: 57
Verifying hardware in its software context
Pub Date : 1997-11-13 DOI: 10.1109/ICCAD.1997.643621
R. Kurshan, V. Levin, M. Minea, D. Peled, Hüsnü Yenigün
We describe a method for verifying hardware whose correct behaviour depends upon its software interface. It is presumed that the hardware is presented as a synchronous RTL model whereas the software is presented as an asynchronous abstraction. Our methodology incorporates partial order reduction on the software side, and localization reduction, to deal with the computational complexity of the verification. The partial order reduction is implemented as a constraint on the transition relation of a synchronous transformation of the software model. The reduced transformed model then may be verified using a verification algorithm whose scope is purely synchronous models, without modification. Thus, independent of the interface verification problem, this gives a general method for combining partial order reduction with symbolic model checking.
Citations: 19
Maximum independent sets on transitive graphs and their applications in testing and CAD
Pub Date : 1997-11-13 DOI: 10.1109/ICCAD.1997.643620
D. Kagaris, S. Tragoudas
We present a polynomial time algorithm that finds the maximum weighted independent set of a transitive graph. The studied problem finds applications in a variety of VLSI contexts, including path delay fault testing, scheduling in high level synthesis and channel routing in physical design automation. The algorithm has been implemented and incorporated in a CAD tool for path delay fault testing. We experimentally verify its impact in the latter context.
Citations: 20
Efficient coupled noise estimation for on-chip interconnects
Pub Date : 1997-11-13 DOI: 10.1109/ICCAD.1997.643399
A. Devgan
Noise analysis and avoidance is an increasingly critical step in deep submicron design. Ever-increasing performance requirements have led to widespread use of dynamic logic circuit families and their derivatives. These aggressive circuit families trade off noise margin for timing performance, making them more susceptible to noise failure and increasing the need for noise analysis. Currently, noise analysis is performed either through circuit or timing simulation or through model order reduction. These techniques are still inefficient for analyzing the massive amounts of interconnect data found in present-day integrated circuits. This paper presents efficient techniques for estimating coupled noise in on-chip interconnects. The noise estimation metric is an upper bound for RC circuits and is similar in spirit to the Elmore delay in timing analysis. Such an efficient noise metric is especially useful for noise criticality pruning and for noise avoidance techniques based on physical design.
Citations: 171
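The abstract above compares its noise metric with the Elmore delay. As a reminder of what that reference point looks like, here is a minimal Python sketch of the Elmore delay on an RC tree: the sum, along the root-to-sink path, of each edge resistance times the capacitance downstream of that edge. It is not the paper's noise metric, and the tree encoding and the name `elmore_delay` are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple


def elmore_delay(parent: Dict[str, Optional[Tuple[str, float]]],
                 cap: Dict[str, float], sink: str) -> float:
    """Elmore delay from the root of an RC tree to `sink`.

    parent[n] is (parent_node, resistance_of_edge_to_parent), or None for
    the root. cap[n] is the grounded capacitance at node n (0 if absent).
    """
    def depth(n: str) -> int:
        d = 0
        while parent[n] is not None:
            n = parent[n][0]
            d += 1
        return d

    # Capacitance at or below each node, accumulated deepest nodes first.
    downstream = {n: cap.get(n, 0.0) for n in parent}
    for n in sorted(parent, key=depth, reverse=True):
        if parent[n] is not None:
            downstream[parent[n][0]] += downstream[n]

    # Sum R_edge * downstream capacitance along the sink-to-root path.
    total = 0.0
    n = sink
    while parent[n] is not None:
        p, r = parent[n]
        total += r * downstream[n]
        n = p
    return total


# Hypothetical tree: src -(1)- n1, n1 -(2)- n2, n1 -(3)- n3.
parent = {"src": None, "n1": ("src", 1.0), "n2": ("n1", 2.0), "n3": ("n1", 3.0)}
cap = {"n1": 1.0, "n2": 2.0, "n3": 3.0}
print(elmore_delay(parent, cap, "n2"), elmore_delay(parent, cap, "n3"))
```

Like the metric described in the abstract, the appeal of such a closed-form quantity is that it needs only one pass over the tree and no simulation.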
Performance analysis of a system of communicating processes
Pub Date : 1997-11-13 DOI: 10.1109/ICCAD.1997.643599
S. Dey, S. Bommu
Efficient exploration of the system design space necessitates fast and accurate performance estimation, as opposed to the computationally prohibitive alternative of exhaustive simulation. The paper addresses worst-case performance analysis of a system described as a set of concurrent communicating processes. We show that the synchronization overhead associated with inter-process communication can contribute significantly to the overall system performance. Applying existing performance analysis techniques, which target single-process descriptions, leads to inaccurate performance estimates because the synchronization overhead is not accounted for. We present PERC, a fast and accurate worst-case performance analysis technique that analyzes inter-process communication and accounts for synchronization overhead while computing the worst-case performance estimate of a given system implementation. Application of PERC to example systems described as multiple communicating processes shows that the proposed method can accurately estimate the worst-case performance of the system implementation.
Citations: 33
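The effect of synchronization overhead described above can be seen in a toy model. The Python sketch below is not PERC; it assumes two processes that perform a blocking rendezvous after each compute segment, with zero transfer time, and the name `finish_times` and the segment lengths are invented for illustration.

```python
from typing import List


def finish_times(segments_a: List[float], segments_b: List[float]) -> List[float]:
    """Finish times of two processes that rendezvous after each segment.

    segments_x[i] is the worst-case compute time of process x before its
    i-th synchronization point (hypothetical model).
    """
    t_a = t_b = 0.0
    for ca, cb in zip(segments_a, segments_b):
        # Each process computes, then waits until its partner also arrives.
        t_a = t_b = max(t_a + ca, t_b + cb)
    return [t_a, t_b]


a = [5.0, 1.0, 4.0]
b = [2.0, 6.0, 1.0]
print(sum(a), sum(b), finish_times(a, b))
```

Analyzed in isolation, the two processes need 10 and 9 time units, but with the waits at each rendezvous both finish at 15. This is the kind of gap a single-process analysis would miss.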
Multipoint Pade approximation using a rational block Lanczos algorithm
Pub Date : 1997-11-13 DOI: 10.1109/ICCAD.1997.643370
Tuyen V. Nguyen, Jing Li
This paper presents a general rational block Lanczos algorithm for computing multipoint matrix Pade approximations of linear multiport networks, which model many important circuits in digital, analog, or mixed-signal designs. This algorithm generalizes a novel block Lanczos algorithm with a reliable adaptive scheme for breakdown treatment to address two drawbacks of the single-frequency Pade approximation: poor approximation of the transfer function in the frequency domain far away from the expansion point, and instability of the reduced model when the original system is stable. In addition, because the Krylov subspace corresponding to each frequency point is smaller, the rational algorithm also alleviates possible breakdowns when computing high-order approximations. The cost of full backward orthogonalization with respect to all previous Lanczos vectors in a rational Lanczos algorithm, as compared to partial backward orthogonalization in a single-point Lanczos algorithm, is offset by more accurate and smaller-order approximations.
Citations: 21
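The abstract above gives no formulas, so the following is only a sketch in the notation commonly used for Lanczos-based model reduction; the symbols $G$, $C$, $B$, $L$, $A_j$, $R_j$, and $M_k$ are assumptions rather than the paper's own. For a linear multiport network with transfer function $H(s)$, expanding about a frequency point $s_j$ gives moments that span a block Krylov subspace. A multipoint approximation matches moments at several such points $s_1, \dots, s_m$ instead of many moments at one point.

```latex
\begin{align*}
H(s) &= L^{T}\,(G + sC)^{-1} B, \\
G + sC &= (G + s_j C)\,\bigl(I - (s - s_j)\,A_j\bigr),
  \qquad A_j = -(G + s_j C)^{-1} C, \\
H(s) &= L^{T}\,\bigl(I - (s - s_j)\,A_j\bigr)^{-1} R_j
      = \sum_{k \ge 0} M_k^{(j)}\,(s - s_j)^{k},
  \qquad R_j = (G + s_j C)^{-1} B, \\
M_k^{(j)} &= L^{T} A_j^{\,k} R_j .
\end{align*}
```

Each expansion point contributes its own, smaller Krylov subspace generated by $A_j$ and $R_j$, which is consistent with the abstract's remark about smaller subspaces per frequency point.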
Exploiting off-chip memory access modes in high-level synthesis
Pub Date : 1997-11-13 DOI: 10.5555/266388.266503
P. Panda, N. Dutt, A. Nicolau
Memory-intensive behaviors often contain large arrays that are synthesized into off-chip memories. With the increasing gap between on-chip and off-chip memory access delays, it is imperative to exploit the efficient access mode features of modern-day memories (e.g., page-mode DRAMs) in order to alleviate the memory bandwidth bottleneck. Our work addresses this issue by: (a) modeling realistic off-chip memory access modes for High-level Synthesis (HLS), (b) presenting algorithms to infer applicability of HLS with these memory access modes, and (c) transforming input behavior to provide further memory access optimizations during HLS. We demonstrate the utility of our approach using a suite of memory-intensive benchmarks with a realistic DRAM library module. Experimental results show a significant performance improvement (more than 40%) as a result of our optimization techniques.
Citations: 40
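To see why page-mode access matters for the abstract above, consider a deliberately simple cost model in Python. It is not the paper's memory model: the function name `access_cycles` and the cycle counts for a row miss and a page hit are hypothetical values chosen for illustration, not taken from the paper or from any datasheet.

```python
from typing import Iterable


def access_cycles(addresses: Iterable[int], row_size: int,
                  row_miss_cycles: int = 6, page_hit_cycles: int = 2) -> int:
    """Total cycles for an address trace under a simple page-mode model.

    An access to the currently open row costs page_hit_cycles; any other
    access opens a new row and costs row_miss_cycles. Both costs are
    made-up illustrative values.
    """
    open_row = None
    total = 0
    for addr in addresses:
        row = addr // row_size
        if row == open_row:
            total += page_hit_cycles
        else:
            total += row_miss_cycles
            open_row = row
    return total


row_major = list(range(8))           # walks along one row at a time
strided = [0, 4, 1, 5, 2, 6, 3, 7]   # alternates between two rows
print(access_cycles(row_major, row_size=4), access_cycles(strided, row_size=4))
```

Under these assumed costs the same eight locations take 24 cycles in row-major order and 48 cycles in the strided order. Reordering accesses to stay within an open row is the kind of transformation item (c) of the abstract refers to.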