
1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051): latest papers

Interconnect parasitic extraction in the digital IC design methodology
M. Kamon, S. McCormick, K. Sheperd
Accurate interconnect analysis has become essential not only for post-layout verification but also for synthesis. This tutorial explores interconnect analysis and extraction methodology on three levels: coarse extraction to guide synthesis, detailed extraction for full-chip analysis, and full 3D analysis for critical nets. We will also describe the electrical issues caused by parasitics and how they have been, and will be, influenced by changing technology. The importance of model order reduction will be described, as well as methodologies at the synthesis stage for avoiding parasitic problems.
DOI: https://doi.org/10.1109/ICCAD.1999.810653. Pages 223-230. Published 1999-11-07.
Citations: 15
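As a rough illustration of the kind of quantity that coarse extraction feeds into timing, the sketch below computes the Elmore delay of an RC ladder, a first-moment estimate that can be read as the simplest reduced-order model of an interconnect. The function name `elmore_delay` and the ladder representation are assumptions made for this example; they are not taken from the tutorial.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay at the far end of an RC ladder.

    Assumption for this sketch: resistances[i] is the series resistance of
    segment i and capacitances[i] is the grounded capacitance at the far
    node of that segment.
    """
    delay = 0.0
    downstream_cap = sum(capacitances)
    for r, c in zip(resistances, capacitances):
        # Each resistor charges all capacitance located downstream of it.
        delay += r * downstream_cap
        downstream_cap -= c
    return delay


if __name__ == "__main__":
    # Two-segment ladder with unit R and C per segment.
    print(elmore_delay([1.0, 1.0], [1.0, 1.0]))
```

Splitting a wire into more segments moves the result toward the distributed-line value, which is one reason extraction granularity matters for delay estimates.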
Attractor-repeller approach for global placement
H. Etawil, S. Areibi, A. Vannelli
Traditionally, analytic placement has used linear or quadratic wirelength objective functions. Minimizing either formulation attracts cells sharing common signals (nets) together. The result is a placement with a great deal of overlap among the cells. To reduce cell overlap, the methodology iterates between global optimization and repartitioning of the placement area. In this work, we added new attractive and repulsive forces to the traditional formulation so that overlap among cells is diminished without repartitioning the placement area. The superiority of our approach stems from the fact that our new formulations are convex and no hard constraints are required. A preliminary version of the new placement method is tested using a set of MCNC benchmarks and, on average, the new method reduced wirelength and CPU time by 3.96% and 7.6%, respectively, compared to TimberWolf v7.0 in the hierarchical mode.
DOI: https://doi.org/10.1109/ICCAD.1999.810613. Pages 20-24. Published 1999-11-07.
Citations: 46
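The overlap problem described in the abstract can be seen in a tiny one-dimensional sketch of the traditional quadratic formulation. This is not the attractor-repeller method itself: it only minimizes the sum of squared two-pin net lengths by Gauss-Seidel sweeps, and the function name, net representation, and example netlist are all assumptions made for illustration.

```python
def quadratic_place_1d(nets, fixed, movable, sweeps=200):
    """Minimize the sum of (x_a - x_b)^2 over two-pin nets in one dimension.

    nets    : list of (a, b) name pairs
    fixed   : dict mapping pad name -> fixed coordinate
    movable : list of cell names (each assumed to appear in at least one net)
    """
    pos = dict(fixed)
    for cell in movable:
        pos[cell] = 0.0
    for _ in range(sweeps):
        for cell in movable:
            neighbors = [b if a == cell else a for a, b in nets if cell in (a, b)]
            # Setting the derivative of the quadratic objective to zero puts
            # each cell at the average position of the pins it connects to.
            pos[cell] = sum(pos[n] for n in neighbors) / len(neighbors)
    return {cell: pos[cell] for cell in movable}


if __name__ == "__main__":
    nets = [("p0", "a"), ("a", "b"), ("b", "p1"), ("p0", "b"), ("a", "p1")]
    placement = quadratic_place_1d(nets, {"p0": 0.0, "p1": 10.0}, ["a", "b"])
    print(placement)  # both cells end up at essentially the same coordinate
```

Because only attractive terms are present, the two cells collapse onto the same location, which is the overlap that repartitioning or additional repulsive terms are meant to resolve.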
Analytical macromodeling for high-level power estimation
G. Bernacchia, M. Papaefthymiou
This paper presents a new macromodeling technique for high-level power estimation. Our technique is based on a parameterizable analytical model that relies exclusively on statistical information of the circuit's primary inputs. During estimation, the statistics of the required metrics are extracted from the input stream, and a power estimate is obtained by evaluating a model function that has been characterized in advance. Our model yields power estimates within seconds, because it does not rely on the statistics of the circuit's primary outputs and, consequently, does not perform any simulation during estimation. Moreover, it achieves better accuracy than previous macromodeling approaches by taking into account both spatial and temporal correlations in the input stream. In experiments with the ISCAS-85 combinational circuits, the average absolute relative error of our power macromodeling technique was at most 1.8%. The worst-case error was at most 12.8%. For a ripple-carry adder family, in comparison with power estimates that were obtained using Spice, the average absolute and worst-case errors of our model's estimates were at most 5.1% and 19.8%, respectively. In addition to power dissipation, our macromodeling technique can be used to estimate the statistics of a circuit's primary outputs with very low average errors. It is thus suitable for power estimation in core-based systems with pre-characterized blocks. Once the metrics of the primary inputs are known, the power dissipation of the entire system can be estimated by simply propagating this information through the blocks using their corresponding model functions.
DOI: https://doi.org/10.1109/ICCAD.1999.810662. Pages 280-283. Published 1999-11-07.
Citations: 58
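The first step the abstract describes, extracting statistics from the input stream, might look roughly like the sketch below. It computes two commonly used metrics, average signal probability and average transition density; the abstract does not list the exact metrics used, so this choice, the function name, and the input format are assumptions.

```python
def input_statistics(vectors):
    """Return (average signal probability, average transition density).

    vectors: list of equal-length tuples of 0/1 values, one tuple per clock
    cycle, with at least two cycles (assumed format for this sketch).
    """
    n_cycles = len(vectors)
    n_bits = len(vectors[0])
    ones = sum(sum(v) for v in vectors)
    toggles = sum(
        a != b
        for prev, cur in zip(vectors, vectors[1:])
        for a, b in zip(prev, cur)
    )
    signal_probability = ones / (n_cycles * n_bits)
    transition_density = toggles / ((n_cycles - 1) * n_bits)
    return signal_probability, transition_density


if __name__ == "__main__":
    stream = [(0, 0), (1, 0), (1, 1), (0, 1)]
    print(input_statistics(stream))
```

A pre-characterized model function would then take such statistics as its arguments, so no circuit simulation is needed at estimation time.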
Formulation of static circuit optimization with reduced size, degeneracy and redundancy by timing graph manipulation
C. Visweswariah, A. Conn
Static circuit optimization implies sizing of transistors and wires on a static timing basis, taking into account all paths through a circuit. Previous methods of formulating static circuit optimization produced problem statements that are very large and contain inherent redundancy and degeneracy. In this paper, a method of manipulating the timing formulation is presented which produces a dramatically more compact optimization problem, and reduces redundancy and degeneracy. The circuit optimization is therefore more efficient and effective. Numerical results to demonstrate these improvements are presented.
DOI: https://doi.org/10.1109/ICCAD.1999.810656. Pages 244-251. Published 1999-11-07.
Citations: 38
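For orientation, a common way to write a static-timing-based sizing problem is shown below, together with one simple variable elimination. The notation ($AT_i$ for the arrival time at node $i$, $d_{ij}(x)$ for the delay of timing arc $(i,j)$ as a function of the sizes $x$, $z$ for the worst output arrival time) is assumed for this sketch, and the elimination shown is only one elementary example; the manipulations in the paper may differ.

```latex
\begin{aligned}
\min_{x,\,AT,\,z}\quad & z \\
\text{s.t.}\quad & AT_j \ge AT_i + d_{ij}(x) && \text{for every timing arc } (i,j), \\
& z \ge AT_k && \text{for every primary output } k, \\
& x_{\min} \le x \le x_{\max}.
\end{aligned}

% If node j has a single fan-in i and a single fan-out k, the pair
%   AT_j \ge AT_i + d_{ij}(x), \qquad AT_k \ge AT_j + d_{jk}(x)
% can be replaced by the single constraint
%   AT_k \ge AT_i + d_{ij}(x) + d_{jk}(x),
% which removes the variable AT_j and one constraint.
```

Every path through the circuit is covered implicitly by the arc constraints, which is why the number of arrival-time variables and constraints, rather than the number of paths, determines the problem size.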
The associative-skew clock routing problem
Yu Chen, A. Kahng, Gang Qu, A. Zelikovsky
We introduce the associative skew clock routing problem, which seeks a clock routing tree such that zero skew is preserved only within identified groups of sinks. The associative skew problem is easier to address within current EDA frameworks than useful-skew (skew-scheduling) approaches, and defines an interesting tradeoff between the traditional zero-skew clock routing problem (one sink group) and the Steiner minimum tree problem (n sink groups). We present a set of heuristic building blocks, including an efficient and optimal method of merging two zero-skew trees such that zero skew is preserved within the sink sets of each tree. Finally, we list a number of open issues for research and practical application.
DOI: https://doi.org/10.1109/ICCAD.1999.810643. Pages 168-172. Published 1999-11-07.
Citations: 8
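As background for the merging step mentioned in the abstract, here is a minimal sketch of the classic zero-skew merge of two subtrees, assuming a linear (path-length) delay model. The delay model, the function name, and the return convention are assumptions for illustration; this is the traditional single-group merge, not the paper's associative-skew method.

```python
def merge_zero_skew(t1, t2, dist):
    """Wire lengths from the merge point to two zero-skew subtrees.

    t1, t2 : root-to-sink delay of each subtree (linear delay assumed)
    dist   : distance between the two subtree roots
    Returns (len1, len2) such that t1 + len1 == t2 + len2.
    """
    x = (dist + t2 - t1) / 2.0
    if x < 0:
        # Subtree 1 is too slow: tap at its root and detour the other wire.
        return 0.0, t1 - t2
    if x > dist:
        # Subtree 2 is too slow: tap at its root and detour the other wire.
        return t2 - t1, 0.0
    return x, dist - x


if __name__ == "__main__":
    for args in [(0.0, 0.0, 10.0), (2.0, 6.0, 10.0), (20.0, 0.0, 10.0)]:
        len1, len2 = merge_zero_skew(*args)
        print(args, "->", (len1, len2))
```

In the detour cases the total wire used exceeds `dist`. The associative-skew setting relaxes the balance requirement between sinks of different groups, which is where wirelength can be saved relative to a single zero-skew tree.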
Techniques for improving the efficiency of sequential circuit test generation
X. Lin, I. Pomeranz, S. Reddy
New techniques are presented in this paper to improve the efficiency of a test generation procedure for synchronous sequential circuits. These techniques aid the test generation procedure by reducing the search space, carrying out non-chronological backtracking, and reusing the test generation effort. They have been integrated into an existing sequential test generation system MIX to constitute a new system, named MIX-PLUS. The experimental results for the ISCAS-89 and ADDENDUM-93 benchmark circuits demonstrate the effectiveness of these techniques in improving the fault coverage and test generation efficiency.
DOI: https://doi.org/10.1109/ICCAD.1999.810639. Pages 147-151. Published 1999-11-07.
Citations: 39
Least fixpoint approximations for reachability analysis
In-Ho Moon, J. Kukula, T. Shiple, F. Somenzi
Knowledge of the reachable states of a sequential circuit can dramatically speed up optimization and model checking. However, since exact reachability analysis may be intractable, approximate techniques are often preferable. H. Cho et al. (1996) presented the machine-by-machine (MBM) and frame-by-frame (FBF) methods to perform approximate finite state machine (FSM) traversal. FBF produces tighter upper bounds than MBM; however, it usually takes much more time and it may have convergence problems. In this paper, we show that there exists a class of methods, least fixpoint approximations, that compute the same results as RFBF ("reached FBF", one of the FBF methods). We show that one member of this class, which we call "least fixpoint MBM" (LMBM), is as efficient as MBM, but provably more accurate. Therefore, the trade-off that existed between MBM and RFBF has been eliminated. LMBM can compute RFBF-quality approximations for all the large ISCAS-89 benchmark circuits in a total of less than 9000 seconds.
DOI: https://doi.org/10.1109/ICCAD.1999.810618. Pages 41-44. Published 1999-11-07.
Citations: 26
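To make the term concrete, the sketch below computes the exact set of reachable states as a least fixpoint over explicit state sets. It is not MBM, FBF, or LMBM; the function name and the `successors` callback are assumptions for this example, and practical tools typically represent state sets symbolically rather than as explicit Python sets.

```python
def reachable_states(initial, successors):
    """Least fixpoint of R = initial ∪ image(R), computed by frontier expansion.

    initial    : iterable of hashable states
    successors : function mapping a state to an iterable of next states
    """
    reached = set(initial)
    frontier = set(initial)
    while frontier:
        image = set()
        for state in frontier:
            image.update(successors(state))
        # Only states not seen before can contribute new successors.
        frontier = image - reached
        reached |= frontier
    return reached


if __name__ == "__main__":
    # A 3-bit register used as a modulo-6 counter: states 6 and 7 are unreachable.
    print(sorted(reachable_states({0}, lambda s: {(s + 1) % 6})))
```

An approximate method returns a superset of this set. It is still safe for optimization as long as every truly reachable state is included, and tighter supersets expose more don't-care states.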
Towards true crosstalk noise analysis
Pinhong Chen, K. Keutzer
Accurate noise analysis is currently of significant concern to high-performance designs, and the number of signals susceptible to noise effects will certainly increase in smaller process geometries. Our approach uses a combination of temporal and functional information to eliminate false transition combinations and thereby overcome insufficiencies in static noise analysis. A similar idea arises in timing analysis where functional and timing information is used to eliminate false paths. The goal of our work is to develop an algorithm, software tool, and noise analysis flow that provide an accurate and conservative approach to noise analysis. In particular, this paper proposes an approach to identifying a pair of vectors that exercises the maximum crosstalk noise.
DOI: https://doi.org/10.1109/ICCAD.1999.810637. Pages 132-137. Published 1999-11-07.
Citations: 113
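The temporal half of the idea can be sketched with switching windows: aggressors whose windows never overlap cannot all inject noise at once, so summing every aggressor is pessimistic. The sketch below assumes each aggressor contributes a fixed, additive noise amount anywhere inside a closed window; the data layout and function name are hypothetical, and the functional (logic) filtering described in the abstract is not modeled.

```python
def worst_case_noise(aggressors):
    """Maximum total noise from aggressors that can switch at the same time.

    aggressors: list of (start, end, noise) tuples with closed windows
    [start, end]; noise contributions are assumed to add linearly.
    """
    events = []
    for start, end, noise in aggressors:
        events.append((start, 0, noise))   # window opens
        events.append((end, 1, -noise))    # window closes
    # Sorting puts opens before closes at equal times, so windows that
    # merely touch are treated as overlapping (the conservative choice).
    events.sort()
    best = current = 0.0
    for _, _, delta in events:
        current += delta
        best = max(best, current)
    return best


if __name__ == "__main__":
    aggressors = [(0, 2, 0.25), (1, 3, 0.5), (5, 6, 0.125)]
    print(worst_case_noise(aggressors))        # window-aware bound
    print(sum(n for _, _, n in aggressors))    # bound ignoring timing
```

In this example the window-aware bound is lower than the plain sum because the third aggressor can never switch together with the first two.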
Improved interconnect sharing by identity operation insertion
D. Herrmann, R. Ernst
The paper presents an approach to reduce interconnect cost by inserting identity operations in a control and data flow graph (CDFG). Unlike previous approaches, it is based on systematic pattern analysis and automated transformation selection. The cost function controlling transformation selection is derived with statistical experiments and is optimized using practical benchmarks. The results show significantly reduced interconnect cost for most register architectures and application examples.
DOI: https://doi.org/10.1109/ICCAD.1999.810699. Pages 489-492. Published 1999-11-07.
Citations: 17
Memory binding for performance optimization of control-flow intensive behaviors
K. Khouri, G. Lakshminarayana, N. Jha
The paper presents a memory binding algorithm for behaviors that are characterized by the presence of conditionals and deeply-nested loops that access memory extensively through arrays. Unlike previous works, this algorithm examines the effects of branch probabilities and allocation constraints. First, we demonstrate through examples the importance of incorporating branch probabilities and allocation constraint information when searching for a performance-efficient memory binding. We also show the interdependence of these two factors and how varying one without considering the other may greatly affect the performance of the behavior. Second, we introduce a memory binding algorithm that has the ability to examine numerous bindings by employing an efficient performance estimation procedure. The estimation procedure exploits locality of execution, which is an inherent characteristic of target behaviors. This enables the performance estimation technique to look at the global impact of the different bindings, given the allocation constraints. We tested our algorithm using a number of benchmarks from the parallel computing domain. A series of experiments demonstrates the algorithm's ability to produce bindings that optimize performance, meet memory allocation constraints, and adapt to different resource constraints and branch probabilities. Results show that the algorithm requires 37% fewer memories with a performance loss of only 0.3% when compared to a parallel memory architecture. When compared to the best of a series of random memory bindings, the algorithm improves schedule performance by 21%.
DOI: https://doi.org/10.1109/ICCAD.1999.810698. Pages 482-488. Published 1999-11-07.
Citations: 13
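Why branch probabilities matter for a binding can be shown with a toy cost estimate. The sketch below assumes single-port memories (arrays bound to the same memory are accessed one per cycle), treats each block as a set of array accesses that would otherwise happen in parallel, and weights each block by its execution probability. The function name, data layout, and cost model are assumptions for illustration, not the paper's estimation procedure.

```python
from collections import Counter


def expected_access_cycles(blocks, binding):
    """Expected memory-access cycles for a given array-to-memory binding.

    blocks  : list of (weight, arrays) where weight is the block's execution
              probability (or expected count) and arrays lists the arrays it
              accesses in parallel
    binding : dict mapping array name -> memory index
    """
    total = 0.0
    for weight, arrays in blocks:
        per_memory = Counter(binding[a] for a in arrays)
        # Accesses that land in the same single-port memory are serialized.
        total += weight * max(per_memory.values(), default=0)
    return total


if __name__ == "__main__":
    blocks = [(0.75, ["A", "B"]), (0.25, ["A", "C"])]
    print(expected_access_cycles(blocks, {"A": 0, "B": 0, "C": 1}))
    print(expected_access_cycles(blocks, {"A": 0, "B": 1, "C": 0}))
```

With two memories available, separating the arrays used on the more likely branch gives the lower expected cost; if the probabilities were swapped, the other binding would win.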