
Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232): Latest Papers

Analysis of on-chip inductance effects using a novel performance optimization methodology for distributed RLC interconnects
K. Banerjee, A. Mehrotra
This work presents a new, computationally efficient performance optimization technique for distributed RLC interconnects based on a rigorous delay computation scheme. The technique is used to analyze the impact of line inductance on circuit behaviour and to illustrate the implications of technology scaling for wire inductance. It is shown that the reduction in driver capacitance and output resistance with scaling makes deep submicron (DSM) designs increasingly susceptible to inductance effects. The impact of inductance variations on performance is also quantified, and the impact of wire inductance on catastrophic logic failures and IC reliability issues is analyzed.
DOI: 10.1145/378239.379069 (https://doi.org/10.1145/378239.379069). Published: 2001-06-22.
Citations: 29
Fast power/ground network optimization based on equivalent circuit modeling
S. Tan, C. Shi
This paper presents an efficient algorithm for optimizing the area of power or ground networks in integrated circuits subject to reliability constraints. Instead of solving the original power/ground networks extracted from circuit layouts, as previous methods did, the new method first builds equivalent models for many series resistors in the original networks, then uses a sequence-of-linear-programming method to solve the simplified networks. The solutions of the original networks are then back-solved from the optimized, simplified networks. The new algorithm exploits the regularities in power/ground networks. Experimental results show that the simplified networks are typically significantly less complex than the original circuits, which makes the new algorithm extremely fast. For instance, power/ground networks with more than one million branches can be sized in a few minutes on modern SUN workstations.
DOI: 10.1145/378239.379021 (https://doi.org/10.1145/378239.379021). Published: 2001-06-22.
Citations: 59
SATIRE: A new incremental satisfiability engine
J. Whittemore, Joonyoung Kim, K. Sakallah
We introduce SATIRE, a new satisfiability solver that is particularly suited to verification and optimization problems in electronic design automation. SATIRE builds on the most recent advances in satisfiability research, and includes two new features to achieve even higher performance: a facility for incrementally solving sets of related problems, and the ability to handle non-CNF constraints. We provide experimental evidence showing the effectiveness of these additions to classical satisfiability solvers.
DOI: 10.1145/378239.379019 (https://doi.org/10.1145/378239.379019). Published: 2001-06-22.
Citations: 198
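The incremental usage pattern mentioned in the SATIRE abstract (solving sets of related problems against one evolving clause set) can be illustrated with a very small sketch. This is not SATIRE's implementation or interface: the class name `TinySolver`, its methods, and the assumption-based `solve` call are hypothetical, and the solver is a plain DPLL search that keeps only the clause database between calls (it learns nothing and handles CNF clauses only). Literals follow the common convention of signed integers, where `3` means x3 and `-3` means NOT x3.

```python
class TinySolver:
    """Hypothetical toy solver: a persistent clause set queried repeatedly."""

    def __init__(self):
        self.clauses = []  # survives across solve() calls

    def add_clause(self, literals):
        # A clause is a disjunction of signed integer literals.
        self.clauses.append(frozenset(literals))

    def solve(self, assumptions=()):
        """Return a satisfying assignment {var: bool}, or None if unsatisfiable.

        Assumptions hold for this call only; they are not added as clauses.
        """
        assignment = {}
        for lit in assumptions:
            var, val = abs(lit), lit > 0
            if assignment.get(var, val) != val:
                return None  # contradictory assumptions
            assignment[var] = val
        return self._dpll(assignment)

    def _dpll(self, assignment):
        assignment = dict(assignment)
        changed = True
        while changed:  # unit propagation until nothing new is forced
            changed = False
            for clause in self.clauses:
                unassigned, satisfied = [], False
                for lit in clause:
                    var = abs(lit)
                    if var not in assignment:
                        unassigned.append(lit)
                    elif assignment[var] == (lit > 0):
                        satisfied = True
                        break
                if satisfied:
                    continue
                if not unassigned:
                    return None  # every literal is false: conflict
                if len(unassigned) == 1:
                    lit = unassigned[0]
                    assignment[abs(lit)] = lit > 0
                    changed = True
        # Branch on the first variable that is still unassigned.
        for clause in self.clauses:
            for lit in clause:
                var = abs(lit)
                if var not in assignment:
                    for val in (True, False):
                        result = self._dpll({**assignment, var: val})
                        if result is not None:
                            return result
                    return None
        return assignment  # all clause variables assigned, no conflict


solver = TinySolver()
solver.add_clause([1, 2])    # x1 OR x2
solver.add_clause([-1, 3])   # NOT x1 OR x3
first = solver.solve(assumptions=[-2])   # related query 1: assume x2 is false
solver.add_clause([-3])                  # tighten the problem between queries
second = solver.solve(assumptions=[-2])  # related query 2: now unsatisfiable
```

A production incremental solver would typically also carry learned information from one call to the next; the sketch only shows the calling pattern.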
Automatic generation of application-specific architectures for heterogeneous multiprocessor system-on-chip
D. Lyonnard, S. Yoo, A. Baghdadi, A. Jerraya
We present a design flow for the generation of application-specific multiprocessor architectures. In the flow, architectural parameters are first extracted from a high-level system specification. Parameters are used to instantiate architectural components, such as processors, memory modules and communication networks. The flow includes the automatic generation of a communication coprocessor that adapts the processor to the communication network in an application-specific way. Experiments with two system examples show the effectiveness of the presented design flow.
DOI: 10.1145/378239.379015 (https://doi.org/10.1145/378239.379015). Published: 2001-06-22.
Citations: 167
From architecture to layout: partitioned memory synthesis for embedded systems-on-chip
L. Benini, L. Macchiarulo, A. Macii, E. Macii, M. Poncino
We propose an integrated front-end/back-end flow for the automatic generation of a multi-bank memory architecture for embedded systems. The flow is based on an algorithm for the automatic partitioning of on-chip SRAM. Starting from the dynamic execution profile of an embedded application running on a given processor core, we synthesize a multi-banked SRAM architecture optimally fitted to the execution profile. The partitioning algorithm is integrated with the physical design phase into a complete flow that allows the back-annotation of layout information to drive the partitioning process. Results, collected on a set of embedded applications for the ARM processor, have shown average energy savings around 34%.
DOI: 10.1145/378239.379066 (https://doi.org/10.1145/378239.379066). Published: 2001-06-22.
Citations: 12
Energy efficient fixed-priority scheduling for real-time systems on variable voltage processors
Gang Quan, X. Hu
Energy consumption has become an increasingly important consideration in designing many real-time embedded systems. Variable voltage processors, if used properly, can dramatically reduce the energy consumption of such systems. In this paper, we present a technique to determine voltage settings for a variable voltage processor that uses a fixed-priority assignment to schedule jobs. Our approach also produces the minimum constant voltage needed to feasibly schedule the entire job set. Our algorithms lead to significant energy savings compared with previously presented approaches.
DOI: 10.1145/378239.379074 (https://doi.org/10.1145/378239.379074). Published: 2001-06-22.
Citations: 235
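The "minimum constant voltage" question in the Quan and Hu abstract can be illustrated in a simplified setting. The sketch below is not the paper's algorithm: it swaps in classic time-demand analysis for periodic tasks, and it assumes synchronous release, deadlines equal to periods, execution time inversely proportional to processor speed, and integer task parameters. It returns the smallest constant speed (as a fraction of full speed) at which every task meets its deadline under fixed priorities; mapping that speed to a supply voltage depends on the processor and is left out. The function name `min_constant_speed` and the task tuple layout are hypothetical.

```python
from fractions import Fraction


def min_constant_speed(tasks):
    """Smallest constant speed that keeps a fixed-priority task set feasible.

    tasks: list of (wcet_at_full_speed, period) integer pairs, highest
    priority first. Deadlines are assumed equal to periods.
    """
    required = Fraction(0)
    for i, (c_i, t_i) in enumerate(tasks):
        higher = tasks[:i]
        # Candidate completion times: releases of higher-priority tasks
        # up to the deadline, plus the deadline itself.
        points = {t_i}
        for _, t_j in higher:
            points.update(k * t_j for k in range(1, t_i // t_j + 1))

        def demand(t):
            # Work (in full-speed time units) requested in [0, t] by task i
            # and all higher-priority tasks; -(-t // t_j) is integer ceil.
            return c_i + sum(-(-t // t_j) * c_j for c_j, t_j in higher)

        # Task i meets its deadline at speed s iff demand(t) / s <= t for
        # some candidate t, so its own minimum speed is min demand(t) / t.
        speed_i = min(Fraction(demand(t), t) for t in points)
        required = max(required, speed_i)
    return required


# Two tasks with periods 3 and 5, one time unit of work each at full speed.
speed = min_constant_speed([(1, 3), (1, 5)])
```

Exact rational arithmetic via `Fraction` avoids rounding a borderline-feasible speed downward. For this example the result exceeds the total utilization (8/15), which shows why fixed-priority scheduling cannot always slow the processor down to the utilization bound.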
Reducing the frequency gap between ASIC and custom designs: a custom perspective
S. E. Rich, Matthew J. Parker, Jim Schwartz
This paper proposes that the ability to control the difference between the simulated and actual frequencies of a design is a key strategy to achieving high frequency in both ASIC and custom designs. We examine this principle and the methodologies that can be deployed to manage this gap.
DOI: 10.1145/378239.378548 (https://doi.org/10.1145/378239.378548). Published: 2001-06-22.
Citations: 14
Speeding up control-dominated applications through microarchitectural customizations in embedded processors
Peter Petrov, A. Orailoglu
We present a methodology for microarchitectural customization of embedded processors by exploiting application information, thus attaining the twin benefits of processor standardization and application-specific customization. Such techniques enable larger application fragments to be placed on the processor, with no sacrifice in system requirements, thus reducing the custom hardware and the concomitant area requirements in SOCs. We illustrate these ideas through the branch resolution problem, known to impose severe performance degradation on control-dominated embedded applications. Low-cost, late-customizable hardware that uses application information to fold out a set of frequently executed branches is described. Experimental results show that for a representative set of control-dominated applications, a reduction of 7%-22% in processor cycles can be achieved, thus extending the scope of low-cost embedded processors in complex co-designs for control-intensive systems.
DOI: 10.1145/378239.379014 (https://doi.org/10.1145/378239.379014). Published: 2001-06-22.
Citations: 1
Reducing memory requirements of nested loops for embedded systems
J. Ramanujam, Jinpyo Hong, M. Kandemir, A. Narayan
Most embedded systems have a limited amount of memory. In contrast, the memory requirements of code (in particular loops) running on embedded systems are significant. This paper addresses the problem of estimating the amount of memory needed for transfers of data in embedded systems. The problem of estimating the region associated with a statement, or the set of elements referenced by a statement during the execution of the entire set of nested loops, is analyzed. A quantitative analysis of the number of elements referenced is presented; exact expressions for uniformly generated references and close upper and lower bounds for non-uniformly generated references are derived. In addition to presenting an algorithm that computes the total memory required, we discuss the effect of transformations on the lifetimes of array variables, i.e., the time between the first and last accesses to a given array location. A detailed analysis of the effect of unimodular transformations on data locality, including the calculation of the maximum window size, is presented. The term maximum window size is introduced and quantitative expressions are derived to compute the window size. The smaller the value of the maximum window size, the higher the amount of data locality in the loop.
DOI: 10.1145/378239.378523 (https://doi.org/10.1145/378239.378523). Published: 2001-06-22.
Citations: 44
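The quantities named in the nested-loops abstract (number of distinct elements referenced, lifetime between first and last access, maximum window size) can be made concrete with a brute-force sketch. It enumerates the iteration space instead of using the closed-form expressions the paper derives, so it is only practical for small loop bounds. The definition of the window used here is an assumption, not taken from the abstract: at each iteration, the window is the set of elements that have already been accessed and will be accessed again later. The function name `access_stats` and its parameters are hypothetical.

```python
from itertools import product


def access_stats(bounds, index_fn):
    """Brute-force reference statistics for one array reference in a loop nest.

    bounds: trip counts of the loops, outermost first.
    index_fn: maps the loop indices to the array element that is accessed.
    Returns (number of distinct elements, maximum window size).
    """
    first, last = {}, {}
    steps = 0
    for t, iteration in enumerate(product(*(range(n) for n in bounds))):
        element = index_fn(*iteration)
        first.setdefault(element, t)  # time of first access
        last[element] = t             # time of last access so far
        steps = t + 1

    # An element is live at time t when first <= t < last. Sweep over time
    # with a difference array to count live elements at each step.
    delta = [0] * (steps + 1)
    for element in first:
        delta[first[element]] += 1
        delta[last[element]] -= 1
    live = max_window = 0
    for t in range(steps):
        live += delta[t]
        max_window = max(max_window, live)
    return len(first), max_window


# for i in range(3): for j in range(3): use A[j]
original = access_stats((3, 3), lambda i, j: j)
# Loop interchange (a unimodular transformation): j becomes the outer loop.
interchanged = access_stats((3, 3), lambda j, i: j)
```

In this small case both loop orders touch the same three elements, but interchange shortens every lifetime so that at most one element is live at a time, which matches the abstract's remark that a smaller maximum window size indicates more data locality.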
High-quality operation binding for clustered VLIW datapaths
V. Lapinskii, M. Jacome, G. Veciana
Clustering is an effective method to increase the available parallelism in VLIW datapaths without incurring the severe penalties associated with large numbers of register file ports. Efficient utilization of a clustered datapath requires careful binding of operations to clusters. The paper proposes a binding algorithm that effectively explores tradeoffs between in-cluster operation serialization and delays associated with data transfers between clusters. Extensive experimental evidence is provided showing that the algorithm generates high-quality solutions for basic blocks, with up to 29% improvement over a state-of-the-art advanced binding algorithm.
DOI: 10.1145/378239.379051 (https://doi.org/10.1145/378239.379051). Published: 2001-06-22.
Citations: 12