2008 Asia and South Pacific Design Automation Conference: latest papers

Architecture-level thermal behavioral characterization for multi-core microprocessors
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483994
Duo Li, S. Tan, M. Tirumala
In this paper, we investigate a new architecture-level thermal characterization problem from a behavioral modeling perspective to address emerging thermal analysis and optimization problems in high-performance multi-core microprocessor design. We propose a new approach, called ThermPOF, that builds thermal behavioral models from measured architecture-level thermal and power information. ThermPOF first builds the behavioral thermal model using the generalized pencil-of-function (GPOF) method. Then, to model transient temperature changes effectively, we propose two new schemes that improve GPOF. First, we apply logarithmic-scale sampling instead of traditional linear sampling to better capture how the temperature changes. Second, we modify the extracted thermal impulse response so that the poles extracted by GPOF are guaranteed to be stable without loss of accuracy. To further reduce the model size, Krylov-subspace-based model order reduction is applied to the models in state-space form. Experimental results on a practical quad-core microprocessor show that the generated thermal behavioral models match the measured data very well.
Citations: 13
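The abstract above contrasts logarithmic-scale sampling with linear sampling of a transient response. The following is a minimal sketch of that one idea only, not of ThermPOF or GPOF. The function names, the time range, and the sample count are illustrative assumptions, not values from the paper.

```python
def linear_samples(t_start, t_end, n):
    """Return n sample times evenly spaced between t_start and t_end."""
    step = (t_end - t_start) / (n - 1)
    return [t_start + i * step for i in range(n)]


def log_samples(t_start, t_end, n):
    """Return n sample times evenly spaced on a logarithmic axis.

    Requires t_start > 0, because consecutive samples differ by a
    constant ratio rather than a constant step.
    """
    ratio = (t_end / t_start) ** (1.0 / (n - 1))
    return [t_start * ratio ** i for i in range(n)]


# Hypothetical time range (seconds) and sample count.
lin = linear_samples(1e-3, 100.0, 11)
log = log_samples(1e-3, 100.0, 11)

# Count how many samples land in the early part of the transient (t <= 2 s),
# where a thermal step response typically changes fastest.
early_lin = sum(1 for t in lin if t <= 2.0)
early_log = sum(1 for t in log if t <= 2.0)
print(early_lin, early_log)
```

With the same number of samples, the logarithmic grid spends most of them on the early part of the time axis, while the linear grid places almost all of them in the later region where the response tends to be nearly flat.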
Application-specific Network-on-Chip architecture synthesis based on set partitions and Steiner Trees
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483955
Shan Yan, Bill Lin
This paper considers the problem of synthesizing application-specific network-on-chip (NoC) architectures. We propose two heuristic algorithms, called CLUSTER and DECOMPOSE, that systematically examine different set partitions of communication flows, and we propose Rectilinear-Steiner-tree (RST) based algorithms for generating an efficient network topology for each group in the partition. Different evaluation functions, matched to the implementation backend and the corresponding implementation technology, can be incorporated into our solution framework to evaluate the implementation cost of the generated set partitions and RST topologies. In particular, we experimented with an implementation cost model based on the power consumption parameters of a 70 nm process technology in which leakage power is a major source of energy consumption. Experimental results on a variety of NoC benchmarks show that our synthesis results achieve, on average, a 6.92x reduction in power consumption over the best standard mesh implementation. To further gauge the effectiveness of our heuristic algorithms, we also implemented an exact algorithm that enumerates all distinct set partitions. For the benchmarks where exact results could be obtained, our CLUSTER and DECOMPOSE algorithms achieve results on average within 1% and 2% of the exact results, respectively, with execution times all under 1 second, whereas the exact algorithm took as much as 4.5 hours.
Citations: 37
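The abstract above mentions an exact algorithm that enumerates all distinct set partitions of the communication flows. The following is a minimal sketch of such an enumeration in general. It is a textbook recursive generator, not the authors' implementation, and the flow names are hypothetical.

```python
def set_partitions(items):
    """Yield every partition of `items` into non-empty, unordered groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing group in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or give it a group of its own.
        yield [[first]] + partition


# Hypothetical communication flows.
flows = ["f1", "f2", "f3", "f4"]
partitions = list(set_partitions(flows))
print(len(partitions))
```

The number of partitions is the Bell number of the set size: 15 for 4 flows and 115,975 for 10. This rapid growth is consistent with the gap the abstract reports between the exact algorithm and the heuristics.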
Automatic re-coding of reference code into structured and analyzable SoC models
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483991
Pramod Chandraiah, R. Dömer
The quality of the input system model has a direct bearing on the effectiveness of system exploration and synthesis tools. Given a well-structured system model, today's tools are effective at generating efficient implementations. However, readily available reference C code is not conducive to system synthesis because it lacks the structure and analyzability the design flow needs. Usually, reference C code is manually converted into an SoC model by applying the necessary transformations. The type of transformation depends on the underlying design flow and tools. Proper structural hierarchy is one essential feature needed for architectural exploration. In this paper, we provide automatic C code transformations that encapsulate functions and insert structural hierarchy to create well-structured and analyzable SoC models. Our automatic transformations, combined with interactive application of the designer's knowledge and experience, enable faster creation of structural hierarchy in C models and hence significantly reduce the overall design time.
Citations: 10
Heuristic power/ground network and floorplan co-design method
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4484025
Xiaoyi Wang, Jin Shi, Yici Cai, Xianlong Hong
It is becoming common to consider power supply integrity at an early stage to improve design quality. In this paper, we propose a novel algorithm that optimizes the floorplan together with the power/ground (P/G) network. Compared with previous methods, our algorithm searches the floorplan space more efficiently and therefore leads to better results. We also propose a heuristic method to build a P/G mesh grid with an optimized topology. Experimental results show that our method speeds up the floorplanning process by about 10 times and reduces the routing area of the P/G network while maintaining floorplan quality and P/G integrity.
Citations: 5
Block cache for embedded systems
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483967
Dominic Hillenbrand, J. Henkel
On-chip memories provide fast and energy-efficient storage for code and data compared with caches or external memories. We present techniques and algorithms that automate the use of on-chip memory for blocks of instructions, which are dynamically scheduled at runtime to increase performance and reduce power consumption.
Citations: 3
Automatic generation of hardware dependent software for MPSoCs from abstract system specifications
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483954
G. Schirner, A. Gerstlauer, R. Dömer
Increasing software content in embedded systems and SoCs drives the demand to synthesize software binaries automatically from abstract models. This is especially critical for hardware-dependent software (HdS) because of its tight coupling to the hardware. In this paper, we present our approach to automatically synthesizing HdS from an abstract system model. We synthesize driver code, interrupt handlers, and startup code. Furthermore, we automatically adjust the application to use RTOS services. We target traditional RTOS-based multi-tasking solutions as well as a pure interrupt-based implementation (without any RTOS). Our experimental results show the automatic generation of final binary images for six real-life target applications and demonstrate significant productivity gains due to automation. Our HdS synthesis is an enabler for efficient MPSoC development and rapid design space exploration.
Citations: 27
A low-cost cryptographic processor for security embedded system
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483921
Ronghua Lu, Jun Han, Xiaoyang Zeng, Qing Li, L. Mai, Jia Zhao
This paper presents a low-cost cryptographic processor for secure embedded systems. The processor, without the assistance of any dedicated cryptographic coprocessor, is scalable and very efficient for popular cryptographic algorithms such as RSA/ECC, AES, and hash functions. Based on SMIC 0.18 µm standard CMOS technology, the core circuit of the test chip has only about 32k gates and a maximum frequency of 200 MHz, at which the 1024-bit RSA algorithm takes only 150 ms and the throughput of AES reaches 256 Mbit/s.
Citations: 9
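The figures quoted in the abstract above can be turned into cycle counts with a little arithmetic. The sketch below assumes that both the AES throughput and the RSA time are measured at the 200 MHz maximum frequency, as the abstract's wording suggests, and uses the standard 128-bit AES block size. The variable names are illustrative.

```python
CLOCK_HZ = 200_000_000              # maximum frequency from the abstract
AES_THROUGHPUT_BPS = 256_000_000    # 256 Mbit/s from the abstract
AES_BLOCK_BITS = 128                # AES always operates on 128-bit blocks
RSA_TIME_MS = 150                   # 1024-bit RSA time from the abstract

# Clock cycles available per 128-bit AES block at the quoted throughput.
aes_cycles_per_block = CLOCK_HZ * AES_BLOCK_BITS // AES_THROUGHPUT_BPS

# Clock cycles spent on one 1024-bit RSA operation.
rsa_cycles = CLOCK_HZ * RSA_TIME_MS // 1000

print(aes_cycles_per_block, rsa_cycles)
```

Under these assumptions, the processor completes one AES block roughly every 100 cycles and one 1024-bit RSA operation in about 30 million cycles.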
Statistical power profile correlation for realistic thermal estimation
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4484038
L. Singhal, Sejong Oh, E. Bozorgzadeh
At the system level, the on-chip temperature depends both on power density and on thermal coupling with neighboring regions. The problem of finding the right set of input power profiles for accurate temperature estimation has not been studied. Considering only the average or only the peak power density may lead to underestimation or overestimation of the thermal crisis, respectively. To provide more realistic temperature estimation, we propose to incorporate multiple power profiles. Using the proposed statistical methods to determine the closeness between power profiles, we apply a clustering algorithm to identify a few input power profiles. We incorporate them in a thermal-aware floorplanner, and empirical results show that using a single input power profile (average or peak) leads to 37% degradation in critical wire delay and 20% degradation in wire length compared with using multiple input power profiles.
Citations: 7
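The abstract above describes measuring the closeness between power profiles and clustering them into a few representatives. The abstract does not say which statistic or clustering algorithm is used, so the following sketch substitutes two assumptions of its own: Pearson correlation as the closeness measure and a simple greedy threshold clustering. The profile data and the 0.9 threshold are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def cluster_profiles(profiles, threshold=0.9):
    """Greedy clustering (an assumed stand-in, not the paper's algorithm).

    Each profile joins the first cluster whose representative (its first
    member) correlates with it at or above `threshold`; otherwise it
    starts a new cluster. Returns lists of profile indices.
    """
    clusters = []
    for i, profile in enumerate(profiles):
        for cluster in clusters:
            if pearson(profiles[cluster[0]], profile) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical per-block power values (watts) for four workloads.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.1, 6.0, 8.2],
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.1, 4.0, 2.1],
]
clusters = cluster_profiles(profiles)
print(clusters)
```

In this toy data, the first two profiles heat the same blocks in the same proportions and fall into one cluster, while the last two form a second cluster. One representative per cluster could then stand in for the few input power profiles the abstract refers to.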
Total power optimization combining placement, sizing and multi-Vt through slack distribution management
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483973
T. Luo, D. Newmark, D. Pan
Power dissipation is quickly becoming one of the most important limiters in nanometer IC design because leakage increases exponentially as technology scales down. However, power and timing are often conflicting objectives during optimization. In this paper, we propose a novel total power optimization flow under a performance constraint. Instead of using placement, gate sizing, and multiple-Vt assignment techniques independently, we combine them through the concept of slack distribution management to maximize the potential for power reduction. We propose linear programming (LP) based placement and geometric programming (GP) based gate sizing formulations to improve the slack distribution, which helps to maximize the total power reduction during the Vt-assignment stage. Our formulations include important practical design constraints, such as slew, noise, and short-circuit power, which were often ignored previously. We tested our algorithm on a set of industrial-strength, manually optimized circuits from a multi-GHz 65 nm microprocessor and obtained very promising results. To the best of our knowledge, this is the first work that systematically combines placement, gate sizing, and Vt swapping for total power (and in particular leakage) management.
Citations: 21
Pessimism reduction in coupling-aware static timing analysis using timing and logic filtering
Pub Date : 2008-01-21 DOI: 10.1109/ASPDAC.2008.4483999
D. Das, Kip Killpack, Chandramouli V. Kashyap, A. Jas, H. Zhou
With continued scaling of technology into the nanometer regime, the impact of coupling-induced delay variations is significant. While several coupling-aware static timers have been proposed, their results are often pessimistic, with many false failures. We present an integrated approach based on iterative timing filtering and logic filtering to reduce pessimism. We use a realistic coupling model based on arrival times and slews, and show that non-iterative pessimism reduction algorithms proposed in previous research may give potentially non-conservative timing results. On a functional block from an industrial 65 nm microprocessor, our algorithm produced a maximum pessimism reduction of 11.18% of cycle time over converged timing filtering analysis that does not consider logic constraints.
Citations: 5