2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD): Latest Papers

Full-chip runtime error-tolerant thermal estimation and prediction for practical thermal management
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105408
Hai Wang, S. Tan, Guangdeng Liao, Rafael Quintanilla, Ashish Gupta
Temperature estimation and prediction are critical for online regulation of temperature and hot spots on today's high-performance processors. In this paper, we present a new method, called FRETEP, to accurately estimate and predict the full-chip temperature at runtime under more practical conditions, where the thermal model is inaccurate, the power estimates are less accurate, and the number of on-chip physical thermal sensors is limited. FRETEP employs a number of new techniques to address this problem. First, we propose a new thermal-sensor-based error compensation method to correct the errors due to inaccuracies in the thermal model and power estimates. Second, we propose a new correlation-based method for estimating the error compensation with a limited number of thermal sensors. Third, we optimize the compact modeling technique and integrate it into the error compensation process so that thermal estimation with error compensation can be performed at runtime. Finally, to enable accurate temperature prediction for emerging predictive thermal management, we design a full-chip thermal prediction framework that employs a time series prediction method. Experimental results show that FRETEP accurately estimates and predicts the full-chip thermal behavior with very low overhead and compares very favorably with the Kalman-filter-based approach on standard SPEC benchmarks.
Citations: 18
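Illustrative note on the FRETEP entry above: the abstract mentions a time series prediction method but does not say which one. As a minimal sketch only, assuming a first-order autoregressive model x[t+1] = a * x[t] + b fitted by least squares, a per-location temperature forecast could look like the following. The model choice, the function names `fit_ar1` and `predict_next`, and the sample readings are all assumptions made for illustration and are not taken from the paper.

```python
def fit_ar1(samples):
    """Least-squares fit of x[t+1] = a * x[t] + b to a temperature trace."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    xs = samples[:-1]          # values at time t
    ys = samples[1:]           # values at time t + 1
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # Degenerate trace (no variation in x): fall back to a constant forecast.
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def predict_next(samples, steps=1):
    """Roll the fitted AR(1) model forward `steps` samples past the trace."""
    a, b = fit_ar1(samples)
    x = samples[-1]
    forecast = []
    for _ in range(steps):
        x = a * x + b
        forecast.append(x)
    return forecast


# Hypothetical sensor readings in degrees Celsius, cooling toward 60.
trace = [80.0, 70.0, 65.0, 62.5]
forecast = predict_next(trace, steps=2)
```

A predictive thermal manager would compare such a forecast against a threshold before the hot spot forms; a real deployment would likely need a higher-order model and the error compensation the abstract describes.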
Ripple: An effective routability-driven placer by iterative cell movement
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105308
Xu He, Tao Huang, Linfu Xiao, Haitong Tian, Guxin Cui, Evangeline F. Y. Young
In this paper, we describe a routability-driven placer called Ripple. Two major techniques, cell inflation and net-based movement, are used in global placement, followed by a rough legalization step, to reduce congestion. Cell inflation is performed in the horizontal and vertical directions alternately. We propose a new method called net-based movement, in which a target position is calculated for each cell by considering the movement of a net as a whole instead of working on each cell individually. In detailed placement, we combine two strategies: the traditional HPWL-driven approach and our new congestion-driven approach. Experimental results show that Ripple is very effective in improving routability. Compared with our previous placer, the winner of the ISPD 2011 Contest, Ripple further improves the overflow by 38% while reducing the runtime by 54%.
Citations: 70
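Illustrative note on the Ripple entry above: the abstract refers to the HPWL-driven approach without defining HPWL. Half-perimeter wirelength is the standard placement estimate of a net's length: the width plus the height of the bounding box of its pins. The sketch below computes it under the simplifying assumption that each cell has a single pin at its own position; the names `hpwl`, `total_hpwl`, and the sample netlist are hypothetical.

```python
def hpwl(pins):
    """Half-perimeter wirelength of one net, given its (x, y) pin positions."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets, positions):
    """Sum of HPWL over all nets; each net is a list of cell names."""
    return sum(hpwl([positions[cell] for cell in net]) for net in nets)


# Hypothetical placement: three cells and two nets.
positions = {"a": (0, 0), "b": (3, 4), "c": (1, 2)}
nets = [["a", "b", "c"], ["a", "c"]]
wirelength = total_hpwl(nets, positions)
```

An HPWL-driven detailed placer accepts a cell move or swap when it lowers this total; a congestion-driven one would score the same move by routing overflow instead.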
The approximation scheme for peak power driven voltage partitioning
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105411
Jia Wang, Xiaodao Chen, Chen Liao, Shiyan Hu
With advancing technology, large dynamic power consumption has significantly limited circuit miniaturization. Minimizing peak power consumption, defined as the maximum power consumption among all voltage partitions, is important because it enables energy saving through the voltage island shutdown mechanism. In this paper, we prove that the peak power driven voltage partitioning problem is NP-complete and propose an efficient, provably good fully polynomial time approximation scheme for it. The new algorithm can approximate the optimal peak power driven voltage partitioning solution in O(m^2 (mn/∊^4)) time within a factor of (1 + ∊) for sufficiently small positive ∊, where n is the number of circuit blocks and m is the number of partitions, which is a small constant in practice. Our experimental results demonstrate that dynamic programming cannot finish for even 20 blocks, while our new approximation algorithm runs fast. In particular, by varying ∊, orders of magnitude speedup can be obtained with only a 0.6% power increase. The tradeoff between peak power minimization and total power minimization is also investigated. We demonstrate that the total power minimization algorithm obtains good results in total power but with quite large peak power, while our peak power optimization algorithm achieves on average a 26.5% reduction in peak power with only a 0.46% increase in total power. Moreover, our peak power driven voltage partitioning algorithm is integrated into a simulated annealing based floorplanning technique. Experimental results demonstrate that, compared to total power driven floorplanning, peak power driven floorplanning can significantly reduce peak power with only a small impact on total power, HPWL, estimated power ground routing cost, level shifter cost and runtime. Further, when voltage island shutdown is performed, peak power driven voltage partitioning can lead to over 10% more energy saving than a greedy frequency based voltage partitioning when multiple idle block sequences are considered.
Citations: 2
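Illustrative note on the voltage partitioning entry above: the abstract defines peak power as the maximum power consumption among all voltage partitions. The sketch below evaluates that objective and finds the optimum by exhaustive search, which tries m^n assignments and so only works for tiny inputs. The power model is an assumption made here for illustration, not the paper's: each block is a pair (weight, minimum voltage), a partition runs at the highest minimum voltage of its members, and a block consumes weight times voltage squared. This is not the approximation scheme from the paper.

```python
from itertools import product


def peak_power(blocks, assignment, m):
    """Largest per-partition power under the assumed weight * V^2 model."""
    totals = []
    for p in range(m):
        members = [blocks[i] for i, part in enumerate(assignment) if part == p]
        if not members:
            totals.append(0.0)
            continue
        # Every block in a partition shares the partition's supply voltage.
        voltage = max(vmin for _, vmin in members)
        totals.append(sum(weight * voltage * voltage for weight, _ in members))
    return max(totals)


def best_partition(blocks, m):
    """Exhaustive search over all m**n assignments (exponential in n)."""
    best = None
    for assignment in product(range(m), repeat=len(blocks)):
        cost = peak_power(blocks, assignment, m)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Hypothetical blocks as (weight, minimum voltage) pairs.
blocks = [(1.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
best_cost, best_assignment = best_partition(blocks, m=2)
```

In this toy instance the best choice isolates the high-voltage block, so the two low-voltage blocks are not forced up to its supply level.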
A new method for multiparameter robust stability distribution analysis of linear analog circuits
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105363
Changhao Yan, Sheng-Guo Wang, Xuan Zeng
A correlation-first bisection method is proposed for analyzing the robust stability distribution of linear analog circuits in the multi-parameter space. This new method first transforms the complex multi-parameter robust stability problem into nonlinear inequalities by the Routh criterion, and then solves them by interval arithmetic and a new bisection strategy. The axis with a strong relationship to the functions dominating the stability is bisected. Furthermore, the Monte Carlo method is adopted for the uncertain subdomains to increase the convergence speed of bisection methods as the cube number increases. The proposed method introduces no error in either the stable or the unstable areas, and is highly efficient at determining the complex boundaries between them. Numerical results validate this new method.
Citations: 4
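Illustrative note on the stability analysis entry above: the pipeline in the abstract (Routh criterion, then interval arithmetic, then bisection) can be sketched for the smallest interesting case. For the cubic s^3 + a2 s^2 + a1 s + a0 with positive coefficients, the Routh criterion reduces to a2 * a1 > a0. The sketch assumes the three coefficients themselves are the uncertain parameters and bisects the widest axis, a simple stand-in for the paper's correlation-first axis choice. The Monte Carlo step is omitted, and all function names are hypothetical.

```python
def classify_box(box):
    """Classify a box ((a2_lo, a2_hi), (a1_lo, a1_hi), (a0_lo, a0_hi))."""
    (a2_lo, a2_hi), (a1_lo, a1_hi), (a0_lo, a0_hi) = box
    if min(a2_lo, a1_lo, a0_lo) <= 0:
        raise ValueError("this sketch assumes strictly positive coefficients")
    # Interval bounds of g = a2 * a1 - a0 over the box (valid for positive bounds).
    lower = a2_lo * a1_lo - a0_hi
    upper = a2_hi * a1_hi - a0_lo
    if lower > 0:
        return "stable"        # g > 0 everywhere in the box
    if upper <= 0:
        return "unstable"      # g <= 0 everywhere: not asymptotically stable
    return "uncertain"


def bisect_widest(box):
    """Split the box in half along its widest axis."""
    widths = [hi - lo for lo, hi in box]
    axis = widths.index(max(widths))
    lo, hi = box[axis]
    mid = (lo + hi) / 2
    left = tuple((lo, mid) if k == axis else iv for k, iv in enumerate(box))
    right = tuple((mid, hi) if k == axis else iv for k, iv in enumerate(box))
    return left, right


def stability_split(box, depth):
    """Return (stable fraction, undecided fraction) of the box volume."""
    label = classify_box(box)
    if label == "stable":
        return 1.0, 0.0
    if label == "unstable":
        return 0.0, 0.0
    if depth == 0:
        return 0.0, 1.0
    parts = [stability_split(half, depth - 1) for half in bisect_widest(box)]
    return sum(s for s, _ in parts) / 2, sum(u for _, u in parts) / 2


box = ((1.0, 2.0), (1.0, 2.0), (1.0, 4.0))
stable_share, undecided_share = stability_split(box, depth=6)
```

Boxes labelled stable or unstable are certified by the interval bounds, which matches the abstract's claim of no error in those areas; only the undecided share near the boundary shrinks with more bisection.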
Test-data volume and scan-power reduction with low ATE interface for multi-core SoCs
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105413
V. Tenentes, X. Kavousianos
Symbol-based and linear-based test-data compression techniques have complementary properties which are very attractive for testing multi-core SoCs. However, only linear-based techniques have been adopted by industry, as the symbol-based techniques have not yet revealed their real potential for testing large circuits. We present a novel compression method and a low-cost decompression architecture that combine the advantages of both symbol-based and linear-based techniques in a unified solution for multi-core SoCs. The proposed method offers higher compression than any other method presented so far, together with very low shift switching activity and very short test sequence length. Moreover, contrary to existing techniques, it offers a complete solution for testing multi-core SoCs, as it is suitable for cores of both known and unknown structure (IP cores), which usually co-exist in modern SoCs. Finally, it supports a very low pin-count interface, as it needs only one tester channel to quickly download the compressed test data onto the chip.
Citations: 8
Statistical aging analysis with process variation consideration
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105362
Sangwoo Han, Joohee Choung, Byung-Su Kim, B. Lee, Hungbok Choi, Juho Kim
As CMOS devices become smaller, process and aging variations become a major issue for circuit reliability and yield. In this paper, we analyze the effects of process variations on aging effects such as hot carrier injection (HCI) and negative bias temperature instability (NBTI). Using Monte Carlo based transistor-level simulations including principal component analysis (PCA), we account for the correlations between process variations and aging variations. The accuracy of the analysis improves by 2–7% compared to other methods in which the correlations are ignored, especially in smaller technologies.
Citations: 16
Optimizing data locality using array tiling
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105318
W. Ding, Yuanrui Zhang, Jun Liu, M. Kandemir
Data transformation is one of the key optimizations in maximizing cache locality. Traditional data transformation strategies employ linear data layouts, e.g., row-major or column-major, for multidimensional arrays. Although a linear layout matches the linear memory space well in most cases, it can only optimize for self-spatial locality for individual references. In this work, we propose a novel data layout transformation framework that is able to determine a tiled layout for each array in an application program. A tiled layout can exploit the group-spatial locality among different references and improve cache line utilization. In our strategy, the data elements accessed by different references in one loop iteration are placed into a tile and fetched into the same cache line at runtime. This helps minimize conflict misses in caches. We evaluated our data layout transformation framework using 30 benchmarks on a commercial multicore machine. The experimental results show that our approach outperforms state-of-the-art data transformation strategies and works well with large core counts.
Citations: 1
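Illustrative note on the array tiling entry above: the difference between a linear layout and a tiled layout comes down to how an index (i, j) maps to a memory offset. The sketch below contrasts row-major addressing with a plain square-tile layout, assuming the tile size divides both array dimensions. The paper derives its tiles from the references in each loop iteration, which this sketch does not attempt; the function names are hypothetical.

```python
def row_major_offset(i, j, n_cols):
    """Offset of element (i, j) in a conventional row-major layout."""
    return i * n_cols + j


def tiled_offset(i, j, n_cols, tile):
    """Offset of element (i, j) when the array is stored tile by tile."""
    if n_cols % tile != 0:
        raise ValueError("sketch assumes the tile size divides the array width")
    tiles_per_row = n_cols // tile
    # Which tile the element falls in, counting tiles in row-major order.
    tile_index = (i // tile) * tiles_per_row + (j // tile)
    # Position inside that tile, again row-major.
    within_tile = (i % tile) * tile + (j % tile)
    return tile_index * tile * tile + within_tile


# A 2x2 neighbourhood of a 4x4 array, as touched by a stencil-like loop body.
block = [(0, 0), (0, 1), (1, 0), (1, 1)]
linear = [row_major_offset(i, j, 4) for i, j in block]
tiled = [tiled_offset(i, j, 4, 2) for i, j in block]
```

Here the four elements sit at offsets 0, 1, 4, 5 in row-major order but at 0, 1, 2, 3 in the tiled layout, so they are contiguous and more likely to share a cache line.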
Optimal statistical chip disposition
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105312
V. Zolotov, Jinjun Xiong
A chip disposition criterion is used to decide whether to accept or discard a chip during chip testing. Its quality directly impacts both yield and product quality loss (PQL). Its importance grows as process variation becomes increasingly large. For the first time, this paper rigorously formulates the optimal chip disposition problem and proposes an elegant solution. We show that the optimal chip disposition criterion is different from the existing industry practice. Our solution can find the optimal disposition criterion efficiently, with better yield under the same PQL constraint or lower PQL under the same yield constraint.
Citations: 1
Balanced reconfiguration of storage banks in a hybrid electrical energy storage system
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105395
Younghyun Kim, Sangyoung Park, Yanzhi Wang, Q. Xie, N. Chang, M. Poncino, Massoud Pedram
Compared with conventional homogeneous electrical energy storage (EES) systems, hybrid electrical energy storage (HEES) systems provide high output power and energy density as well as high power conversion efficiency and low self-discharge at a low capital cost. The cycle efficiency of a HEES system (defined as the ratio of the energy delivered by the HEES system to the load device to the energy supplied by the power source to the HEES system) is one of the most important factors in determining the overall operational cost of the system. Therefore, EES banks within the HEES system should be prudently designed in order to maximize the overall cycle efficiency. However, the cycle efficiency depends not only on the EES element type, but also on dynamic conditions such as charge and discharge rates and the energy efficiency of peripheral power circuitry. Also, due to the practical limitations of the power conversion circuitry, the specified capacity of the EES bank cannot be fully utilized, which in turn results in over-provisioning and thus additional capital expenditure for a HEES system with a specified level of service. This is the first paper that presents an EES bank reconfiguration architecture aiming at cycle efficiency and capacity utilization enhancement. We first provide a formal definition of balanced configurations and a general reconfigurable architecture for a HEES system, analyze key properties of the balanced reconfiguration, and propose a dynamic reconfiguration algorithm for optimal, online adaptation of the HEES system configuration to the characteristics of the power sources and the load devices as well as the internal states of the EES banks. Experimental results demonstrate an overall cycle efficiency improvement of up to 108% for a DC power demand profile, and a pulse duty cycle improvement of up to 127% for a high-current pulsed power profile. We also present analysis results for capacity utilization improvement for a reconfigurable EES bank.
Citations: 61
High-quality global routing for multiple dynamic supply voltage designs
Pub Date : 2011-11-07 DOI: 10.5555/2132325.2132387
Wen-Hao Liu, Yih-Lang Li, Kai-Yuan Chao
Multiple dynamic supply voltage (MDSV) provides an effective way to reduce dynamic power and is widely used in high-end or low-power designs. The challenge in routing MDSV designs is that each net must be planned carefully to avoid electrical problems or functional failure when a long interconnect path passes through shut-down power domains. As the first work to address the MDSV global routing problem, this paper defines the power domain-aware routing (PDR) problem and presents a point-to-point PDR algorithm with a look-ahead path selection method and a look-up table acceleration approach. For multi-pin net routing, a novel constant-time table-lookup mechanism is presented: it invokes four enhanced monotonic routings to quickly compute the least-cost monotonic path from every node to the target sub-tree, which speeds up queries for the routing cost (including driven-length slack) to the target during multi-source multi-target PDR. Experimental results confirm that the proposed MDSV-based global router can efficiently identify legally optimized routing results for MDSV designs and can effectively reduce overflow, wire length, the number of inserted level shifters, and runtime.
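To make the monotonic-routing idea in the abstract concrete, here is a minimal sketch of a plain least-cost monotonic path table on a grid. It is not the paper's enhanced monotonic routing: the grid, the per-cell costs, and the function name `monotonic_cost_to_target` are all hypothetical, and it covers only one direction (right/down moves toward a bottom-right target), whereas the abstract mentions four such routings. Once the table is built, the cost from any cell to the target is a constant-time lookup.

```python
from typing import List


def monotonic_cost_to_target(cost: List[List[float]]) -> List[List[float]]:
    """Build a lookup table of least-cost monotonic path costs.

    cost[i][j] is an assumed (hypothetical) cost of using grid cell (i, j).
    table[i][j] is the least total cost of a path from (i, j) to the
    bottom-right cell that only moves right or down, counting every
    cell on the path including both endpoints.
    """
    rows, cols = len(cost), len(cost[0])
    inf = float("inf")
    table = [[inf] * cols for _ in range(rows)]
    # Fill from the target backwards so each cell sees its two successors.
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if i == rows - 1 and j == cols - 1:
                table[i][j] = cost[i][j]  # the target itself
                continue
            down = table[i + 1][j] if i + 1 < rows else inf
            right = table[i][j + 1] if j + 1 < cols else inf
            table[i][j] = cost[i][j] + min(down, right)
    return table


# Illustrative 3x3 grid of made-up cell costs.
grid = [
    [1, 3, 1],
    [1, 5, 1],
    [4, 2, 1],
]
table = monotonic_cost_to_target(grid)
print(table[0][0])  # cost from the top-left cell to the target
```

The table is filled once in time proportional to the grid size, after which every query is a single array access. That is the general trade-off a table-lookup scheme relies on, under the simplifying assumptions stated above.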
Citations: 10