Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105408
Hai Wang, S. Tan, Guangdeng Liao, Rafael Quintanilla, Ashish Gupta
Temperature estimation and prediction are critical for online regulation of temperature and hot spots on today's high-performance processors. In this paper, we present a new method, called FRETEP, to accurately estimate and predict the full-chip temperature at runtime under more practical conditions, where the thermal model is inaccurate, power estimations are less accurate, and only a limited number of on-chip physical thermal sensors are available. FRETEP employs a number of new techniques to address this problem. First, we propose a new thermal-sensor-based error compensation method to correct the errors due to inaccuracies in the thermal model and power estimations. Second, we propose a new correlation-based method for estimating the error compensation with a limited number of thermal sensors. Third, we optimize the compact modeling technique and integrate it into the error compensation process so that thermal estimation with error compensation can be performed at runtime. Last but not least, to enable accurate temperature prediction for emerging predictive thermal management, we design a full-chip thermal prediction framework employing a time-series prediction method. Experimental results show that FRETEP accurately estimates and predicts the full-chip thermal behavior with very low overhead and compares very favorably with the Kalman-filter-based approach on standard SPEC benchmarks.
{"title":"Full-chip runtime error-tolerant thermal estimation and prediction for practical thermal management","authors":"Hai Wang, S. Tan, Guangdeng Liao, Rafael Quintanilla, Ashish Gupta","doi":"10.1109/ICCAD.2011.6105408","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105408","url":null,"abstract":"Temperature estimation and prediction are critical for online regulation of temperature and hot spots on today's high performance processors. In this paper, we present a new method, called FRETEP, to accurately estimate and predict the full-chip temperature at runtime under more practical conditions where we have inaccurate thermal model, less accurate power estimations and limited number of on-chip physical thermal sensors. FRETEP employs a number of new techniques to address this problem. First, we propose a new thermal sensor based error compensation method to correct the errors due to the inaccuracies in thermal model and power estimations. Second, we raise a new correlation based method for error compensation estimation with limited number of thermal sensors. Third, we optimize the compact modeling technique and integrate it into the error compensation process in order to perform the thermal estimation with error compensation at runtime. Last but not least, to enable accurate temperature prediction for the emerging predictive thermal management, we design a full-chip thermal prediction framework employing time series prediction method. Experimental results show FRETEP accurately estimates and predicts the full-chip thermal behavior with very low overhead introduced and compares very favorably with the Kalman filter based approach on standard SPEC benchmarks.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"30 1","pages":"716-723"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82642018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105308
Xu He, Tao Huang, Linfu Xiao, Haitong Tian, Guxin Cui, Evangeline F. Y. Young
In this paper, we describe a routability-driven placer called Ripple. Two major techniques, cell inflation and net-based movement, are used in global placement, followed by a rough legalization step to reduce congestion. Cell inflation is performed in the horizontal and vertical directions alternately. We propose a new method called net-based movement, in which a target position is calculated for each cell by considering the movement of a net as a whole instead of working on each cell individually. In detailed placement, we use a combination of two strategies: the traditional HPWL-driven approach and our new congestion-driven approach. Experimental results show that Ripple is very effective in improving routability. Compared with our previous placer, which won the ISPD 2011 Contest, Ripple further reduces the overflow by 38% while the runtime is reduced by 54%.
{"title":"Ripple: An effective routability-driven placer by iterative cell movement","authors":"Xu He, Tao Huang, Linfu Xiao, Haitong Tian, Guxin Cui, Evangeline F. Y. Young","doi":"10.1109/ICCAD.2011.6105308","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105308","url":null,"abstract":"In this paper, we describe a routability-driven placer called Ripple. Two major techniques called cell inflation and net-based movement are used in global placement followed by a rough legalization step to reduce congestion. Cell inflation is performed in the horizontal and the vertical directions alternatively. We propose a new method called net-based movement, in which a target position is calculated for each cell by considering the movement of a net as a whole instead of working on each cell individually. In detailed placement, we use a combination of two kinds of strategy: the traditional HPWL-driven approach and our new congestion-driven approach. Experimental results show that Ripple is very effective in improving routability. Comparing with our pervious placer, which is the winner in the ISPD 2011 Contest, Ripple can further improve the overflow by 38% while reduce the runtime is reduced by 54%.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"25 1","pages":"74-79"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89054526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105411
Jia Wang, Xiaodao Chen, Chen Liao, Shiyan Hu
With advancing technology, large dynamic power consumption has significantly limited circuit miniaturization. Minimizing peak power consumption, defined as the maximum power consumption among all voltage partitions, is important since it enables energy saving through the voltage-island shutdown mechanism. In this paper, we prove that the peak power driven voltage partitioning problem is NP-complete and propose an efficient, provably good fully polynomial time approximation scheme for it. The new algorithm approximates the optimal peak power driven voltage partitioning solution within a factor of (1 + ∊) in O(m^2(mn/∊^4)) time for sufficiently small positive ∊, where n is the number of circuit blocks and m is the number of partitions, which is a small constant in practice. Our experimental results demonstrate that dynamic programming cannot finish for even 20 blocks, while our new approximation algorithm runs fast. In particular, by varying ∊, orders-of-magnitude speedup can be obtained with only a 0.6% power increase. The tradeoff between peak power minimization and total power minimization is also investigated. We demonstrate that the total power minimization algorithm obtains good results in total power but with quite large peak power, while our peak power optimization algorithm achieves on average a 26.5% reduction in peak power with only a 0.46% increase in total power. Moreover, our peak power driven voltage partitioning algorithm is integrated into a simulated-annealing-based floorplanning technique. Experimental results demonstrate that, compared to total power driven floorplanning, peak power driven floorplanning can significantly reduce peak power with only little impact on total power, HPWL, estimated power/ground routing cost, level shifter cost, and runtime. Further, when voltage-island shutdown is performed, peak power driven voltage partitioning can lead to over 10% more energy saving than a greedy frequency-based voltage partitioning when multiple idle block sequences are considered.
{"title":"The approximation scheme for peak power driven voltage partitioning","authors":"Jia Wang, Xiaodao Chen, Chen Liao, Shiyan Hu","doi":"10.1109/ICCAD.2011.6105411","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105411","url":null,"abstract":"With advancing technology, large dynamic power consumption has significantly limited circuit miniaturization. Minimizing peak power consumption, which is defined as the maximum power consumption among all voltage partitions, is important since it enables energy saving from the voltage island shutdown mechanism. In this paper, we prove that the peak power driven voltage partitioning problem is NP-complete and propose an efficient provably good fully polynomial time approximation scheme for it. The new algorithm can approximate the optimal peak power driven voltage partitioning solution in O(m2 (mn/∊4)) time within a factor of (1 + ∊) for sufficiently small positive e, where n is the number of circuit blocks and m is the number of partitions which is a small constant in practice. Our experimental results demonstrate that the dynamic programming cannot finish for even 20 blocks while our new approximation algorithm runs fast. In particular, varying e, orders of magnitude speedup can be obtained with only 0.6% power increase. The tradeoff between the peak power minimization and the total power minimization is also investigated. We demonstrate that the total power minimization algorithm obtains good results in total power but with quite large peak power, while our peak power optimization algorithm can achieve on average 26.5% reduction in peak power with only 0.46% increase in total power. Moreover, our peak power driven voltage partitioning algorithm is integrated into a simulated annealing based floorplanning technique. Experimental results demonstrate that compared to total power driven floorplanning, the peak power driven floorplanning can significantly reduce peak power with only little impact in total power, HPWL, estimated power ground routing cost, level shifter cost and runtime. Further, when the voltage island shutdown is performed, peak power driven voltage partitioning can lead to over 10% more energy saving than a greedy frequency based voltage partitioning when multiple idle block sequences are considered.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"3 1","pages":"736-741"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83712963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105363
Changhao Yan, Sheng-Guo Wang, Xuan Zeng
A correlation-first bisection method is proposed for analyzing the robust stability distribution of linear analog circuits in the multi-parameter space. The new method first transforms the complex multi-parameter robust stability problem into nonlinear inequalities via the Routh criterion, and then solves them using interval arithmetic and a new bisection strategy: the axis most strongly correlated with the functions dominating stability is bisected. Furthermore, the Monte Carlo method is adopted for the uncertain subdomains to increase the convergence speed of the bisection as the number of cubes increases. The proposed method introduces no error in either the stable or unstable areas and efficiently determines the complex boundaries between them. Numerical results validate the new method.
{"title":"A new method for multiparameter robust stability distribution analysis of linear analog circuits","authors":"Changhao Yan, Sheng-Guo Wang, Xuan Zeng","doi":"10.1109/ICCAD.2011.6105363","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105363","url":null,"abstract":"A correlation-first bisection method is proposed for analyzing the robust stability distribution of linear analog circuits in the multi-parameter space. This new method first transfers the complex multi-parameter robust stability problem into nonlinear inequalities by the Routh criterion, and then solves them by interval arithmetic and new bisection strategy. The axis with strong relationship to the functions dominating the stability is bisected. Furthermore, the Monte Carlo method is adopted for the uncertain subdomains to increase the convergence speed of bisection methods as the cube number increases. The proposed method has no error in both stable and unstable areas, and high efficiency to determine the complex boundaries between the stable and unstable areas. Numerical results validate this new method.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"47 1","pages":"420-427"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90903604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105413
V. Tenentes, X. Kavousianos
Symbol-based and linear-based test-data compression techniques have complementary properties that are very attractive for testing multi-core SoCs. However, only linear-based techniques have been adopted by industry, as symbol-based techniques have not yet revealed their real potential for testing large circuits. We present a novel compression method and a low-cost decompression architecture that combine the advantages of both symbol-based and linear-based techniques in a unified solution for multi-core SoCs. The proposed method offers higher compression than any other method presented so far, together with very low shift switching activity and a very short test sequence length. Moreover, contrary to existing techniques, it offers a complete solution for testing multi-core SoCs, as it is suitable for cores of both known and unknown structure (IP cores), which usually co-exist in modern SoCs. Finally, it supports a very low pin-count interface, as it needs only one tester channel to quickly download the compressed test data on-chip.
{"title":"Test-data volume and scan-power reduction with low ATE interface for multi-core SoCs","authors":"V. Tenentes, X. Kavousianos","doi":"10.1109/ICCAD.2011.6105413","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105413","url":null,"abstract":"Symbol-based and linear-based test-data compression techniques have complementary properties which are very attractive for testing multi-core SoCs. However, only linear-based techniques have been adopted by industry as the symbol-based techniques have not yet revealed their real potential for testing large circuits. We present a novel compression method and a low-cost decompression architecture that combine the advantages of both symbol-based and linear-based techniques under a unified solution for multi-core SoCs. The proposed method offers higher compression than any other method presented so far, very low shift switching activity and very short test sequence length at the same time. Moreover, contrary to existing techniques, it offers a complete solution for testing multi-core SoCs as it is suitable for cores of both known and unknown structure (IP cores) that usually co-exist in modern SoCs. Finally, it supports very low pin-count interface as it needs only one tester channel to download fast the compressed test data on-chip.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"1 1","pages":"747-754"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89884398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105362
Sangwoo Han, Joohee Choung, Byung-Su Kim, B. Lee, Hungbok Choi, Juho Kim
As CMOS devices become smaller, process and aging variations become a major issue for circuit reliability and yield. In this paper, we analyze the effects of process variations on aging effects such as hot carrier injection (HCI) and negative bias temperature instability (NBTI). Using Monte Carlo based transistor-level simulations together with principal component analysis (PCA), the correlations between process variations and aging variations are taken into account. The accuracy of the analysis is improved by 2–7% compared to methods that ignore these correlations, especially in smaller technologies.
{"title":"Statistical aging analysis with process variation consideration","authors":"Sangwoo Han, Joohee Choung, Byung-Su Kim, B. Lee, Hungbok Choi, Juho Kim","doi":"10.1109/ICCAD.2011.6105362","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105362","url":null,"abstract":"As CMOS devices become smaller, process and aging variations become a major issue for circuit reliability and yield. In this paper, we analyze the effects of process variations on aging effects such as hot carrier injection (HCI) and negative bias temperature instability (NBTI). Using Monte-Carlo based transistor-level simulations including principal component analysis (PCA), the correlations between process variations and aging variations are considered. The accuracy of analysis is improved (2–7%) compared to other methods in which the correlations are ignored, especially in smaller technologies.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"56 1","pages":"412-419"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90510205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105318
W. Ding, Yuanrui Zhang, Jun Liu, M. Kandemir
Data transformation is one of the key optimizations for maximizing cache locality. Traditional data transformation strategies employ linear data layouts, e.g., row-major or column-major, for multidimensional arrays. Although a linear layout matches the linear memory space well in most cases, it can only optimize self-spatial locality for individual references. In this work, we propose a novel data layout transformation framework that determines a tiled layout for each array in an application program. A tiled layout can exploit the group-spatial locality among different references and improve cache line utilization. In our strategy, the data elements accessed by different references in one loop iteration are placed into a tile and fetched into the same cache line at runtime. This helps minimize conflict misses in caches. We evaluated our data layout transformation framework using 30 benchmarks on a commercial multicore machine. The experimental results show that our approach outperforms state-of-the-art data transformation strategies and works well with large core counts.
{"title":"Optimizing data locality using array tiling","authors":"W. Ding, Yuanrui Zhang, Jun Liu, M. Kandemir","doi":"10.1109/ICCAD.2011.6105318","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105318","url":null,"abstract":"Data transformation is one of the key optimizations in maximizing cache locality. Traditional data transformation strategies employ linear data layouts, e.g., row-major or column-major, for multidimensional arrays. Although a linear layout matches the linear memory space well in most cases, it can only optimize for self-spatial locality for individual references. In this work, we propose a novel data layout transformation framework that is able to determine a tiled layout for each array in an application program. Tiled layout can exploit the group-spatial locality among different references and improve cache line utilization. In our strategy, the data elements accessed by different references in one loop iteration are placed into a tile and fetched into the same cache line at runtime. This helps minimizing conflict misses in caches. We evaluated our data layout transformation framework using 30 benchmarks on a commercial multicore machine. The experimental results show that our approach outperforms state-of-the-art data transformation strategies and works well with large core counts.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"3 1","pages":"142-149"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84204552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105312
V. Zolotov, Jinjun Xiong
A chip disposition criterion is used to decide whether to accept or discard a chip during testing. Its quality directly impacts both yield and product quality loss (PQL), and its importance grows with increasingly large process variations. For the first time, this paper rigorously formulates the optimal chip disposition problem and proposes an elegant solution. We show that the optimal chip disposition criterion differs from existing industry practice. Our solution finds the optimal disposition criterion efficiently, achieving better yield under the same PQL constraint, or lower PQL under the same yield constraint.
{"title":"Optimal statistical chip disposition","authors":"V. Zolotov, Jinjun Xiong","doi":"10.1109/ICCAD.2011.6105312","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105312","url":null,"abstract":"A chip disposition criterion is used to decide whether to accept or discard a chip during chip testing. Its quality directly impacts both yield and product quality loss (PQL). The importance becomes even more significant with the increasingly large process variation. For the first time, this paper rigorously formulates the optimal chip disposition problem, and proposes an elegant solution. We show that the optimal chip disposition criterion is different from the existing industry practice. Our solution can find the optimal disposition criterion efficiently with better yield under the same PQL constraint, or lower PQL under the same yield constraint.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"12 1","pages":"95-102"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84206273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.1109/ICCAD.2011.6105395
Younghyun Kim, Sangyoung Park, Yanzhi Wang, Q. Xie, N. Chang, M. Poncino, Massoud Pedram
Compared with conventional homogeneous electrical energy storage (EES) systems, hybrid electrical energy storage (HEES) systems provide high output power and energy density as well as high power conversion efficiency and low self-discharge at a low capital cost. The cycle efficiency of a HEES system (defined as the ratio of the energy delivered by the HEES system to the load device to the energy supplied by the power source to the HEES system) is one of the most important factors in determining the overall operational cost of the system. Therefore, the EES banks within a HEES system should be prudently designed to maximize the overall cycle efficiency. However, the cycle efficiency depends not only on the EES element type, but also on dynamic conditions such as the charge and discharge rates and the energy efficiency of the peripheral power circuitry. Also, due to practical limitations of the power conversion circuitry, the specified capacity of an EES bank cannot be fully utilized, which in turn results in over-provisioning and thus additional capital expenditure for a HEES system with a specified level of service. This is the first paper to present an EES bank reconfiguration architecture aimed at enhancing cycle efficiency and capacity utilization. We first give a formal definition of balanced configurations and provide a general reconfigurable architecture for a HEES system, analyze key properties of the balanced reconfiguration, and propose a dynamic reconfiguration algorithm for optimal, online adaptation of the HEES system configuration to the characteristics of the power sources and load devices as well as the internal states of the EES banks. Experimental results demonstrate an overall cycle efficiency improvement of up to 108% for a DC power demand profile and a pulse duty cycle improvement of up to 127% for a high-current pulsed power profile. We also present analysis results for the capacity utilization improvement of a reconfigurable EES bank.
{"title":"Balanced reconfiguration of storage banks in a hybrid electrical energy storage system","authors":"Younghyun Kim, Sangyoung Park, Yanzhi Wang, Q. Xie, N. Chang, M. Poncino, Massoud Pedram","doi":"10.1109/ICCAD.2011.6105395","DOIUrl":"https://doi.org/10.1109/ICCAD.2011.6105395","url":null,"abstract":"Compared with the conventional homogeneous electrical energy storage (EES) systems, hybrid electrical energy storage (HEES) systems provide high output power and energy density as well as high power conversion efficiency and low self-discharge at a low capital cost. Cycle efficiency of a HEES system (which is defined as the ratio of energy which is delivered by the HEES system to the load device to energy which is supplied by the power source to the HEES system) is one of the most important factors in determining the overall operational cost of the system. Therefore, EES banks within the HEES system should be prudently designed in order to maximize the overall cycle efficiency. However, the cycle efficiency is not only dependent on the EES element type, but also the dynamic conditions such as charge and discharge rates and energy efficiency of peripheral power circuitries. Also, due to the practical limitations of the power conversion circuitry, the specified capacity of the EES bank cannot be fully utilized, which in turn results in over-provisioning and thus additional capital expenditure for a HEES system with a specified level of service. This is the first paper that presents an EES bank reconfiguration architecture aiming at cycle efficiency and capacity utilization enhancement. We first provide a formal definition of balanced configurations and provide a general reconfigurable architecture for a HEES system, analyze key properties of the balanced reconfiguration, and propose a dynamic reconfiguration algorithm for optimal, online adaptation of the HEES system configuration to the characteristics of the power sources and the load devices as well as internal states of the EES banks. Experimental results demonstrate an overall cycle efficiency improvement of by up to 108% for a DC power demand profile, and pulse duty cycle improvement of by up to 127% for high-current pulsed power profile. We also present analysis results for capacity utilization improvement for a reconfigurable EES bank.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"30 1","pages":"624-631"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84436015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-07 | DOI: 10.5555/2132325.2132387
Wen-Hao Liu, Yih-Lang Li, Kai-Yuan Chao
Multiple dynamic supply voltage (MDSV) design provides an effective way to reduce dynamic power and is widely used in high-end and low-power designs. The challenge of routing MDSV designs is that nets need to be planned carefully to avoid electrical problems or functional failures when a long interconnect path passes through shut-down power domains. As the first work to address the MDSV global routing problem, this paper defines the power domain-aware routing (PDR) problem and presents a point-to-point PDR algorithm with a look-ahead path selection method and a look-up table acceleration approach. For multi-pin net routing, a novel constant-time table-lookup mechanism, which invokes four enhanced monotonic routings to quickly compute the least-cost monotonic path from every node to the target sub-tree, is presented to speed up queries of the routing cost (including driven-length slack) to the target during multi-source multi-target PDR. Experimental results confirm that the proposed MDSV-based global router efficiently identifies legally optimized routing results for MDSV designs and effectively reduces overflow, wire length, inserted level shifters, and runtime.
{"title":"High-quality global routing for multiple dynamic supply voltage designs","authors":"Wen-Hao Liu, Yih-Lang Li, Kai-Yuan Chao","doi":"10.5555/2132325.2132387","DOIUrl":"https://doi.org/10.5555/2132325.2132387","url":null,"abstract":"Multiple dynamic supply voltage (MDSV) provides an effective way to reduce dynamic power and is widely used in high-end or low-power designs. The challenge of routing MDSV designs is that the net in MDSV designs needs to be planned carefully to avoid electrical problems or functional failure as a long interconnect path pass through the shutdown power domains. As the first work to address the MDSV global routing problem, power domain-aware routing (PDR) problem is defined and the point-to-point PDR algorithm is also presented herein with look-ahead path selection method and look-up table acceleration approach. For multi-pin net routings, a novel constant-time table-lookup mechanism by invoking four enhanced monotonic routings to fast compute the least-cost monotonic path from every node to the target sub-tree is presented to speed up the query about routing cost (including driven-length slack) to target during multi-source multi-target PDR. Experimental results confirm that the proposed MDSV-based global router can efficiently identify legally optimized routing results for MDSV designs, and can effectively reduce overflow, wire length, inserted level shifters and runtime.","PeriodicalId":6357,"journal":{"name":"2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"75 1","pages":"263-269"},"PeriodicalIF":0.0,"publicationDate":"2011-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84792938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}